A tree is a set of connected labeled nodes, each reachable by a unique path from a distinguished root node. Here's an example of a tree (note that trees are standardly drawn upside-down, with the root at the top):
[Figure: syntax tree for the sentence "Alice chased the rabbit"]
We use a 'family' metaphor to talk about the relationships of nodes in a tree: for example, S is the parent of VP; conversely, VP is a child of S. Also, since NP and VP are both children of S, they are siblings. For convenience, there is also a text format for specifying trees, in which each node is written as a bracketed label followed by its children, as in (S (NP Alice) (VP chased (NP the rabbit))).
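To make the bracketed text format and the family metaphor concrete, here is a minimal plain-Python sketch that reads a bracketed string into nested (label, children) pairs and lists the children of a node. The names parse_bracketed and child_labels are hypothetical helpers written for this illustration; they are not part of NLTK, and the sketch assumes that labels and leaves contain no spaces or brackets.

```python
def parse_bracketed(text):
    """Parse a bracketed tree string into nested (label, children) tuples.

    Hypothetical helper for illustration only; assumes well-formed input
    whose labels and leaves contain no whitespace or parentheses.
    """
    tokens = text.replace('(', ' ( ').replace(')', ' ) ').split()
    pos = 0

    def parse_node():
        nonlocal pos
        if tokens[pos] != '(':
            raise ValueError("expected '(' at token %d" % pos)
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ')':
            if tokens[pos] == '(':
                children.append(parse_node())   # a subtree
            else:
                children.append(tokens[pos])    # a leaf
                pos += 1
        pos += 1                                # consume the closing ')'
        return (label, children)

    tree = parse_node()
    if pos != len(tokens):
        raise ValueError("unexpected material after the tree")
    return tree


def child_labels(node):
    """Return the labels of a node's children (leaves stand for themselves)."""
    _, children = node
    return [c[0] if isinstance(c, tuple) else c for c in children]


tree = parse_bracketed("(S (NP Alice) (VP chased (NP the rabbit)))")
print(tree[0])             # label of the root: S
print(child_labels(tree))  # children of S, which are siblings: ['NP', 'VP']
```

Here S is the parent of both NP and VP because they appear together in its list of children, which is also what makes them siblings.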
Although we will focus on syntactic trees, trees can be used to encode any homogeneous hierarchical structure that spans a sequence of linguistic forms (e.g., morphological structure, discourse structure). In the general case, leaves and node values do not have to be strings.
In NLTK, we create a tree by giving a node label and a list of children:
>>> tree1 = nltk.Tree('NP', ['Alice'])
>>> tree2 = nltk.Tree('NP', ['the', 'rabbit'])
>>> print(tree2)
(NP the rabbit)
We can incorporate these into successively larger trees as follows:
>>> tree3 = nltk.Tree('VP', ['chased', tree2])
>>> tree4 = nltk.Tree('S', [tree1, tree3])
>>> print(tree4)
(S (NP Alice) (VP chased (NP the rabbit)))
Here are some of the methods available for tree objects:
>>> print(tree4[1])
(VP chased (NP the rabbit))
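To show how operations of this kind can work, the following is a small plain-Python sketch of a tree stored as a list of children plus a label. MiniTree is a hypothetical stand-in written for this illustration; it is not NLTK's Tree class and makes no claim to match its implementation. In particular, it stores the label as a plain attribute and treats anything that is not a MiniTree as a leaf.

```python
class MiniTree(list):
    """Hypothetical stand-in for a labeled tree: a list of children plus a label."""

    def __init__(self, label, children):
        super().__init__(children)
        self.label = label

    def leaves(self):
        """Return the leaves in left-to-right order."""
        result = []
        for child in self:
            if isinstance(child, MiniTree):
                result.extend(child.leaves())
            else:
                result.append(child)
        return result

    def height(self):
        """Count a leaf as height 1, so a node directly over leaves has height 2."""
        child_heights = (c.height() if isinstance(c, MiniTree) else 1 for c in self)
        return 1 + max(child_heights, default=0)

    def __str__(self):
        # Bracketed form: the label followed by each child in order.
        return "(" + " ".join([self.label] + [str(c) for c in self]) + ")"


np1 = MiniTree('NP', ['Alice'])
np2 = MiniTree('NP', ['the', 'rabbit'])
vp = MiniTree('VP', ['chased', np2])
s = MiniTree('S', [np1, vp])

print(s)            # (S (NP Alice) (VP chased (NP the rabbit)))
print(s[1])         # (VP chased (NP the rabbit))
print(s[1].label)   # VP
print(s.leaves())   # ['Alice', 'chased', 'the', 'rabbit']
print(s[1][1][1])   # rabbit
```

Because the tree is itself a list, ordinary indexing walks down the structure one child at a time: s[1] is the second child of the root, and s[1][1][1] is the second leaf of the NP inside the VP.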
The bracketed representation for complex trees can be difficult to read. In these cases, the draw method can be very useful. It opens a new window containing a graphical representation of the tree. The tree display window allows you to zoom in and out, to collapse and expand subtrees, and to print the graphical representation to a PostScript file (for inclusion in a document).
>>> tree3.draw()