Trees

A tree is a set of connected labeled nodes, each reachable by a unique path from a distinguished root node. Here's an example of a tree (note that they are standardly drawn upside-down):

              S
           /     \
         NP       VP
         |      /    \
       Alice   V      NP
               |      |
            chased  the rabbit
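The defining property, that every node is reachable from the root by exactly one path, can be made concrete with a minimal sketch. The nested-tuple encoding and the helper name `paths` are assumptions made for this illustration and are not part of NLTK.

```python
# A node is encoded as (label, children); a leaf is a plain string.
# This encoding is an assumption made for illustration only.
tree = ('S', [
    ('NP', ['Alice']),
    ('VP', [
        ('V', ['chased']),
        ('NP', ['the', 'rabbit']),
    ]),
])

def paths(node, prefix=()):
    """Yield (path, label) for every node and leaf, where path is the
    sequence of child indices followed from the root."""
    if isinstance(node, str):
        yield prefix, node
        return
    label, children = node
    yield prefix, label
    for i, child in enumerate(children):
        yield from paths(child, prefix + (i,))

all_paths = list(paths(tree))
for path, label in all_paths:
    print(path, label)
```

Each path appears exactly once in the listing, and the empty path `()` identifies the root.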

We use a 'family' metaphor to talk about the relationships of nodes in a tree: for example, S is the parent of VP; conversely VP is a child of S. Also, since NP and VP are both children of S, they are also siblings. For convenience, there is also a text format for specifying trees:

(S
   (NP Alice)
   (VP
      (V chased)
      (NP the rabbit)))
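The bracketed format can be read back into a nested structure with a short recursive parser, after which the family relationships fall out directly. The following is only a sketch under simplifying assumptions: labels and leaves contain no whitespace or parentheses, and the input is well formed. The names `parse` and `child_labels` are hypothetical and do not belong to NLTK.

```python
import re

def parse(text):
    """Parse a bracketed string such as '(V chased)' into (label, children)."""
    tokens = re.findall(r'\(|\)|[^\s()]+', text)
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == '('
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ')':
            if tokens[pos] == '(':
                children.append(node())      # nested subtree
            else:
                children.append(tokens[pos])  # leaf
                pos += 1
        pos += 1  # consume the closing ')'
        return (label, children)

    return node()

def child_labels(node):
    """Return the labels of a node's children (leaves are returned as-is)."""
    return [c[0] if isinstance(c, tuple) else c for c in node[1]]

s = parse('(S (NP Alice) (VP (V chased) (NP the rabbit)))')
print(child_labels(s))        # children of S; these are siblings of each other
print(child_labels(s[1][1]))  # children of VP
```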

Although we will focus on syntactic trees, trees can be used to encode any homogeneous hierarchical structure that spans a sequence of linguistic forms (e.g., morphological structure, discourse structure). In the general case, leaves and node values do not have to be strings.
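As a small illustration of non-string leaves, the sketch below builds a tree whose leaves are (word, tag) pairs rather than bare strings. It assumes NLTK 3, where the `Tree` class can be imported from `nltk.tree`; the tags are chosen for the example only.

```python
from nltk.tree import Tree

# Leaves are (word, tag) tuples instead of plain strings.
noun_phrase = Tree('NP', [('the', 'DT'), ('rabbit', 'NN')])
verb_phrase = Tree('VP', [('chased', 'VBD'), noun_phrase])

print(verb_phrase.label())
print(verb_phrase.leaves())
```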

In NLTK, we create a tree by giving a node label and a list of children:

>>> import nltk
>>> tree1 = nltk.Tree('NP', ['Alice'])
>>> print(tree1)
(NP Alice)
>>> tree2 = nltk.Tree('NP', ['the', 'rabbit'])
>>> print(tree2)
(NP the rabbit)

We can incorporate these into successively larger trees as follows:

>>> tree3 = nltk.Tree('VP', ['chased', tree2])
>>> tree4 = nltk.Tree('S', [tree1, tree3])
>>> print(tree4)
(S (NP Alice) (VP chased (NP the rabbit)))

Here are some of the methods available for tree objects:

>>> print(tree4[1])
(VP chased (NP the rabbit))
>>> tree4[1][1][1]
'rabbit'
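For readers on a recent NLTK release, the sketch below exercises a few further `Tree` methods on the same sentence. It assumes NLTK 3, where the node label is read with `label()`; older releases exposed it as the `node` attribute instead. The trees are rebuilt here so the example runs on its own.

```python
from nltk.tree import Tree

tree1 = Tree('NP', ['Alice'])
tree2 = Tree('NP', ['the', 'rabbit'])
tree3 = Tree('VP', ['chased', tree2])
tree4 = Tree('S', [tree1, tree3])

print(tree4[1].label())   # label of the second child of S
print(tree4.leaves())     # all leaves, left to right
print(tree4.height())     # number of levels, counting the leaf level

# subtrees() visits the tree itself first, then its subtrees in preorder.
labels = [t.label() for t in tree4.subtrees()]
print(labels)
```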

The bracketed representation for complex trees can be difficult to read. In these cases, the draw method can be very useful. It opens a new window containing a graphical representation of the tree. The tree display window allows you to zoom in and out, to collapse and expand subtrees, and to print the graphical representation to a PostScript file (for inclusion in a document).

>>> tree3.draw()
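When no graphical display is available, a text rendering may be a workable substitute. The sketch below assumes NLTK 3, whose `Tree` objects provide `pretty_print()` for drawing the tree with plain characters; `tree3` is rebuilt here so the example runs on its own.

```python
import io
from nltk.tree import Tree

tree3 = Tree('VP', ['chased', Tree('NP', ['the', 'rabbit'])])

# Draw the tree as text on standard output instead of opening a window.
tree3.pretty_print()

# The same rendering can be captured as a string by passing a stream.
buffer = io.StringIO()
tree3.pretty_print(stream=buffer)
rendering = buffer.getvalue()
```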

