Shift Reduce Parsing

A simple kind of bottom-up parser is the shift-reduce parser. In common with all bottom-up parsers, a shift-reduce parser tries to find sequences of words and phrases that correspond to the righthand side of a grammar production, and replace them with the lefthand side, until the whole sentence is reduced to an S.

The shift-reduce parser repeatedly pushes the next input word onto a stack (Section 4.1); this is the shift operation. If the top n items on the stack match the n items on the righthand side of some production, then they are all popped off the stack, and the item on the lefthand side of the production is pushed onto the stack. This replacement of the top n items with a single item is the reduce operation. The operation may be applied only to the top of the stack; reducing items lower in the stack must be done before later items are pushed onto the stack. The parser finishes when all the input is consumed and there is only one item remaining on the stack, a parse tree with an S node as its root. The shift-reduce parser builds a parse tree during the above process. Each time it pops n items off the stack, it combines them into a partial parse tree, and pushes this back onto the stack. We can see the shift-reduce parsing algorithm in action using the graphical demonstration Six stages of the execution of this parser are shown in Figure 8-5.

1. Initial state


Remaining Te*l

lie be

» saw a rnirt in H>t paifc

3. After reduce shift reduce


Remaining Text

Del N

saw a man in me paifc

Bit dqq

5, After building

a complex NP


ReinflliU»c| lam

Uk ft m

2. After one shift


RernniMitg Teii

MP V MP «1

Ihe par*

iDi Del N

it« Ag

• m«i

6. Built a complete parse tree




J 1 L

fits LW



1 1

1 /X

n OH (t 1 1

Ihfl poll*

Figure 8-5. Six stages of a shift-reduce parser: The parser begins by shifting the first input word onto its stack; once the top items on the stack match the righthand side of a grammar production, they can be replaced with the lefthand side of that production; the parser succeeds once all input is consumed and one S item remains on the stack.

NLTK provides ShiftReduceParser(), a simple implementation of a shift-reduce parser. This parser does not implement any backtracking, so it is not guaranteed to find a parse for a text, even if one exists. Furthermore, it will only find at most one parse, even if more parses exist. We can provide an optional trace parameter that controls how verbosely the parser reports the steps that it takes as it parses a text:

>>> sr_parse = nltk.ShiftReduceParser(grammar1) >>> sent = 'Mary saw a dog'.split() >>> print sr_parse.parse(sent)

Your Turn: Run this parser in tracing mode to see the sequence of shift and reduce operations, using sr_parse = nltk.ShiftReduceParser(gram mar1, trace=2).

A shift-reduce parser can reach a dead end and fail to find any parse, even if the input sentence is well-formed according to the grammar. When this happens, no input remains, and the stack contains items that cannot be reduced to an S. The problem arises because there are choices made earlier that cannot be undone by the parser (although users of the graphical demonstration can undo their choices). There are two kinds of choices to be made by the parser: (a) which reduction to do when more than one is possible and (b) whether to shift or reduce when either action is possible.

A shift-reduce parser may be extended to implement policies for resolving such conflicts. For example, it may address shift-reduce conflicts by shifting only when no reductions are possible, and it may address reduce-reduce conflicts by favoring the reduction operation that removes the most items from the stack. (A generalization of the shift-reduce parser, a "lookahead LR parser," is commonly used in programming language compilers.)

The advantages of shift-reduce parsers over recursive descent parsers is that they only build structure that corresponds to the words in the input. Furthermore, they only build each substructure once; e.g., NP(Det(the), N(man)) is only built and pushed onto the stack a single time, regardless of whether it will later be used by the VP -> V NP PP reduction or the NP -> NP PP reduction.

0 0

Post a comment