## Accumulative Functions

These functions start by initializing some storage, then iterate over the input to build it up, and finally return some object (a large structure or an aggregated result). A standard way to do this is to initialize an empty list, accumulate the material, and then return the list, as shown in the function search1() in Example 4-5.

Example 4-5. Accumulating output into a list.

    import nltk

    def search1(substring, words):
        # Accumulate every matching word in a list, then return the whole list.
        result = []
        for word in words:
            if substring in word:
                result.append(word)
        return result

    def search2(substring, words):
        # Generator version: hand back one matching word at a time.
        for word in words:
            if substring in word:
                yield word

    print("search1:")
    for item in search1('zz', nltk.corpus.brown.words()):
        print(item)
    print("search2:")
    for item in search2('zz', nltk.corpus.brown.words()):
        print(item)

The function search2() is a generator. Calling it returns a generator object without running any of the function body. When the calling program asks for the first item, the function runs as far as the yield statement and pauses. The calling program gets the first word and does any necessary processing. Once the calling program is ready for another word, execution of the function continues from where it stopped, until the next time it encounters a yield statement. This approach is typically more efficient, as the function generates data only as the calling program requires it, and does not need to allocate additional memory to store the output (see the earlier discussion of generator expressions).
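The pause-and-resume behavior can be observed directly. The following is a minimal sketch, not part of Example 4-5: the short word list stands in for the Brown Corpus, and the names search2_traced and trace are hypothetical, introduced only to record which words the generator has examined at each point.

```python
# Hypothetical toy word list standing in for the Brown Corpus.
words = ['jazz', 'blues', 'puzzle', 'rock', 'fizzy']

trace = []  # every word the generator has examined so far

def search2_traced(substring, words):
    # Same logic as search2(), plus a record of each word examined.
    for word in words:
        trace.append(word)
        if substring in word:
            yield word

matches = search2_traced('zz', words)  # creates the generator; body has not run
seen_before_start = list(trace)        # snapshot: nothing examined yet

first = next(matches)                  # runs up to the first yield, then pauses
seen_after_first = list(trace)

second = next(matches)                 # resumes just after the yield
seen_after_second = list(trace)

rest = list(matches)                   # drains whatever is left

print(seen_before_start)
print(first, seen_after_first)
print(second, seen_after_second)
print(rest)
```

Each snapshot shows that the generator has looked only as far into the input as it needed to in order to produce the items requested so far.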

Here's a more sophisticated example of a generator that produces all permutations of a list of words. To force the permutations() function to generate all its output, we wrap it in a call to list() ①.

    >>> def permutations(seq):
    ...     if len(seq) <= 1:
    ...         yield seq
    ...     else:
    ...         for perm in permutations(seq[1:]):
    ...             for i in range(len(perm)+1):
    ...                 yield perm[:i] + seq[0:1] + perm[i:]
    ...
    >>> list(permutations(['police', 'fish', 'buffalo']))  # ①
    [['police', 'fish', 'buffalo'], ['fish', 'police', 'buffalo'],
     ['fish', 'buffalo', 'police'], ['police', 'buffalo', 'fish'],
     ['buffalo', 'police', 'fish'], ['buffalo', 'fish', 'police']]

The permutations function uses a technique called recursion, discussed later in Section 4.7. The ability to generate permutations of a set of words is useful for creating data to test a grammar (Chapter 8).
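Because permutations() is a generator, it does not have to be run to completion. The sketch below assumes the usual form of the function, in which the first element of the sequence is inserted at every position of each permutation of the remaining elements. It uses itertools.islice from the standard library to take only the first two permutations, then counts the permutations of a four-word list without ever holding them all in memory. The variable names first_two and count are illustrative only.

```python
from itertools import islice

def permutations(seq):
    # Insert the first element at every position of each
    # permutation of the remaining elements.
    if len(seq) <= 1:
        yield seq
    else:
        for perm in permutations(seq[1:]):
            for i in range(len(perm) + 1):
                yield perm[:i] + seq[0:1] + perm[i:]

words = ['police', 'fish', 'buffalo']

# Take only the first two permutations; the rest are never computed.
first_two = list(islice(permutations(words), 2))
print(first_two)

# Count permutations one at a time instead of building a full list.
count = sum(1 for _ in permutations(words + ['badger']))
print(count)
```

When generating test data for a grammar, this makes it possible to sample a few word orders from a long sentence, where the full set of permutations would be too large to store.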