CSV

Language analysis work often involves data tabulations, containing information about lexical items, the participants in an empirical study, or the linguistic features extracted from a corpus. Here's a fragment of a simple lexicon, in CSV format:

sleep, sli:p, v.i, a condition of body and mind ...
walk, wo:k, v.intr, progress by lifting and setting down each foot ...
wake, weik, intrans, cease to sleep

We can use Python's CSV library to read and write files stored in this format. For example, we can open a CSV file called lexicon.csv and iterate over its rows:

>>> import csv
>>> input_file = open("lexicon.csv", "rb")
>>> for row in csv.reader(input_file):
...     print row
['sleep', 'sli:p', 'v.i', 'a condition of body and mind ...']
['walk', 'wo:k', 'v.intr', 'progress by lifting and setting down each foot ...']
['wake', 'weik', 'intrans', 'cease to sleep']
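The session above shows only the reading side, and it uses Python 2 syntax (the "rb" mode and the print statement). The following is a minimal sketch of both reading and writing, assuming Python 3. The in-memory string lexicon_text is a stand-in for lexicon.csv so that the example runs without a file on disk, and the names rows and written are illustrative only. With a real file under Python 3, one would typically open it in text mode with open("lexicon.csv", newline="") rather than "rb".

```python
import csv
import io

# Stand-in for the contents of lexicon.csv (assumption: same three rows as the fragment).
lexicon_text = (
    "sleep, sli:p, v.i, a condition of body and mind ...\n"
    "walk, wo:k, v.intr, progress by lifting and setting down each foot ...\n"
    "wake, weik, intrans, cease to sleep\n"
)

# skipinitialspace=True discards the blank that follows each comma in the fragment;
# without it, fields such as ' sli:p' would keep their leading space.
reader = csv.reader(io.StringIO(lexicon_text), skipinitialspace=True)
rows = list(reader)

# Write the rows back out in CSV form. lineterminator="\n" overrides the
# writer's default of "\r\n" to keep the result easy to inspect.
output = io.StringIO()
writer = csv.writer(output, lineterminator="\n")
writer.writerows(rows)
written = output.getvalue()

for row in rows:
    print(row)
```

Here csv.writer adds quotes only around fields that need them (for example, fields containing a comma), so none of these fields is quoted in the output.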

Each row is just a list of strings. If any fields contain numerical data, they will appear as strings, and will have to be converted using int() or float().
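To illustrate the conversion step, here is a small sketch, assuming Python 3. The lexicon fragment contains no numbers, so the table below (a word, a frequency count, and a mean duration in seconds) is invented purely for illustration, as are the names table_text, records, and total_count.

```python
import csv
import io

# Hypothetical table: word, frequency count, mean duration in seconds.
# The values are made up for this example and do not come from the lexicon.
table_text = "sleep,12,0.41\nwalk,30,0.38\nwake,7,0.35\n"

records = []
for word, count, duration in csv.reader(io.StringIO(table_text)):
    # Every field arrives as a string, so numeric columns are converted explicitly.
    records.append((word, int(count), float(duration)))

# Arithmetic works only after conversion; summing the raw strings would fail.
total_count = sum(count for _, count, _ in records)
print(records)
print(total_count)
```

Both int() and float() raise ValueError when a field is not a well-formed number, so real data with empty or malformed cells may need a check before conversion.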

Figure 4-5. Visualization with NetworkX and Matplotlib: Part of the WordNet hypernym hierarchy is displayed, starting with dog.n.01 (the darkest node in the middle); node size is based on the number of children of the node, and color is based on the distance of the node from dog.n.01; this visualization was produced by the program in Example 4-11.

