Working with xml.dom.minidom

The xml.dom.minidom module is designed to help you work with XML using the DOM approach. However, this module is far from complete in IronPython, partly because the standard Python version relies on CPython-specific support. The actual document support is complete, so you won't have a problem building, editing, and managing XML documents. It's the read and write support that is lacking.

Fortunately, you can overcome write issues by using a different approach to outputting the document to disk (or other media). Standard Python development practice is to use the xml.dom.ext.PrettyPrint() method, which simply doesn't exist in IronPython. You get around the problem by performing the task in two steps, rather than one, as shown in Listing 13-3.

The reading problem isn't as easy to solve. Standard Python development practice is to use the xml.dom.minidom.parse() method. This method does exist in IronPython, but it outputs an error stating:

ImportError: No module named pyexpat

This module actually is missing. In order to fix this problem, you must download the pyexpat.py file from Place this file in your \Program Files\IronPython 2.6\Lib folder, not the \Program Files\IronPython 2.6\Lib\xml\dom folder as you might think. As shown in Listing 13-3, the standard Python techniques work just fine now.
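A quick way to confirm that the file landed in the right folder is to try the import directly before running any parsing code. The following is only a minimal sketch; the helper name expat_available() is hypothetical and not part of the original listing.

```python
def expat_available():
    """Return True when the pyexpat module can be imported."""
    try:
        # xml.dom.minidom.parse() depends on this module being importable.
        import pyexpat
    except ImportError:
        return False
    return True

if expat_available():
    print('pyexpat found; xml.dom.minidom.parse() should be usable.')
else:
    print('pyexpat is missing; check the contents of the Lib folder.')
```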

LISTING 13-3: Managing XML documents using the Python approach

# Import the required XML support.
import xml.dom.minidom

def CreateDocument():

    # Create an XML document.
    Doc = xml.dom.minidom.Document()

    # Create the root node.
    Root = Doc.createElement('root')

    # Add the message nodes.
    MsgNode = Doc.createElement('Message')
    Message = Doc.createTextNode('Hello')
    MsgNode.appendChild(Message)
    Root.appendChild(MsgNode)

    MsgNode = Doc.createElement('Message')
    Message = Doc.createTextNode('Goodbye')
    MsgNode.appendChild(Message)
    Root.appendChild(MsgNode)

    # Append the root node to the document.
    Doc.appendChild(Root)

    # Create the output document.
    MyFile = open('Test2.XML', 'w')

    # Write the output.
    MyFile.write(Doc.toprettyxml())

    # Close the document.
    MyFile.close()

def DisplayDocument():

    # Read the existing XML document.
    XMLDoc = xml.dom.minidom.parse('Test2.XML')

    # Print the message node content.
    for ThisChild in XMLDoc.getElementsByTagName('Message'):
        print 'Message:', ThisChild.firstChild.toxml().strip('\n\t')

CreateDocument()
DisplayDocument()

# Pause after the debug session.
raw_input('\nPress any key to continue...')

The first thing you should notice is that the code for this example is much shorter than its .NET counterpart, even though the result is essentially the same. Despite the problems with the Python libraries, you can write concise code for manipulating XML using Python.

The code begins by importing the only module it needs, xml.dom.minidom. It then calls CreateDocument() and DisplayDocument() in turn, just as the .NET example does. In fact, the output from this example is precisely the same. You see the same output shown in Figure 13-2 when you run this example.

The CreateDocument() function begins by creating an XML document, Doc, using xml.dom.minidom.Document(). The XML document automatically contains the XML declaration, so unlike the .NET version of the code, you don't need to add it manually. So the first processing task is to create the root node using Doc.createElement('root').
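To see the automatic declaration for yourself, you can serialize a nearly empty document. This is a minimal sketch with illustrative variable names, assuming the standard minidom serializer; the declaration shows up at the start of the output without any explicit call to add it.

```python
import xml.dom.minidom

# Build the smallest useful document: a single root element.
doc = xml.dom.minidom.Document()
doc.appendChild(doc.createElement('root'))

# toxml() serializes the document; the XML declaration is emitted
# automatically ahead of the root element.
xml_text = doc.toxml()
print(xml_text)
```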

As with the .NET example, this example creates two MsgNode elements that contain different messages. The technique used is different from the .NET example. Instead of setting an InnerXml property, the code creates an actual text node using Doc.createTextNode(). However, the result is the same, as shown in Figure 13-6. The last step is to add Root to Doc using Doc.appendChild().
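The element-plus-text-node pattern repeats for every message, so it lends itself to a small loop. The sketch below is one possible way to express it; the helper name build_message_document() is hypothetical and does not appear in the original listing.

```python
import xml.dom.minidom

def build_message_document(messages):
    """Return a Document whose root holds one Message element per string."""
    doc = xml.dom.minidom.Document()
    root = doc.createElement('root')
    for text in messages:
        node = doc.createElement('Message')
        # The text lives in its own DOM Text node under the element.
        node.appendChild(doc.createTextNode(text))
        root.appendChild(node)
    doc.appendChild(root)
    return doc

doc = build_message_document(['Hello', 'Goodbye'])
xml_text = doc.toxml()
print(xml_text)
```

Because each element must be appended to Root explicitly, forgetting that call silently drops the element from the output.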

A big difference between IronPython and Python is how you write the XML to a file. As previously mentioned, you can't use the xml.dom.ext.PrettyPrint() method. In this case, the code creates a file, MyFile, using open(). The arguments define the filename and the mode, where 'w' signifies write. In order to write the text to a file, you use a two-step process. First, the code creates formatted XML by calling Doc.toprettyxml(). The function accepts an optional encoding argument, but there isn't any way to define the resulting XML document as stand-alone using the standalone="yes" attribute (see Figure 13-1). Second, the code writes the data to the file buffer using MyFile.write().
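The two steps can be kept visibly separate, which also makes it easy to try the optional encoding argument. The following is a sketch under stated assumptions: the 'utf-8' encoding, the two-space indent, and the temporary output folder are illustrative choices, not values taken from the listing. The file is opened in binary mode because the encoded result is a byte string.

```python
import os
import tempfile
import xml.dom.minidom

doc = xml.dom.minidom.Document()
root = doc.createElement('root')
doc.appendChild(root)

# Step 1: create the formatted XML. With an encoding supplied, the
# declaration names that encoding and the result is a byte string.
pretty_xml = doc.toprettyxml(indent='  ', encoding='utf-8')

# Step 2: write the formatted XML to the file buffer, then close the file.
path = os.path.join(tempfile.mkdtemp(), 'Test2.XML')
my_file = open(path, 'wb')
my_file.write(pretty_xml)
my_file.close()
```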

Calling MyFile.write() doesn't necessarily write the data to disk right away, because the output is buffered. In order to flush the file buffer, you must call MyFile.close(). Theoretically, IronPython will call MyFile.close() when the application ends, but there isn't any guarantee of this behavior, so you must specifically call MyFile.close() to ensure there isn't any data loss.
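One way to make sure the close always happens, even when an exception interrupts the write, is the with statement. This is a sketch that assumes your interpreter supports with for file objects (it is part of the Python 2.6 language level); the temporary path and file content are illustrative.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'Test2.XML')

# The file is closed automatically when the block ends, which flushes
# the buffer to disk even if write() raises an exception.
with open(path, 'w') as my_file:
    my_file.write('<root/>')

print(my_file.closed)
```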

The DisplayDocument() function comes next. Reading an XML document from disk and placing it in a variable is almost too easy when using IronPython. All you need to do is make a single call to xml.dom.minidom.parse(). That's it! The document is immediately ready for use.
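As a small illustration of that single call, the sketch below writes a tiny document to a temporary file and reads it back. The file location and content are assumptions made for the example; parse() also accepts an open file object in place of a filename.

```python
import os
import tempfile
import xml.dom.minidom

# Create a small XML file to read back (illustrative content).
path = os.path.join(tempfile.mkdtemp(), 'Test2.XML')
with open(path, 'w') as out:
    out.write('<root><Message>Hello</Message></root>')

# One call loads the whole document into memory as a DOM tree.
xml_doc = xml.dom.minidom.parse(path)
root_name = xml_doc.documentElement.tagName
print(root_name)
```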

The second step is to display the same output shown in Figure 13-2. Again, all you need in IronPython is a simple for loop, rather than the somewhat lengthy .NET code. In this case, you ask IronPython to retrieve the nodes you want using XMLDoc.getElementsByTagName(). The output is a list that you can process one element at a time. The print statement relies on a complex-looking call sequence, ThisChild.firstChild.toxml().strip('\n\t').


However, if you take this call sequence apart, it really isn't all that hard to understand. Every iteration of the loop places one of the MsgNode elements in ThisChild. The first (and only) child of MsgNode is the Message text node, so you can retrieve it using the firstChild property. The firstChild property contains a DOM Text node object, so you convert it to XML using the toxml() method. Unfortunately, the resulting string contains the newline and tab characters that pretty printing placed around the text, so you remove them using the strip('\n\t') method. The result is a simple value output.
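Splitting the call sequence into named steps makes each stage easy to inspect. The sketch below parses a hand-written string that imitates pretty-printed output (an assumption made to keep the example self-contained) and collects the cleaned values in a list.

```python
import xml.dom.minidom

# A string laid out the way pretty-printed output can look, with
# newlines and tabs surrounding the message text.
source = '<root>\n\t<Message>\n\t\tHello\n\t</Message>\n</root>'
doc = xml.dom.minidom.parseString(source)

messages = []
for this_child in doc.getElementsByTagName('Message'):
    text_node = this_child.firstChild   # the DOM Text node
    raw = text_node.toxml()             # text plus surrounding whitespace
    clean = raw.strip('\n\t')           # drop leading/trailing newlines and tabs
    messages.append(clean)

print(messages)
```

The Text node also exposes its content through the data attribute, which returns the unescaped text, so text_node.data.strip('\n\t') is a reasonable alternative when the message may contain characters such as < or &.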
