Retrieving a Web Page

In its simplest form, retrieving a page takes only two function calls, which is sufficient in most cases where you are only fetching content and not submitting any information. The following example uses the urlopen() function, which performs an HTTP GET request if no additional form data is supplied. We'll look at different methods of submitting data to web applications later in the chapter.

>>> import urllib2
>>> r = urllib2.urlopen('http://news.bbc.co.uk')
>>> html = r.read()
>>> len(html)
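The GET-versus-POST rule mentioned above can be checked without opening a connection. The following is a minimal sketch, not part of the original example: it builds two Request objects and asks each which HTTP method it would use. The form data 'q=python' is a made-up placeholder, and the import fallback is an assumption added so that the sketch also runs on Python 3, where urllib2 was merged into urllib.request.

```python
try:
    import urllib2  # Python 2, as used in this chapter
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 fallback

URL = 'http://news.bbc.co.uk'

# Building a Request object does not contact the server.
plain = urllib2.Request(URL)
with_data = urllib2.Request(URL, b'q=python')  # hypothetical form data

# No data means GET; supplying data switches the request to POST.
print(plain.get_method())
print(with_data.get_method())
```

Calling urlopen() with a plain URL string behaves like the first request here, which is why the example above results in a GET.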

The result of the read() call is a string containing the web page as it is served by the server. This string is not the full response, however: it does not include extra information such as the HTTP protocol headers. The object returned by the urlopen() call has an info() method, which you can use to retrieve the HTTP headers as they were returned by the server. Keep in mind that the object returned by info() is an instance of the httplib.HTTPMessage class, which implements the same protocol as the dictionary class but is not a dictionary itself.
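The dictionary-like behavior of the info() result can be sketched as follows. To keep the example runnable without network access, it assumes a stand-in for a remote page: a temporary local file opened through a file:// URL, for which urlopen() synthesizes a few headers itself. A real HTTP response is inspected through the same info() call, but the header set and the exact class of the returned object will differ. The temporary file, the POSIX-style path handling, and the Python 3 import fallback are all assumptions of this sketch.

```python
import os
import tempfile

try:
    import urllib2  # Python 2, as used in this chapter
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 fallback

# Hypothetical stand-in for a remote page: a small local HTML file.
fd, path = tempfile.mkstemp(suffix='.html')
os.write(fd, b'<html><body>hello</body></html>')
os.close(fd)

try:
    # Assumes a POSIX-style absolute path, so 'file://' + path is a valid URL.
    r = urllib2.urlopen('file://' + path)
    html = r.read()
    headers = r.info()
    r.close()
finally:
    os.remove(path)

print(isinstance(headers, dict))        # the headers object is not a dict
print(headers['Content-Length'])        # item lookup, case-insensitive
print(headers.get('X-Missing', 'n/a'))  # dict-style default for absent keys
for name in headers.keys():
    print('%s: %s' % (name, headers[name]))
```

Because the object only mimics a dictionary, code that requires a genuine dict (for example, serializing the headers) should first copy the entries into one, such as dict(headers.items()).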
