XML-RPC Quick Start: Get Tech News from Meerkat

Meerkat is a public web application that aggregates technology news from hundreds of weblogs and news sites. It was one of the first web applications to expose a web service interface: first a REST-like interface and then an XML-RPC interface.

Meerkat's XML-RPC interface is described at www.oreillynet.com/pub/a/rss/2000/11/14/meerkat_xmlrpc.html.

Meerkat's API exposes access to three types of objects: channels (weblogs and news sites), categories (groupings of channels), and items (stories published by channels). Unfortunately, to use any of the functions that deal with channels or categories, you must do some legwork ahead of time to ascertain the numeric channel or category IDs. The most generally useful method is therefore getItems, a search function that tries to match your search criteria against Meerkat's database of recently posted news items.

Here's a simple script, MeerkatSummary.py, that takes a search criterion as input and determines which Meerkat channels have the most stories that match the search:

import xmlrpclib

class MeerkatSummary:
    """Lists channels that match a search term, in order of how many
    stories match."""

    SERVER_URL = 'http://www.oreillynet.com/meerkat/xml-rpc/server.php'

    def __init__(self):
        "Set up a reference to the Meerkat server."
        #Passing 'verbose=True' to the server constructor will make it
        #print the text of the request and response for each XML-RPC
        #call, letting you see the internal workings of the protocol.
        #verbose = True
        verbose = False
        server = xmlrpclib.ServerProxy(self.SERVER_URL, verbose=verbose)
        self.meerkat = server.meerkat

    def findChannels(self, searchTerm):
        "Given a search term, find out which channels have the most hits."
        channelTotals = {}
        items = self.meerkat.getItems({'search' : searchTerm}) #Further criteria are cut off in this listing.
        for item in items:
            channel = item['channel']
            totalForChannel = channelTotals.get(channel, 0)
            totalForChannel += 1
            channelTotals[channel] = totalForChannel

        #Turn the map into a list of (matches, channel name) tuples, and sort it.
        totalAndChannel = [(a,b) for b,a in channelTotals.items()]
        totalAndChannel.sort()
        totalAndChannel.reverse()

        print 'Meerkat report for "%s":' % searchTerm
        for total, channel in totalAndChannel:
            print "%2d %s" % (total, channel)

The actual web service call is self.meerkat.getItems, on the third line of MeerkatSummary.findChannels. If you blink you'll miss it, because as far as Python is concerned, it's just another method call, albeit one that's implemented differently than a local method call. Behind the scenes, xmlrpclib intercepts the attribute lookup on the ServerProxy and hands back a callable object whose __call__ method handles the XML-RPC for getItems.
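The mechanism can be imitated in a few lines. The following is a simplified Python 3 sketch of the idea, not xmlrpclib's actual code: RecordingProxy and fake_send are hypothetical names, and the "transport" just echoes what it would have sent instead of contacting a server.

```python
class RecordingProxy:
    """Toy stand-in for ServerProxy: attribute access builds a method name."""

    def __init__(self, send, prefix=""):
        self._send = send      # function that would transmit the request
        self._prefix = prefix  # dotted method name accumulated so far

    def __getattr__(self, name):
        # Invoked only for attributes that don't exist (.meerkat, .getItems).
        dotted = "%s.%s" % (self._prefix, name) if self._prefix else name
        return RecordingProxy(self._send, dotted)

    def __call__(self, *args):
        # Calling the object "sends" the accumulated name plus the arguments.
        return self._send(self._prefix, args)


def fake_send(method_name, params):
    # Hypothetical transport: report the call instead of POSTing it.
    return (method_name, params)


server = RecordingProxy(fake_send)
result = server.meerkat.getItems({"search": "Python"})
print(result)
```

No method named getItems is defined anywhere in the sketch; the name is only discovered at the moment the attribute is looked up, which is why a single generic client can talk to any XML-RPC server.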

The previous section's WishListBargainFinder also hid the complexity of a web service behind a standard Python method: in that case, it was amazon.searchWishList that activated the REST web service. The difference is that someone had to write a Python method called "searchWishList" that made an AWS-specific REST request and processed the AWS-specific response. The getItems method is handled by xmlrpclib; there's no special code for dealing with the Meerkat XML-RPC server, and no need for an actual Python method called getItems:

if __name__ == '__main__':
    import sys
    if len(sys.argv) != 2:
        print "Usage: %s [search term]" % sys.argv[0]
        sys.exit(1)
    else:
        MeerkatSummary().findChannels(sys.argv[1])

Run the script, and you'll see a variety of news channels that have mentioned Python:

$ python MeerkatSummary.py Python
Meerkat report for "Python":
22 Freshmeat Daily News
10 Python URL (daily updates)
 8 Vaults of Parnassus
 2 Python Cookbook
 1 Zope org
 1 SourceForge Project News
 1 Python org latest headlines
 1 NetBSD Packages
 1 Linux Weekly News
 1 Linux Today
 1 IceWalkers
 1 Developer Shed

James Joyce, who got lots of results from an Amazon Web Services query, doesn't do as well here:

$ python MeerkatSummary.py "James Joyce"
Meerkat report for "James Joyce":
 1 kottke.org
 1 Beyond the Beyond

Note that Meerkat's getItems method will never return more than 50 results, and unlike with Amazon Web Services, there's no pagination interface that lets you get more. Any script you write that runs against Meerkat will be limited to delving into the recent past.
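The tallying that findChannels performs doesn't depend on the network, so it can be tried out offline. Here is a minimal Python 3 sketch under stated assumptions: the sample items are made up for illustration, shaped like the dictionaries the script reads (only the 'channel' key is used), and summarize is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical sample data: each item is a dict with a 'channel' key,
# which is the only field the script's counting loop looks at.
sample_items = [
    {"channel": "Freshmeat Daily News"},
    {"channel": "Vaults of Parnassus"},
    {"channel": "Freshmeat Daily News"},
    {"channel": "Python Cookbook"},
    {"channel": "Freshmeat Daily News"},
    {"channel": "Vaults of Parnassus"},
]


def summarize(items):
    """Return (total, channel) pairs, highest total first."""
    totals = Counter(item["channel"] for item in items)
    # Same ordering as sort-then-reverse: by count, then by name, descending.
    return sorted(((n, ch) for ch, n in totals.items()), reverse=True)


report = summarize(sample_items)
for total, channel in report:
    print("%2d %s" % (total, channel))
```

Counter replaces the manual get-and-increment loop, and sorting the (total, channel) pairs in reverse explains why channels with equal totals appear in reverse alphabetical order in the reports above.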

The XML-RPC Request

The XML-RPC request body is the body of an HTTP POST request. It's an XML document containing a methodCall element. The methodCall element contains two elements of its own: methodName, which designates the method to be called; and params, which contains a list of the parameters to be passed as arguments into the method.

Here's a sample XML-RPC request for a hypothetical method that sorts a list of numbers in either ascending or descending order:

<?xml version="1.0"?>
<methodCall>
  <methodName>searchsort.sortList</methodName>
  <params>
    <param>
      <value>
        <array>
          <data>
            <value><i4>10</i4></value>
            <value><i4>2</i4></value>
          </data>
        </array>
      </value>
    </param>
    <param><value><boolean>1</boolean></value></param>
  </params>
</methodCall>

This is the XML-RPC equivalent of invoking a hypothetical local method with the following code:

import searchsort
searchsort.sortList([10, 2], True)

Given what you know about xmlrpclib, it's no surprise that this method request would be generated and POSTed when you ran code like this:

import xmlrpclib
xmlrpclib.ServerProxy("http://sortserver/RPC").searchsort.sortList([10, 2], True)
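You can inspect the request body without a server at all. The sketch below is written for Python 3, where xmlrpclib is available as xmlrpc.client; it serializes the same hypothetical sortList call with dumps and then parses it back with loads, much as a server would. The library may write <int> where the sample above uses <i4>; XML-RPC treats the two tags as the same type.

```python
import xmlrpc.client

# Serialize the call searchsort.sortList([10, 2], True) into a request body.
# The parameters are passed as a tuple, one entry per argument.
request_body = xmlrpc.client.dumps(([10, 2], True),
                                   methodname="searchsort.sortList")
print(request_body)

# Parse the body back to recover the parameters and the method name.
params, method_name = xmlrpc.client.loads(request_body)
print(method_name, params)
```

Only the body is produced here; an actual call would also wrap it in an HTTP POST to the server's URL.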
