If a good name is one that helps you understand what the named object is doing, then there are some good names and some not-so-good names in spider.py. An early draft of our program had functions named get_links() and find_links(). Those names don't make the difference between the two functions clear, so we renamed get_links() to get_page().
Tip Programmers sometimes choose terse and not very explanatory names on purpose to indicate that you shouldn't pay much attention to the name because it's just a temporary name used to convey information (for example, it's used as an argument to a function or method). Sometimes a temporary name makes a few lines of code easier to read. Take, for example, the lines of code that use the name f in the find_links() function:
f = formatter.AbstractFormatter(writer)
parser = htmllib.HTMLParser(f)
We could have gotten rid of f by writing the code this way instead:
parser = htmllib.HTMLParser(formatter.AbstractFormatter(writer))
But that's kind of long and hard to read, so we decided it was better to split the lines and use a temporary name.
REMEMBER Give users of your modules information about which attributes, functions, classes, and methods they should avoid accessing directly, passing to other functions, subclassing, or rewriting. (Or, more colloquially, "Das ist nicht für gefingerpoken!") Sometimes this information is conveyed by using a single underscore character as the first character in a name, which signals that the object is private. (See Chapter 13 for more about private attributes.) For example, we chose to make self._links_to_process a private name because it's valid only inside the Spider class. We could have made url_in_site() a private name for the same reason, but we didn't, in order to send the message that it's suitable for overriding in a subclass.
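Here is a minimal sketch of that convention. It is not the real Spider class from spider.py: the constructor argument, the add_link() helper, the SubdomainSpider subclass, and the body of url_in_site() are all assumptions made up for illustration. Only the two names discussed above, and the choice of which one gets the underscore, come from the text.

```python
from urllib.parse import urlparse


class Spider:
    """Hypothetical stand-in for the Spider class in spider.py."""

    def __init__(self, start_url):
        self.start_url = start_url
        # Leading underscore: internal bookkeeping that outside code
        # should leave alone.
        self._links_to_process = [start_url]

    def url_in_site(self, url):
        # No underscore: a public method that subclasses may override.
        # Assumed rule: a URL is "in the site" if its host matches.
        return urlparse(url).netloc == urlparse(self.start_url).netloc

    def add_link(self, url):
        # Hypothetical helper: queue a link only if it is in the site.
        if self.url_in_site(url):
            self._links_to_process.append(url)


class SubdomainSpider(Spider):
    """Hypothetical subclass that overrides the public method."""

    def url_in_site(self, url):
        # Also accept subdomains of the starting host.
        host = urlparse(url).netloc
        base = urlparse(self.start_url).netloc
        return host == base or host.endswith("." + base)
```

The underscore is only a signal. Python doesn't stop anyone from reading or changing spider._links_to_process, but the name tells readers that doing so is at their own risk. The subclass changes the spider's behavior without ever touching the private list.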