In trying to write some code to use the new Technorati API, I noticed that all the tools for accessing XML documents sucked. So I wrote my own: xmltramp. It makes handling XML documents in Python a piece of cake:

>>> d = load(URL)
>>> d
<document>
>>> d._dir
[<result>, <item>, <item>, ...]
>>> str(d.document.weblog.name)
"Internet Alchemy"

This was really fun code to write. I thought of the ideal UI for using XML documents, wrote up an example interactive session, and then implemented it. First I wrote the XML objects and then I wrote a SAX parser to build them. The whole thing is under 80 lines (sans tests). The whole thing took an hour and a half.

Python geeks: Do you have any suggestions on a better way to get output from a SAX parser? Right now I’m having it raise an exception with the object I want as the payload. This seems pretty klugey. Is there a better way? Update: Version 1.1 now passes the result through an attribute in the ContentHandler.

I’d love to hear your thoughts. There are some obvious features to be added (serialization, for example) but I think this does enough for now. Time to go use it!

posted May 12, 2003 01:28 PM (Technology) #

Nearby

Illinois’s state “super-DMCA”
Locked in the Trunk
Escaping the Trunk
Cox News
Media Concentration
xmltramp
Matrix Unloaded
MediaCon: Lobbying
1201 Hearings
Joss Quest
RFC: Annals of Planet Hacking

Aaron Swartz (me@aaronsw.com)