
ClientForm

ClientForm is a Python module for handling HTML forms on the client side, useful for parsing HTML forms, filling them in and returning the completed forms to the server. It developed from a port of Gisle Aas' Perl module HTML::Form, from the libwww-perl library, but the interface is not the same.

Simple example:

 from urllib2 import urlopen
 from ClientForm import ParseResponse

 forms = ParseResponse(urlopen("http://www.acme.com/form.html"))
 form = forms[0]
 print form
 form["author"] = "Gisle Aas"

 # form.click returns a urllib2.Request object
 # (see HTMLForm.click.__doc__ if you don't have urllib2)
 response = urlopen(form.click("Thanks"))

A more complicated example:

 import ClientForm
 import urllib2
 request = urllib2.Request("http://www.acme.com/form.html")
 response = urllib2.urlopen(request)
 forms = ClientForm.ParseResponse(response)
 form = forms[0]
 print form  # very useful!

 # Indexing allows setting and retrieval of control values
 original_text = form["comments"]  # a string, NOT a Control instance
 form["comments"] = "Blah."
 print form.possible_values("cheeses")
 # Controls that represent lists (checkbox, select and radio lists) are
 # ListControls, and come in two flavours: single- and multiple-selection
 # lists.  Both can take a string as a value.
 form["cheeses"] = "cheddar"  # multi
 form["favorite_cheese"] = "brie" # single
 # None is also acceptable
 form["cheeses"] = None
 # Multiple-selection lists can also take a sequence of strings
 form["cheeses"] = ["parmesan", "leicester", "cheddar"]
 # HTMLForm has some other useful methods
 form.toggle("cheeses", "gorgonzola")

 # Checkbox and radio items whose HTML has no value attribute (this is
 # often the case where a single checkbox makes up the whole checkbox
 # control) default to the value "on" (this isn't my ugly hack, it's the
 # browser manufacturers'), so to check such a control:
 form["deeppan"] = "on"  # ["on"] would also work for a checkbox
 # and to un-check
 form["deeppan"] = None  # [] would also work for a checkbox

 # The find_control method allows access to the contained Control objects
 # that represent the textareas, checkbox lists, etc, etc.  In the case of
 # lists, Controls may correspond to multiple HTML elements.
 control = form.find_control(name="cheeses")
 print control.value  # equivalent to form["cheeses"]
 # The type and nr arguments to find_control also allow more precision than
 # indexing.
 control = form.find_control(name="cheeses", type="select", nr=1)
 print control.name, control.type
 assert control.multiple
 # All Controls may be disabled (equivalent of greyed-out in browser)
 assert not control.disabled
 # TextControls may be readonly
 assert not form.find_control("comments").readonly

 # Controls also have methods on them that are useful for doing more
 # obscure things -- these two are equivalent:
 assert control.type == "select", \
     "only SelectControl has toggle_by_label method"
 control.toggle("gorgonzola")
 control.toggle_by_label(["NEW!  Special Offer on Gorgonzola"])

 request2 = form.click("Submit")  # urllib2.Request object
 response2 = urllib2.urlopen(request2)

 print response2.geturl()
 print response2.info()  # headers
 for line in response2.readlines():  # body
     print line,  # trailing comma: each line already ends with a newline

All of the standard control types are supported: TEXT, PASSWORD, HIDDEN, TEXTAREA, ISINDEX, RESET, BUTTON, SUBMIT, IMAGE, RADIO, CHECKBOX, SELECT/OPTION. FILE (for file upload) is not supported in the 0.0.x version.
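
The "on" default for checkboxes and the handling of multiple selections mentioned in the example above can be illustrated without ClientForm. The following is a minimal sketch using only the Python 3 standard library, not ClientForm's own implementation; the CheckboxCollector class and the sample HTML are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class CheckboxCollector(HTMLParser):
    """Hypothetical helper: collect (name, value) pairs for checked checkboxes."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attrs = dict(attrs)
        if attrs.get("type", "text").lower() != "checkbox":
            return
        # Only named, checked boxes contribute to the submitted data.
        if "checked" not in attrs or "name" not in attrs:
            return
        # A checked box with no value attribute is submitted as "on".
        self.pairs.append((attrs["name"], attrs.get("value", "on")))


html = """
<form action="/order" method="post">
  <input type="checkbox" name="deeppan" checked>
  <input type="checkbox" name="cheeses" value="brie" checked>
  <input type="checkbox" name="cheeses" value="cheddar" checked>
  <input type="checkbox" name="cheeses" value="gorgonzola">
</form>
"""

collector = CheckboxCollector()
collector.feed(html)

# Multiple selections become repeated name=value pairs in the encoded body.
body = urlencode(collector.pairs)
print(body)
```

An unchecked box contributes nothing to the encoded body, which is why setting such a control to None un-checks it.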

The module is designed for testing and automation of web interfaces, not for implementing interactive user agents.

Security note: Remember that any passwords you store in HTMLForm instances will be saved to disk in the clear if you pickle them (directly or indirectly). The simplest solution to this is to avoid pickling HTMLForm objects.
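
To see why this matters, the sketch below pickles a plain dictionary standing in for the values held by a form and checks whether the password is readable in the serialized bytes. It is an illustration under assumptions, not ClientForm code: the dictionary layout, the names session and safe_values, and the sample credentials are all hypothetical. It runs on Python 3.

```python
import pickle

# Hypothetical stand-in for the control values an HTMLForm would hold.
form_values = {"user": "alice", "password": "s3cret-pw"}

# Indirect pickling: the values sit inside some larger object.
session = {"last_form": form_values, "visits": 3}
data = pickle.dumps(session)

# pickle does not encrypt or obscure strings, so the password is
# present verbatim in the serialized bytes.
leaked = b"s3cret-pw" in data
print(leaked)

# One possible precaution (an assumption, not advice from the module):
# drop sensitive fields before anything is written to disk.
safe_values = {k: v for k, v in form_values.items() if k != "password"}
safe_data = pickle.dumps({"last_form": safe_values, "visits": 3})
still_leaked = b"s3cret-pw" in safe_data
print(still_leaked)
```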

Python 1.5.2 or above is required. To run the tests, you need the unittest module (from PyUnit), which has been part of the standard library since Python 2.1.

For full documentation, see the docstrings in ClientForm.py.

Note: this page describes the 0.1.x interface. The old 0.0.x interface is documented separately.

Download

For installation instructions, see the INSTALL file included in the distribution.

Stable release.


Old release.

FAQs

John J. Lee, December 2003.