SourceForge.net Logo

mechanize

Stateful programmatic web browsing in Python, after Andy Lester's Perl module WWW::Mechanize .

An example:

import re
from mechanize import Browser

b = Browser()
b.open("http://www.example.com/")
# follow second link with element text matching regular expression
response = b.follow_link(text_regex=re.compile(r"cheese\s*shop"), nr=1)
assert b.viewing_html()
print b.title()
print response.geturl()
print response2.info()  # headers
print response2.read():  # body
response.close()

b.select_form(name="order")
# Browser passes through unknown attributes (including methods)
# to the selected HTMLForm (from ClientForm).
b["cheeses"] = ["mozzarella", "caerphilly"]  # (the method here is __setitem__)
response2 = b.submit()  # submit current form

response3 = b.back()  # back to cheese shop
# the history mechanism uses cached requests and responses
assert response3 is response
# we can still use the response, even though we closed it:
response3.seek(0)
response3.read()
response4 = b.reload()
assert response4 is not response3

for link in b.forms():
    print form
# .links() optionally accepts the keyword args of .follow_/.find_link()
for link in b.links(url_regex=re.compile("python.org")):
    print link

Full documentation is in the docstrings.

Thanks to Ian Bicking, for persuading me that a UserAgent class would be useful.

Todo

Download

All documentation (including this web page) is included in the distribution.

This is an alpha release: interfaces may change, and there will be bugs.

Development release.

For installation instructions, see the INSTALL file included in the distribution.

FAQs

John J. Lee, December 2003.