Automatic site login using Python urllib2
The urllib2 library in Python is very powerful and can be used to automate web interactions. A common task that keeps coming up is logging in to a site automatically: POST the login credentials, collect any auth cookies the server returns, and send those cookies back with subsequent requests. urllib2 makes this process quite simple.
Here is a very concise example of how this can be done:
# urlencode lives in urllib; the opener machinery lives in urllib2
import urllib
import urllib2
# build opener with HTTPCookieProcessor
o = urllib2.build_opener( urllib2.HTTPCookieProcessor() )
urllib2.install_opener( o )
# assuming the site expects 'username' and 'password' as POST form fields
p = urllib.urlencode( { 'username': 'me', 'password': 'mypass' } )
# perform login; passing data as the second argument makes this a POST
f = o.open( 'http://www.mysite.com/login/', p )
data = f.read()
f.close()
# second request should automatically pass back any
# cookies received during login... thanks to the HTTPCookieProcessor
f = o.open( 'http://www.mysite.com/protected/area/' )
data = f.read()
f.close()
Notice the use of the HTTPCookieProcessor class when building the opener. This class stores any cookies sent by the server and sends them back on subsequent requests, with no extra code on your part.
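To see which cookies were actually stored, one option is to hand HTTPCookieProcessor an explicit CookieJar instead of letting it create its own. The sketch below is only an illustration: build_cookie_opener and describe_cookies are hypothetical helper names, not part of urllib2, and the import fallback is there so the same code also runs on Python 3, where urllib2 and cookielib were renamed to urllib.request and http.cookiejar.

```python
try:  # Python 2, as used in this post
    import urllib2 as request
    import cookielib as cookiejar
except ImportError:  # Python 3 renamed both modules
    import urllib.request as request
    import http.cookiejar as cookiejar


def build_cookie_opener():
    """Return (opener, jar). The jar is where HTTPCookieProcessor keeps cookies.

    Hypothetical helper for illustration; not part of urllib2 itself.
    """
    jar = cookiejar.CookieJar()
    opener = request.build_opener(request.HTTPCookieProcessor(jar))
    return opener, jar


def describe_cookies(jar):
    """List the cookies currently stored in the jar as (domain, name, value)."""
    return [(c.domain, c.name, c.value) for c in jar]
```

After a login request made through the returned opener, describe_cookies(jar) should list whatever session cookies the server set, which is handy when a login silently fails.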
With the first open() method call, the login credentials are sent to a site’s login URL. Any cookies returned during the auth process get handled automatically by the HTTPCookieProcessor class and stored.
On the second open() method call to request a protected URL, any cookies that were stored by the HTTPCookieProcessor are resent to the server and everything works transparently. It's that simple!
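If you want to try the whole flow without a real site, here is a rough, self-contained sketch for Python 3, where urllib2 lives on as urllib.request. It starts a throwaway local server that sets a cookie on login and demands it on the protected page. Everything about that server is made up for the demo: DemoHandler, run_demo, the sessionid cookie and its value abc123 are assumptions, not how any particular site behaves. The empty ProxyHandler is there only so a system proxy does not intercept the local request.

```python
import threading
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a real site: POST sets a cookie, GET requires it."""

    def _reply(self, status, body, cookie=None):
        self.send_response(status)
        if cookie:
            self.send_header("Set-Cookie", cookie)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # Parse the form-encoded login fields from the request body.
        length = int(self.headers.get("Content-Length", 0))
        form = urllib.parse.parse_qs(self.rfile.read(length).decode())
        if form.get("username") == ["me"] and form.get("password") == ["mypass"]:
            self._reply(200, b"logged in", cookie="sessionid=abc123; Path=/")
        else:
            self._reply(403, b"bad credentials")

    def do_GET(self):
        # Only serve the page if the session cookie came back.
        if "sessionid=abc123" in self.headers.get("Cookie", ""):
            self._reply(200, b"secret page")
        else:
            self._reply(403, b"login required")

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    """Log in, fetch the protected page, and return both bodies plus cookie names."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_port
    try:
        jar = CookieJar()
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}),  # bypass any system proxy
            urllib.request.HTTPCookieProcessor(jar),
        )
        # In Python 3 the POST data must be bytes, hence the encode().
        params = urllib.parse.urlencode(
            {"username": "me", "password": "mypass"}
        ).encode()
        with opener.open(base + "/login/", params) as f:
            login_body = f.read()
        # No cookie code here: the opener resends what the jar holds.
        with opener.open(base + "/protected/area/") as f:
            protected_body = f.read()
        return login_body, protected_body, [c.name for c in jar]
    finally:
        server.shutdown()
        server.server_close()
```

Calling run_demo() performs the same two requests as the snippet above, just against the local server. Building the opener without the HTTPCookieProcessor should make the second request fail with a 403, which is a quick way to convince yourself the cookie handling is doing the work.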
“pass” creates a syntax error, since “pass” is a reserved keyword in Python.
Thanks Adam. I have corrected the snippet. I was kind of in a hurry when I posted this so I totally missed this one! :)
Thanks for the clue!
One remark:
The ‘username’ and ‘password’ keys should be provided instead of ‘user’ and ‘pass’ in the urlencode() call of the snippet above (at least for the Django 1.0.2 generic login handler).
urllib2 is okay for scraping but for site logins and stuff, it sucks because it cannot handle basic redirects either automatically or using redirect_handler. You’re better off using pycurl.
hello all..
I'm doing a research project on OpenID in Python. Can you tell me how to automatically redirect my OpenID URL to the OpenID server, have it check the authentication, and send back a confirmation that the login succeeded?
If you have any logic for this, please share it with me.