Writing a Crawler with urllib2

urllib2 in Python

https://docs.python.org/2/library/urllib2.html

Making a GET request

http://kaoshi.edu.sina.com.cn/college/scorelist?tab=batch&wl=1&local=2&batch=&syear=2013


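import urllib2

# url is the address above; the User-Agent value is an assumed example
url = 'http://kaoshi.edu.sina.com.cn/college/scorelist?tab=batch&wl=1&local=2&batch=&syear=2013'
headers = {'User-Agent': 'Mozilla/5.0'}

request = urllib2.Request(url=url, headers=headers)
response = urllib2.urlopen(request, timeout=20)  # give up after 20 seconds
result = response.read()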
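The GET request above can also be assembled from its parts instead of a hand-written URL. The following is only a sketch: the helper names `build_get_request` and `fetch` and the `User-Agent` value are assumptions made for illustration, and the import fallback is there because Python 3 moved the `urllib2` API to `urllib.request`.

```python
try:  # Python 2, as used in this post
    import urllib2 as request_lib
    from urllib import urlencode
except ImportError:  # Python 3 exposes the same API as urllib.request
    import urllib.request as request_lib
    from urllib.parse import urlencode

BASE_URL = 'http://kaoshi.edu.sina.com.cn/college/scorelist'


def build_get_request(params, user_agent='Mozilla/5.0'):
    """Return a Request whose query string is built from params (hypothetical helper)."""
    url = BASE_URL + '?' + urlencode(params)
    headers = {'User-Agent': user_agent}  # assumed header, not from the post
    return request_lib.Request(url, headers=headers)


def fetch(request, timeout=20):
    """Send the request and return the raw response body (hypothetical helper)."""
    response = request_lib.urlopen(request, timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()


# Same parameters as the URL above, kept as ordered pairs
params = [('tab', 'batch'), ('wl', 1), ('local', 2), ('batch', ''), ('syear', 2013)]
request = build_get_request(params)
```

Calling `fetch(request)` would perform the actual download. Keeping the parameters as a list makes it easy to loop over other values of `syear` or `local`.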


Making a POST request

http://shuju.wdzj.com/plat-info-59.html


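import urllib
import urllib2

# x (the type1 value), platId and headers are assumed to be defined by the caller
data = urllib.urlencode({'type1': x, 'type2': 0, 'status': 0, 'wdzjPlatId': int(platId)})
# headers must be passed by keyword: the second positional argument of Request is data
request = urllib2.Request('http://shuju.wdzj.com/depth-data.html', headers=headers)
# HTTPCookieProcessor keeps cookies across requests made through this opener
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
response = opener.open(request, data)  # passing data makes this a POST
result = response.read()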
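The POST flow above might be wrapped up as follows. Treat it as a sketch under stated assumptions: `build_post_request` and `send` are hypothetical helper names, the `User-Agent` header and the `type1` value of 1 are placeholders, and the plat id 59 is taken from the page URL above. The form body is encoded to bytes because Python 3 requires it; in Python 2 the extra `.encode` is harmless.

```python
try:  # Python 2, as used in this post
    import urllib2 as request_lib
    from urllib import urlencode
except ImportError:  # Python 3 exposes the same API as urllib.request
    import urllib.request as request_lib
    from urllib.parse import urlencode

POST_URL = 'http://shuju.wdzj.com/depth-data.html'


def build_post_request(plat_id, type1, user_agent='Mozilla/5.0'):
    """Build a form-encoded POST request (hypothetical helper)."""
    fields = [('type1', type1), ('type2', 0), ('status', 0),
              ('wdzjPlatId', int(plat_id))]
    body = urlencode(fields).encode('ascii')  # form body must be bytes on Python 3
    headers = {'User-Agent': user_agent}  # assumed header, not from the post
    # Supplying data switches the request method from GET to POST
    return request_lib.Request(POST_URL, data=body, headers=headers)


def send(request, timeout=20):
    """Open the request through a cookie-aware opener and return the body."""
    opener = request_lib.build_opener(request_lib.HTTPCookieProcessor())
    response = opener.open(request, timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()


request = build_post_request(59, 1)  # type1=1 is a placeholder value
```

Calling `send(request)` would submit the form. If several requests need to share cookies, build the opener once and reuse it rather than creating a new one per call.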


Processing the response

HTML: parse it with BeautifulSoup; this needs some basic CSS knowledge, since elements are located with CSS selectors

API: the response is JSON, which can be decoded with the json module
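For an API response, decoding the JSON body could look like the sketch below. The sample payload and its field names (`status`, `data`, `amount`) are invented for illustration and are not what any of the sites above actually return; the body is also assumed to be UTF-8.

```python
import json


def parse_json_body(raw):
    """Decode a response body (bytes or text) into Python objects."""
    if isinstance(raw, bytes):
        raw = raw.decode('utf-8')  # assumption: the body is UTF-8 encoded
    return json.loads(raw)


# Hypothetical payload standing in for the result of response.read()
sample = b'{"status": 0, "data": [{"date": "2016-01", "amount": 12.5}]}'
parsed = parse_json_body(sample)
amounts = [row['amount'] for row in parsed['data']]
```

After decoding, the result is made of ordinary dicts and lists, so the fields can be picked out with normal indexing.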

https://www.crummy.com/software/BeautifulSoup/bs4/doc/
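As a rough illustration of where the CSS knowledge comes in, the sketch below pulls rows out of a table with CSS selectors. The HTML fragment, its class names and the helper `extract_rows` are made up for the example; a real page needs selectors matched to its own markup. It assumes the `bs4` package is installed and uses Python's built-in `html.parser` backend.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the result of response.read()
SAMPLE_HTML = """
<table class="scores">
  <tr><th>School</th><th>Score</th></tr>
  <tr><td class="name">School A</td><td class="score">610</td></tr>
  <tr><td class="name">School B</td><td class="score">587</td></tr>
</table>
"""


def extract_rows(html):
    """Return (name, score) tuples for every data row in the table."""
    soup = BeautifulSoup(html, 'html.parser')
    rows = []
    for tr in soup.select('table.scores tr'):  # CSS selector: rows of table.scores
        name = tr.select_one('td.name')
        score = tr.select_one('td.score')
        if name is None or score is None:
            continue  # skip the header row, which has no td cells
        rows.append((name.get_text(strip=True), int(score.get_text(strip=True))))
    return rows


rows = extract_rows(SAMPLE_HTML)
```

`select` returns all matching elements and `select_one` returns the first match or `None`, so the header row can be skipped with a simple check.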
