Handling GZIP-compressed responses in a web crawler

from urllib import request
import gzip
headers = {
	'Host': 'www.hazq.com',
	'Connection': 'keep-alive',
	#'Content-Length': '37',
	'Accept': 'application/json, text/javascript, */*; q=0.01',
	'X-Requested-With': 'XMLHttpRequest',
	'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36',
	'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
	'Origin': 'http://www.hazq.com',
	'Referer': 'http://www.hazq.com/hazq/mall/jhlc.html?classid=0001000100120002',
	'Accept-Encoding': 'gzip, deflate',
	'Accept-Language': 'zh-CN,zh;q=0.9'
}
url = 'http://www.hazq.com/hazq/mall/public/zkjhlc_bak.jsp?qs=0&par_num=00&pageIndex=1&par_ord=1'
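req = request.Request(url=url, headers=headers)
data = request.urlopen(req).read()  # raw bytes, still gzip-compressed
# We sent 'Accept-Encoding: gzip', and urllib does not decompress the body for us
html = gzip.decompress(data).decode("utf-8")
print(html)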

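For example, printing the raw response body (data above) shows bytes such as b'\x1f\x8b\x08\x00\x00\x00\x00\x00…'. The leading \x1f\x8b is the gzip magic number, so the body has to be decompressed before it can be decoded as text.
After gzip.decompress(...).decode("utf-8"), the content decodes correctly.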
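
The script above always calls gzip.decompress, which raises an error if the server returns an uncompressed body. A minimal sketch of a more defensive variant follows. The helper name decode_body and its parameters are hypothetical (not from the original post), and it assumes the charset is known in advance:

```python
import gzip
import zlib


def decode_body(data, content_encoding="", charset="utf-8"):
    """Return text from a response body, decompressing only when needed.

    decode_body is a hypothetical helper for illustration.
    """
    encoding = (content_encoding or "").lower()
    # Trust the header, but also fall back to the gzip magic number.
    if encoding == "gzip" or data[:2] == b"\x1f\x8b":
        data = gzip.decompress(data)
    elif encoding == "deflate":
        try:
            data = zlib.decompress(data)  # zlib-wrapped deflate
        except zlib.error:
            data = zlib.decompress(data, -zlib.MAX_WBITS)  # raw deflate
    return data.decode(charset)
```

With urllib this could be used as: resp = request.urlopen(req), then html = decode_body(resp.read(), resp.headers.get("Content-Encoding")). Since the request headers also advertise deflate, the sketch handles that case too.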
