Web scraping self-study: scraping the Three Hundred Tang Poems from gushiwen.cn

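from urllib.parse import urljoin

from bs4 import BeautifulSoup as bs
import requests
import json

url = 'https://so.gushiwen.cn/gushi/tangshi.aspx'
header = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36"
}
# Fetch the index page that lists the Three Hundred Tang Poems
resp = requests.get(url=url, headers=header)
soup = bs(resp.text, 'lxml')
# One <span> per poem inside .typecont (named spans so it does not shadow the re module)
spans = soup.select('.typecont span')
url_can = []
for each in spans:
    # each is a Tag, not a string; the poem link is assumed to be an <a> inside it
    link = each.find('a')
    if link is None or not link.get('href'):
        continue  # skip spans that carry no link
    # urljoin handles both relative and absolute href values
    each_url = urljoin('https://so.gushiwen.cn', link['href'])
    url_can.append(each_url)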
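
The script imports json but the listing stops before using it, so the collected links were presumably meant to be saved as JSON. The following is only a sketch of that step under stated assumptions: the helper names build_records and to_json are hypothetical, and the titles and hrefs in sample_links are made-up placeholders rather than data fetched from the site.

```python
import json
from urllib.parse import urljoin

BASE_URL = "https://so.gushiwen.cn"


def build_records(links):
    """Turn (title, href) pairs into dicts holding absolute URLs."""
    return [
        {"title": title, "url": urljoin(BASE_URL, href)}
        for title, href in links
    ]


def to_json(records):
    # ensure_ascii=False keeps Chinese titles readable instead of \uXXXX escapes
    return json.dumps(records, ensure_ascii=False, indent=2)


# Hypothetical sample data standing in for the scraped (title, href) pairs
sample_links = [
    ("静夜思", "/shiwenv_example1.aspx"),
    ("春晓", "/shiwenv_example2.aspx"),
]

records = build_records(sample_links)
json_text = to_json(records)
print(json_text)
```

To write the result to disk, the same string can be passed to a file opened with encoding='utf-8', so that the Chinese titles are stored as readable text.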
