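The idea, roughly, is to scrape accepted solutions to Codeforces problems and submit them again in order to rack up solved problems.
Codeforces is a Russian website that provides an online judge for programming enthusiasts. It was founded and is run by a group from Saratov State University (source: Baidu Baike).
On the site you read a problem (an algorithm problem) and upload code, and the judge sends back a verdict (AC, WA, ...).
The script first logs in and obtains the cookies, so that later submissions need no fresh login (submitting code requires being logged in). This is implemented in the login function.
It then goes to the contest page.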
[Screenshots: contest 1 and contest 1141]
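Because the set of problems and their indices can differ from contest to contest, the script requests the contest's main page to get every problem in that contest. This is implemented in solve.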
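The link-scanning step can be sketched offline roughly like this. It is only a sketch: the helper name `problem_indices` and the HTML fragment are made up for illustration, and the assumption that problem links look like `/contest/<id>/problem/<index>` is taken from the pattern the script itself searches for, not from any documented page layout.

```python
import re

def problem_indices(contest_id, html):
    """Return the distinct problem indices linked from a contest page, in page order."""
    # Assumed link shape: href="/contest/<id>/problem/<index>"
    pattern = r'href="/contest/' + re.escape(contest_id) + r'/problem/([^"]+)"'
    seen = []
    for index in re.findall(pattern, html):
        if index not in seen:  # the same problem may be linked more than once
            seen.append(index)
    return seen

# Made-up fragment standing in for a real contest page.
sample_html = '''
<td><a href="/contest/1141/problem/A">A</a></td>
<td><a href="/contest/1141/problem/A">Title of problem A</a></td>
<td><a href="/contest/1141/problem/B">B</a></td>
<td><a href="/contest/1141/problem/F1">F1</a></td>
'''

print(problem_indices("1141", sample_html))  # ['A', 'B', 'F1']
```

Deduplicating matters because indices such as F1 and F2 are not simply A, B, C, ..., so the list cannot be guessed and has to come from the page.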
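Next comes fetching AC code. Codeforces contests have one notable feature: you can read other people's code. A POST request sets a filter on the contest's status page, for example accepted C++11 submissions for problem A of contest 1141 (the case shown in the original screenshot). That yields a table of matching submissions, from which any accepted one will do (I just take the first). This is implemented in getcode.
That submission's page holds one AC solution.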
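Picking the first entry out of the filtered table might look like the sketch below. The function name `first_submission_id`, the sample markup, and the submission id in it are all invented for illustration. The regular expression is a slightly stricter variant of the one in the script, on the assumption that submission ids are purely numeric.

```python
import re

def first_submission_id(status_html):
    """Return the first submission id linked from a status page, or None if there is none."""
    # Assumed link shape: .../submission/<digits>"
    match = re.search(r'submission/(\d+)"', status_html)
    return match.group(1) if match else None

# Made-up row standing in for the filtered status table.
sample_row = '<a class="view-source" href="/contest/1141/submission/12345678">12345678</a>'

print(first_submission_id(sample_row))          # 12345678
print(first_submission_id('<p>No items</p>'))   # None
```

Returning None for an empty table lets the caller skip a problem that has no accepted submission under the current filter.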
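After that, submitting the code again is all that is left. This is implemented in uploadcode.
As far as I could observe, Codeforces checks a user's own submissions for duplicates (for now it seems to catch only exact copies), but I saw no such check between different users.
To get past the duplicate check, the script appends //hello to the original code before uploading (anyone who has written C++ knows that // starts a comment and does not affect the code).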
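The hardest part is the value of 'csrf_token' in the request. It is an anti-CSRF token: its purpose is to block cross-site request forgery, though it also trips up naive crawlers. Roughly, the server generates a random value and sends it to the client hidden in the HTML; if a later request does not send it back, the server answers 403 Forbidden (it understood the request but refuses to serve it). I was stuck here for a whole day, and looking in the wrong direction at that...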
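To make the mechanism concrete, here is a toy simulation of the server side. It is not Codeforces' real logic: the `ToyServer` class is hypothetical, and issuing a new token on every page load is an assumption chosen to match the behaviour I ran into.

```python
import hmac
import secrets

class ToyServer:
    """A toy stand-in for a site with CSRF protection (hypothetical, for illustration)."""

    def __init__(self):
        self._token = None

    def render_form(self):
        # Issue a fresh random token and hide it in the page on every GET.
        self._token = secrets.token_hex(16)
        return '<meta name="X-Csrf-Token" content="%s"/>' % self._token

    def handle_post(self, form):
        # Reject the request unless it echoes the token that was last issued.
        sent = form.get("csrf_token", "")
        if self._token is None or not hmac.compare_digest(sent, self._token):
            return 403
        return 200

server = ToyServer()
page = server.render_form()
token = page.split('content="')[1].split('"')[0]

print(server.handle_post({"csrf_token": token}))  # 200: token echoed back
print(server.handle_post({}))                     # 403: token missing
server.render_form()                              # the page is loaded again
print(server.handle_post({"csrf_token": token}))  # 403: the old token is stale
```

The last call shows why a crawler that caches one token and reuses it keeps getting 403 under this model.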
Code:
res = s.get('http://codeforces.com/enter?back=%2F')
soup = BeautifulSoup(res.text, 'lxml')
csrf_token = soup.find(attrs={'name': 'X-Csrf-Token'}).get('content')
form_data = {
    'csrf_token': csrf_token,
    'action': 'enter',
    'ftaa': '',
    'bfaa': '',
    'handleOrEmail': name,
    'password': password,
    'remember': []
}
On top of that, the token has to be fetched again for every request...
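Since the same lookup is needed before every POST, it could be factored into one helper. The version below is a standard-library sketch of that idea, not what the script itself does (the script uses BeautifulSoup). The names `CsrfMetaParser` and `extract_csrf_token` and the sample page are made up. The only detail taken from the site is the meta tag name X-Csrf-Token.

```python
from html.parser import HTMLParser

class CsrfMetaParser(HTMLParser):
    """Collect the content of the first <meta name="X-Csrf-Token"> tag."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") == "X-Csrf-Token" and self.token is None:
            self.token = attrs.get("content")

def extract_csrf_token(html):
    """Return the CSRF token found in the page, or None if the tag is absent."""
    parser = CsrfMetaParser()
    parser.feed(html)
    return parser.token

# Made-up page standing in for a real response body.
sample_page = '<html><head><meta name="X-Csrf-Token" content="abc123"/></head></html>'

print(extract_csrf_token(sample_page))        # abc123
print(extract_csrf_token('<html></html>'))    # None
```

Calling such a helper on a freshly fetched page right before each POST keeps the token current, and a None result is an early sign that the page layout changed or the session is no longer logged in.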
# coding=utf-8
"""
As the name says, I am a crawler
Purpose: scrape Codeforces code and submit it automatically
Author: 大哥
"""
import random
import re
import time
import threading
import requests
from bs4 import BeautifulSoup
from lxml import etree
# Username
name='WOSHIGEPACHONG2'
# Password
password='Did you think I would tell you? Hahaha'
user_agent = [
'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 '
'Safari/534.50',
'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50',
'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0',
'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; '
'.NET CLR 3.5.30729; InfoPath.3; rv:11.0) like Gecko',
'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)',
'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0)',
'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)',
'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1',
'Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1',
'Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.8.131 Version/11.11',
'Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11',
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 '
'Safari/535.11',
'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Maxthon 2.0)',
'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; TencentTraveler 4.0)',
'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)',
'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; The World)',
'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; SE 2.X MetaSr 1.0; SE 2.X MetaSr 1.0; .NET CLR '
'2.0.50727; SE 2.X MetaSr 1.0)',
'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; 360SE)',
'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Avant Browser)',
'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)',
'Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_3 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) '
'Version/5.0.2 Mobile/8J2 Safari/6533.18.5',
'Mozilla/5.0 (iPod; U; CPU iPhone OS 4_3_3 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) '
'Version/5.0.2 Mobile/8J2 Safari/6533.18.5',
'Mozilla/5.0 (iPad; U; CPU OS 4_3_3 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 '
'Mobile/8J2 Safari/6533.18.5',
'Mozilla/5.0 (Linux; U; Android 2.3.7; en-us; Nexus One Build/FRF91) AppleWebKit/533.1 (KHTML, like Gecko) '
'Version/4.0 Mobile Safari/533.1',
'MQQBrowser/26 Mozilla/5.0 (Linux; U; Android 2.3.7; zh-cn; MB200 Build/GRJ22; CyanogenMod-7) AppleWebKit/533.1 ('
'KHTML, like Gecko) Version/4.0 Mobile Safari/533.1',
'Opera/9.80 (Android 2.3.4; Linux; Opera Mobi/build-1107180945; U; en-GB) Presto/2.8.149 Version/11.10',
'Mozilla/5.0 (Linux; U; Android 3.0; en-us; Xoom Build/HRI39) AppleWebKit/534.13 (KHTML, like Gecko) Version/4.0 '
'Safari/534.13',
'Mozilla/5.0 (BlackBerry; U; BlackBerry 9800; en) AppleWebKit/534.1+ (KHTML, like Gecko) Version/6.0.0.337 Mobile '
'Safari/534.1+',
'Mozilla/5.0 (hp-tablet; Linux; hpwOS/3.0.0; U; en-US) AppleWebKit/534.6 (KHTML, like Gecko) wOSBrowser/233.70 '
'Safari/534.6 TouchPad/1.0',
'Mozilla/5.0 (SymbianOS/9.4; Series60/5.0 NokiaN97-1/20.0.019; Profile/MIDP-2.1 Configuration/CLDC-1.1) '
'AppleWebKit/525 (KHTML, like Gecko) BrowserNG/7.1.18124',
'Mozilla/5.0 (compatible; MSIE 9.0; Windows Phone OS 7.5; Trident/5.0; IEMobile/9.0; HTC; Titan)',
'UCWEB7.0.2.37/28/999',
'NOKIA5700/ UCWEB7.0.2.37/28/999',
'Openwave/ UCWEB7.0.2.37/28/999',
'Mozilla/4.0 (compatible; MSIE 6.0; ) Opera/UCWEB7.0.2.37/28/999', ]
s=requests.session()
# Log in
def login():
    # Use one randomly chosen User-Agent for the whole session
    agent = random.choice(user_agent)
    header = {'User-Agent': agent}
    s.headers.update(header)
    try:
        # The login page carries the CSRF token in a <meta> tag
        res = s.get('http://codeforces.com/enter?back=%2F')
        soup = BeautifulSoup(res.text, 'lxml')
        csrf_token = soup.find(attrs={'name': 'X-Csrf-Token'}).get('content')
        form_data = {
            'csrf_token': csrf_token,
            'action': 'enter',
            'ftaa': '',
            'bfaa': '',
            'handleOrEmail': name,
            'password': password,
            'remember': []
        }
        # The session keeps the cookies set by this response
        s.post('http://codeforces.com/enter', data=form_data)
    except Exception as e:
        print('Login failed', e)
# Fetch code
def getcode(a, b):
    """Return the source of the first accepted C++11 submission for problem b of contest a, or False if none is listed."""
    try:
        # Fetch a fresh CSRF token before the POST
        res = s.get('http://codeforces.com/problemset/submit')
        soup = BeautifulSoup(res.text, 'lxml')
        csrf_token = soup.find(attrs={'name': 'X-Csrf-Token'}).get('content')
        data = {
            'csrf_token': csrf_token,
            'action': 'setupSubmissionFilter',
            'frameProblemIndex': b,
            'verdictName': 'OK',
            'programTypeForInvoker': 'cpp.g++11',
            'comparisonType': 'NOT_USED',
            'judgedTestCount': '',
        }
        # Set the status filter, then reload the filtered status page
        s.post('https://codeforces.com/contest/' + a + '/status', data=data)
        res = s.get('https://codeforces.com/contest/' + a + '/status')
        links = re.findall('submission/(.+?)"', res.text)
        if len(links) <= 0:
            return False
        # Open the first listed submission and pull out its source
        res2 = s.get('https://codeforces.com/contest/' + a + '/submission/' + links[0])
        selector = etree.HTML(res2.text)
        out = selector.xpath('//*[@id="program-source-text"]')[0]
    except Exception as e:
        print('Problem:', a + b, 'failed to fetch code')
        print(e)
        exit(0)
    return out.text
def uploadcode(a, b, code):
    # Fetch a fresh CSRF token before the POST
    res = s.get('http://codeforces.com/problemset/submit')
    soup = BeautifulSoup(res.text, 'lxml')
    csrf_token = soup.find(attrs={'name': 'X-Csrf-Token'}).get('content')
    post_data = {
        'csrf_token': csrf_token,
        'ftaa': '',
        'bfaa': '',
        'action': 'submitSolutionFormSubmitted',
        'submittedProblemCode': a + b,
        'programTypeId': '42',
        # Trailing comment so the source is not byte-identical to the original
        'source': code + '//hello',
        'tabSize': 0,
        'sourceFile': '',
    }
    res = s.post('http://codeforces.com/problemset/submit?csrf_token=' + csrf_token, data=post_data)
    if res.status_code != 200:
        print('Problem:', a + b, 'failed to submit code')
        print(res)
        exit(0)
    else:
        # HTTP 200 only means the request went through; the verdict itself is not checked here
        print('Problem:', a + b, 'AC!!!')
def solve(a):
    global s
    html = s.get('https://codeforces.com/contest/' + a).text
    if not html:
        return
    links=re.findall('+a+'/problem/(.+?)">