Problem: iterating over an item raises KeyError: 'RepAskName'.
for index in item['RepAskName']:
    lastRlist.append([item['RepAskName'][index], item['RepComID'][index], item['RepComName'][index],
                      item['RepAskQuestionTemp'][index], item['ReplyQuestion'][index],
                      item['RepAskTime'][index], ReplyTimeTemp])
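Analysis: the error is raised on the first line of the snippet, for index in item['RepAskName']:. A KeyError means that the key RepAskName is not present in the item. To see why, we have to look at the code as a whole. Building allQlist the first time may work fine, but the item meant for lastRlist still passes through the first for loop. At that point the item holds only RepAskName, RepComID, RepComName, RepAskQuestion, ReplyQuestion, RepAskTime and ReplyTime. It has no AskQuesName or related fields, so looking up AskQuesName fails. The reverse also holds: an item that carries only the Ask* fields has no RepAskName, so the loop over RepAskName raises the KeyError reported above.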
for index in range(len(item['AskQuesName'])):
    self.allQlist.append(
        [item['AskQuesName'][index], item['AskedComID'][index], item['AskedComName'][index],
         item['AskQuestion'][index], item['AskTime'][index]])
tableprop = "cn_fansmost (id, name, allfans, allreply) values(%s, %s, %s, %s)"
self.bulk_insert_to_mysql(self.allQlist, tableprop)
for index in range(len(item['RepAskName'])):
    self.lastRlist.append([item['RepAskName'][index], item['RepComID'][index], item['RepComName'][index],
                           item['RepAskQuestion'][index], item['ReplyQuestion'][index],
                           item['RepAskTime'][index], item['ReplyTime'][index]])
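The failure can be reproduced without Scrapy. The following is a minimal sketch, assuming plain dicts stand in for the two kinds of item; the sample values and the helper names process_item and missing_key are hypothetical and exist only for illustration.

```python
# Plain dicts stand in for Scrapy items (an assumption for illustration).
question_item = {"AskQuesName": ["alice"], "AskedComID": ["000001"]}
reply_item = {"RepAskName": ["bob"], "RepComID": ["000002"]}


def process_item(item):
    """Mimic pipeline code that assumes every item has both field groups."""
    asked = [name for name in item["AskQuesName"]]   # fails for reply_item
    replied = [name for name in item["RepAskName"]]  # fails for question_item
    return asked, replied


def missing_key(item):
    """Return the key that triggered the KeyError, or None if none was raised."""
    try:
        process_item(item)
    except KeyError as exc:
        return exc.args[0]
    return None


print(missing_key(question_item))
print(missing_key(reply_item))
```

Each kind of item fails on the loop that belongs to the other kind, which is why the reported key depends on which item reaches the code first.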
Solution: before each loop, check whether the key is present in the item, as follows.
if 'AskQuesName' in item:
    for index in range(len(item['AskQuesName'])):
        self.allQlist.append(
            [item['AskQuesName'][index], item['AskedComID'][index], item['AskedComName'][index],
             item['AskQuestion'][index], item['AskTime'][index]])
    tableprop = "cn_fansmost (id, name, allfans, allreply) values(%s, %s, %s, %s)"
    self.bulk_insert_to_mysql(self.allQlist, tableprop)
if 'RepAskName' in item:
    for index in range(len(item['RepAskName'])):
        self.lastRlist.append([item['RepAskName'][index], item['RepComID'][index], item['RepComName'][index],
                               item['RepAskQuestion'][index], item['ReplyQuestion'][index],
                               item['RepAskTime'][index], item['ReplyTime'][index]])
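The same guard can be factored into a small helper so that both field groups go through one code path. This is only a sketch under assumptions: collect_rows, ASK_FIELDS and REPLY_FIELDS are hypothetical names, a plain dict stands in for the item, and the sample values are invented. Unlike the code above, it checks every field in the group rather than only the first one.

```python
ASK_FIELDS = ("AskQuesName", "AskedComID", "AskedComName", "AskQuestion", "AskTime")
REPLY_FIELDS = ("RepAskName", "RepComID", "RepComName", "RepAskQuestion",
                "ReplyQuestion", "RepAskTime", "ReplyTime")


def collect_rows(item, fields):
    """Return one row per index across `fields`, or [] if any field is missing."""
    if any(name not in item for name in fields):
        return []
    # zip stops at the shortest column, so uneven lists yield fewer rows
    # instead of raising IndexError.
    return [list(row) for row in zip(*(item[name] for name in fields))]


# Sample reply item with invented values.
reply_item = {
    "RepAskName": ["bob", "carol"],
    "RepComID": ["000002", "000003"],
    "RepComName": ["CompanyB", "CompanyC"],
    "RepAskQuestion": ["q1", "q2"],
    "ReplyQuestion": ["a1", "a2"],
    "RepAskTime": ["t1", "t2"],
    "ReplyTime": ["t3", "t4"],
}

ask_rows = collect_rows(reply_item, ASK_FIELDS)      # item has no Ask* fields
reply_rows = collect_rows(reply_item, REPLY_FIELDS)  # one row per reply
```

In a pipeline, the returned rows could then be appended to allQlist or lastRlist; an empty list simply means the item was of the other kind.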
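That resolves the problem. Items themselves are simple and convenient to use, but the program logic has to be designed around how Scrapy processes them: Scrapy is asynchronous (it runs on Twisted's event loop rather than on multiple threads), and every item the spider yields passes through the same pipeline code, so that code cannot assume which fields a given item carries.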