Playwright in practice: scraping public-opinion data from a website

Site link (Base64-encoded):

'aHR0cDovL3d3dy5jdXN0b21zLmdvdi5jbi9jdXN0b21zLzMwMjI0OS8zMDIyNzAvMzAyMjcyL2luZGV4Lmh0bWw='
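
The link above is given as a Base64 string rather than a plain URL. A minimal sketch of recovering it with the standard library follows; the names `ENCODED_LINK`, `decode_link` and `target_url` are illustrative and do not come from the original script.

```python
import base64

# Encoded string copied verbatim from the post.
ENCODED_LINK = (
    "aHR0cDovL3d3dy5jdXN0b21zLmdvdi5jbi9jdXN0b21zLzMwMjI0OS8z"
    "MDIyNzAvMzAyMjcyL2luZGV4Lmh0bWw="
)


def decode_link(encoded: str) -> str:
    """Decode a Base64 string into a UTF-8 URL (hypothetical helper)."""
    return base64.b64decode(encoded).decode("utf-8")


# The decoded value is the page that the scraper would open.
target_url = decode_link(ENCODED_LINK)
print(target_url)
```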

Anti-scraping measures: Jiasule (加速乐) + "Shu 5" (数5)

I'm only an intern working on this, so I'll skip the chatter and go straight to the browser-automation showcase:

import hashlib
import logging
import re
import redis
from lxml import etree
from datetime import datetime
from pymongo import MongoClient
from playwright.sync_api import Playwright, sync_playwright


# MongoDB connection; host, database and collection names are redacted ("xx").
client = MongoClient("mongodb://xx:xx/")
db = client["xx"]
collection = db["xx"]


def run(playwright: Playwright) -> None:
    browser = playwright.chromium.launch(
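
The listing is cut off at the `launch(` call, so the rest of the author's script is not shown. Judging only from the `hashlib` and `redis` imports, scraped items are probably fingerprinted and deduplicated before being written to MongoDB; that is an assumption, not something the post confirms. A minimal sketch of the idea, with an in-memory set standing in for Redis, SHA-256 chosen arbitrarily as the hash, and every name (`fingerprint`, `SeenStore`, `filter_new`) hypothetical:

```python
import hashlib


def fingerprint(url: str, title: str) -> str:
    """Build a stable hex digest for one article (hash choice is an assumption)."""
    raw = f"{url}|{title}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class SeenStore:
    """In-memory stand-in for a Redis set; not part of the original script."""

    def __init__(self) -> None:
        self._seen = set()

    def add_if_new(self, key: str) -> bool:
        """Return True only the first time a key is added."""
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


def filter_new(items, store):
    """Keep only the items whose fingerprint has not been stored before."""
    fresh = []
    for item in items:
        if store.add_if_new(fingerprint(item["url"], item["title"])):
            fresh.append(item)
    return fresh
```

With a real Redis client, `sadd` returns the number of members actually added, so a check such as `r.sadd(key, fp) == 1` could play the role of `add_if_new`; the key name and the client setup would be up to you.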
     
