How can a Python crawler fetch articles from a large news site and export them as PDF or Word files?

Here we use the well-known 商业新知 site, https://www.shangyexinzhi.com/, as the example for the code. The approach is walked through below.

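Part 1: Analyzing the site structure

To analyze a site's structure with Python, the usual steps are:

  1. Fetch the site's HTML: use the requests library to download the HTML source.
  2. Parse the HTML: use the BeautifulSoup library to parse it and extract structural information such as the navigation bar, links, and headings.
  3. Analyze the structure: use the extracted HTML elements to work out the site's layout and structure.

The sample code below shows how to analyze the structure of the 商业新知 site with Python: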

Python code example

import requests
from bs4 import BeautifulSoup

# Target site URL
url = "https://www.shangyexinzhi.com/"

# Send the HTTP request; the timeout keeps a stalled connection from hanging the script
response = requests.get(url, timeout=10)
response.encoding = "utf-8"  # assume the page is served as UTF-8

# Stop if the request failed
if response.status_code != 200:
    raise SystemExit(f"Failed to retrieve the webpage (status {response.status_code})")
html_content = response.text

# Parse the HTML with BeautifulSoup
soup = BeautifulSoup(html_content, "html.parser")

# Extract the page title (the <title> tag may be missing)
title_tag = soup.find("title")
title = title_tag.text.strip() if title_tag else "(no title)"
print(f"Website Title: {title}")

# Extract the navigation links
nav_links = soup.find_all("a", class_="nav-link")  # assumes the nav links carry this class
print("\nNavigation Links:")
for link in nav_links:
    print(f"{link.text.strip()} -> {link.get('href')}")

# Extract all level-1 headings (H1)
h1_tags = soup.find_all("h1")
print("\nH1 Tags:")
for h1 in h1_tags:
    print(h1.text.strip())

# Extract all level-2 headings (H2)
h2_tags = soup.find_all("h2")
print("\nH2 Tags:")
for h2 in h2_tags:
    print(h2.text.strip())

# Extract every link that has both an href and visible text
all_links = soup.find_all("a")
print("\nAll Links:")
for link in all_links:
    href = link.get("href")
    text = link.text.strip()
    if href and text:
        print(f"{text} -> {href}")

# Extract the footer
footer = soup.find("footer")  # assumes the page has a <footer> tag
if footer:
    print("\nFooter Content:")
    print(footer.text.strip())
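
The link listing above prints raw href values, which are often relative paths. As a rough sketch (not part of the original script), the helper below resolves them with urljoin and groups same-site links by the first path segment, which can give a quick picture of the site's sections. The function name group_links_by_section and the sample hrefs are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urljoin, urlparse


def group_links_by_section(base_url, hrefs):
    """Group same-site links by the first segment of their path."""
    base_host = urlparse(base_url).netloc
    sections = defaultdict(set)
    for href in hrefs:
        if not href or href.startswith(("#", "javascript:", "mailto:")):
            continue  # skip empty hrefs, in-page anchors and non-HTTP schemes
        absolute = urljoin(base_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.netloc != base_host:
            continue  # ignore links that leave the site
        segments = [s for s in parsed.path.split("/") if s]
        section = segments[0] if segments else "(root)"
        sections[section].add(absolute)
    return {name: sorted(urls) for name, urls in sections.items()}


if __name__ == "__main__":
    # Hypothetical hrefs standing in for the values collected from soup.find_all("a")
    sample = ["/article/123", "/article/456", "/about", "#top", "https://example.com/x"]
    grouped = group_links_by_section("https://www.shangyexinzhi.com/", sample)
    for name, urls in grouped.items():
        print(name, len(urls))
```

With the earlier script, the hrefs could come from `[a.get("href") for a in soup.find_all("a")]`.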

Results

Running the site-analysis script prints the following information:

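<ol>
<li><strong>Site title</strong>: the content of the <code><title></code> tag.</li>
<li><strong>Navigation links</strong>: every link in the navigation bar, with its text.</li>
<li><strong>Level-1 (H1) and level-2 (H2) headings</strong>: the content of every <code><h1></code> and <code><h2></code> tag on the page.</li>
<li><strong>All links</strong>: the <code>href</code> attribute and text of every <code><a></code> tag on the page.</li>
<li><strong>Footer</strong>: the content of the <code><footer></code> tag.</li>
</ol>
<h4>Notes</h4>
<ol>
<li><strong>Dynamic content</strong>: if the site loads its content with JavaScript, <code>requests</code> and <code>BeautifulSoup</code> alone may not see the full page. In that case <code>Selenium</code> can be used to drive a real browser.</li>
<li><strong>Changing site structure</strong>: the site's HTML can change at any time, so the code may need adjusting.</li>
<li><strong>Follow robots.txt</strong>: when crawling, follow the rules in the site's <code>robots.txt</code> file and do not violate the site's terms of use.</li>
</ol>
<h3><em>Part 2: Scraping the site's articles</em></h3>
<p>To scrape articles from the 商业新知 site, follow the steps below. They draw on recent search results, with the aim of keeping the approach current and compliant.</p>
<h4>1. Follow the robots protocol</h4>
<p>Before crawling, check the target site's <code>robots.txt</code> file to see which pages may be crawled. The file is at:</p>
<pre><code>https://www.shangyexinzhi.com/robots.txt
</code></pre>
<p>Make sure your crawler does not visit disallowed pages.</p>
<h4>2. Analyze the site structure</h4>
<p>Open the 商业新知 site and inspect the HTML of an article page with the browser's developer tools (F12). For example:</p>
<ul>
<li>The article list may sit inside a particular <code><div></code> or <code><ul></code> tag.</li>
<li>Each article's title and link may be in an <code><a></code> tag.</li>
<li>The article body may be in an <code><article></code> or <code><div></code> tag.</li>
</ul>
<h4>3. 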
Scrape articles with Python and BeautifulSoup</h4>
<p>Below is a simple crawler based on Python and BeautifulSoup that collects article titles and links:</p>
<pre><code class="prism language-python">import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# Target site URL
base_url = "https://www.shangyexinzhi.com"

# Request the home page; the timeout avoids hanging on a stalled connection
response = requests.get(base_url, timeout=10)
response.encoding = "utf-8"

# Check that the request succeeded
if response.status_code == 200:
    soup = BeautifulSoup(response.text, "html.parser")

    # Find the article links; "article-link" is a placeholder class, adjust it to the real HTML
    articles = soup.find_all("a", class_="article-link")

    # Extract each article's title and link
    for article in articles:
        title = article.get_text(strip=True)
        link = article.get("href")
        if not link:
            continue  # skip anchors without an href
        full_link = urljoin(base_url, link)  # resolves relative links
        print(f"Title: {title}\nLink: {full_link}\n")
else:
    print("Failed to retrieve the webpage")
</code></pre>
<h4>4. 
Handle dynamic content (optional)</h4>
<p>If the article content is loaded dynamically with JavaScript, Selenium can drive a real browser to render it. A Selenium example:</p>
<pre><code class="prism language-python">from urllib.parse import urljoin

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

base_url = "https://www.shangyexinzhi.com"

# Set up the Selenium WebDriver
service = Service(executable_path='/path/to/chromedriver')  # replace with your chromedriver path
driver = webdriver.Chrome(service=service)

try:
    # Open the target page
    driver.get(base_url)
    # Wait up to 10 seconds for at least one link to be rendered
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.TAG_NAME, "a")))

    # Parse the rendered page source
    soup = BeautifulSoup(driver.page_source, "html.parser")

    # "article-link" is a placeholder class; adjust it to the real HTML
    articles = soup.find_all("a", class_="article-link")
    for article in articles:
        title = article.get_text(strip=True)
        link = article.get("href")
        if not link:
            continue  # skip anchors without an href
        print(f"Title: {title}\nLink: {urljoin(base_url, link)}\n")
finally:
    driver.quit()
</code></pre>
<h4>5. Save the article content</h4>
<p>Using the extracted links, request each article's full content and save it locally or to a database.</p>
<h4>Notes</h4>
<ol>
<li><strong>Follow the robots protocol</strong>: make sure the pages you crawl are not disallowed by <code>robots.txt</code>.</li>
<li><strong>Space out requests</strong>: avoid putting excessive load on the server.</li>
<li><strong>Dynamic content</strong>: if the page content is loaded dynamically, consider Selenium first.</li>
<li><strong>Respect the site owner's wishes</strong>: if the site explicitly forbids crawling, stop.</li>
</ol>
<h3><em><strong>Part 3: Scraping all of the site's articles and exporting them as Word or PDF</strong></em></h3>
<p>To scrape all article content from the 商业新知 site and export it as Word or PDF, follow these steps:</p>
<h4><strong>Step 1: Scrape the article content</strong></h4>
<ol>
<li><strong>Analyze the site structure</strong>: use the browser's developer tools (F12) to inspect an article page's HTML and identify the tags that hold the title, body, and other fields.</li>
<li><strong>Write the crawler</strong>: use Python's <code>requests</code> and <code>BeautifulSoup</code> libraries to fetch the article content. If the page content is loaded dynamically, <code>Selenium</code> can be used instead.</li>
</ol>
<h5>Example code:</h5>
<pre><code class="prism language-python">import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Create the output directory
os.makedirs("articles", exist_ok=True)

# Fetch the article list
def get_article_list(url):
    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.content, "html.parser")
    articles = soup.find_all("a", class_="article-link")  # adjust to the real HTML
    return [(a.text.strip(), a["href"]) for a in articles if a.get("href")]

# Fetch a single article
def get_article_content(url):
    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.content, "html.parser")
    title = soup.find("h1").text.strip()  # article title
    content = soup.find("div", class_="article-content").text.strip()  # article body
    return title, content

# Save an article as a text file
def save_article(title, content):
    filename = f"articles/{title}.txt"
    with open(filename, "w", encoding="utf-8") as f:
        f.write(content)
    print(f"Saved: {filename}")

# Main function
def main():
    base_url = "https://www.shangyexinzhi.com/"
    articles = get_article_list(base_url)
    for title, link in articles:
        full_url = urljoin(base_url, link)  # avoids a doubled slash when link starts with "/"
        article_title, article_content = get_article_content(full_url)
        save_article(article_title, article_content)

if __name__ == "__main__":
    main()
</code></pre>
<h4><strong>Step 2: Export the article content as Word or PDF</strong></h4>
<ol>
<li><strong>Export to Word</strong>: use the <code>python-docx</code> library to save the article content as a Word document.</li>
<li><strong>Export to PDF</strong>: use the <code>pdfkit</code> library to convert HTML content to PDF.</li>
</ol>
<h5>Example code (export to Word):</h5>
<pre><code class="prism language-python">from docx import Document

def save_as_word(title, content):
    doc = Document()
    doc.add_heading(title, level=1)
    doc.add_paragraph(content)
    filename = f"articles/{title}.docx"
    doc.save(filename)
    print(f"Saved as Word: {filename}")
</code></pre>
<h5>Example code (export to PDF):</h5>
<pre><code class="prism language-python">import html

import pdfkit

def save_as_pdf(title, content):
    # Escape the scraped text so stray "<" or "&" characters do not break the markup
    html_content = f"<h1>{html.escape(title)}</h1><p>{html.escape(content)}</p>"
    filename = f"articles/{title}.pdf"
    # Declare UTF-8 so that Chinese text is rendered correctly
    pdfkit.from_string(html_content, filename, options={"encoding": "UTF-8"})
    print(f"Saved as PDF: {filename}")
</code></pre>
<h4><strong>Step 3: Put the code together</strong></h4>
<p>Combine the code above in the main function to scrape the articles and export them as Word or PDF.</p>
<pre><code class="prism language-python">def main():
    base_url = "https://www.shangyexinzhi.com/"
    articles = get_article_list(base_url)
    for title, link in articles:
        full_url = urljoin(base_url, link)
        article_title, article_content = get_article_content(full_url)
        save_as_word(article_title, article_content)  # save as Word
        save_as_pdf(article_title, article_content)  # save as PDF
</code></pre>
<h4><strong>Notes</strong></h4>
<ol> 
<li><strong>Follow the robots protocol</strong>: make sure the pages you crawl are not disallowed by <code>robots.txt</code>.</li>
<li><strong>Dynamic content</strong>: if the article content is loaded dynamically, prefer <code>Selenium</code>.</li>
<li><strong>File names</strong>: make sure file names contain no special characters, or saving may fail.</li>
</ol>
<p>With the steps above you can scrape all article content from the 商业新知 site and export it as Word or PDF. If you run into problems, relevant technical blogs may help.</p>
<h3><em>Part 4: The complete code for running the Python crawler</em></h3>
<p>The code above cannot be run as is, because it was written against an assumed HTML structure that may differ from the site's real one. It also lacks some required configuration and dependencies. The adjustments and additions below are needed for it to run:</p>
<h4>1. <strong>Adjust the HTML selectors</strong></h4>
<p>The code uses assumed HTML selectors (such as <code>class_="article-link"</code> and <code>class_="article-content"</code>). Adjust them to the site's actual HTML structure as follows:</p>
<h5>Inspect the HTML structure</h5>
<ol>
<li>Open the 商业新知 site.</li>
<li>Use the browser's developer tools (F12) to inspect the actual HTML of the article list and the article body.</li>
<li>Find the HTML tags and class names that hold the article title, link, and content.</li>
</ol>
<h5>Example adjustment</h5>
<p>Suppose the article list has this HTML structure:</p>
<pre><code class="prism language-html"><div class="article-list">
    <a href="/article/123" class="article-title">Article title</a>
    ...
</div>
</code></pre>
<p>and the article page has this one:</p>
<pre><code class="prism language-html"><article>
    <h1>Article title</h1>
    <div class="content">Article content</div>
</article>
</code></pre>
<p>The selectors in the code then become:</p>
<pre><code class="prism language-python">articles = soup.find_all("a", class_="article-title")  # article list
title = soup.find("h1").text.strip()  # article title
content = soup.find("div", class_="content").text.strip()  # article body
</code></pre>
<h4>2. <strong>Install the required Python libraries</strong></h4>
<p>The code uses several Python libraries (<code>requests</code>, <code>BeautifulSoup</code>, <code>pdfkit</code>, <code>python-docx</code>, and so on). Make sure they are installed, for example with:</p>
<pre><code class="prism language-bash">pip install requests beautifulsoup4 pdfkit python-docx
</code></pre>
<h4>3. <strong>Configure pdfkit</strong></h4>
<p><code>pdfkit</code> needs the <code>wkhtmltopdf</code> tool as its backend to convert HTML to PDF. Install <code>wkhtmltopdf</code> first:</p>
<ul>
<li>
<p><strong>Windows</strong>: download and install wkhtmltopdf.</p>
</li>
<li>
<p><strong>macOS/Linux</strong>: install it with a package manager:</p>
<pre><code class="prism language-bash">brew install wkhtmltopdf  # macOS
sudo apt-get install wkhtmltopdf  # Ubuntu
</code></pre>
</li>
</ul>
<p>After installing, make sure the <code>wkhtmltopdf</code> executable is on the system PATH.</p>
<h4>4. 
<strong>Handle special characters in file names</strong></h4>
<p>File names may contain special characters (such as <code>/</code>, <code>\</code>, and <code>:</code>), which cause errors when saving. Clean the file name first:</p>
<pre><code class="prism language-python">import re

def clean_filename(filename):
    return re.sub(r'[\\/*?:"<>|]', "", filename)
</code></pre>
<p>Use <code>clean_filename</code> when building the file name:</p>
<pre><code class="prism language-python">filename = f"articles/{clean_filename(title)}.docx"
</code></pre>
<h4>5. 
<strong>Complete code example</strong></h4> <p>The adjusted script is shown below. The CSS selectors (<code>article-title</code>, <code>content</code>) are placeholders and must be changed to match the site's actual HTML:</p> <pre><code class="prism language-python">import html
import os
import re
from urllib.parse import urljoin

import pdfkit
import requests
from bs4 import BeautifulSoup
from docx import Document

# Create the output directory if it does not exist yet
os.makedirs("articles", exist_ok=True)


# Remove characters that are invalid in file names
def clean_filename(filename):
    return re.sub(r'[\\/*?:"&lt;&gt;|]', "", filename)


# Download a page and parse it; raise on HTTP errors instead of parsing an error page
def fetch_soup(url):
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return BeautifulSoup(response.content, "html.parser")


# Collect (title, link) pairs from the article list page
def get_article_list(url):
    soup = fetch_soup(url)
    # Adjust the selector to the site's actual HTML structure
    articles = soup.find_all("a", class_="article-title", href=True)
    return [(a.text.strip(), a["href"]) for a in articles]


# Fetch the title and body text of a single article
def get_article_content(url):
    soup = fetch_soup(url)
    title_tag = soup.find("h1")  # article title
    content_tag = soup.find("div", class_="content")  # article body
    if title_tag is None or content_tag is None:
        return None  # the selectors did not match this page
    return title_tag.text.strip(), content_tag.text.strip()


# Save an article as a Word document
def save_as_word(title, content):
    doc = Document()
    doc.add_heading(title, level=1)
    doc.add_paragraph(content)
    filename = f"articles/{clean_filename(title)}.docx"
    doc.save(filename)
    print(f"Saved as Word: {filename}")


# Save an article as a PDF document
def save_as_pdf(title, content):
    # Escape the text and declare UTF-8 so non-ASCII characters are not mis-decoded
    html_content = (
        '&lt;meta charset="utf-8"&gt;'
        f"&lt;h1&gt;{html.escape(title)}&lt;/h1&gt;&lt;p&gt;{html.escape(content)}&lt;/p&gt;"
    )
    filename = f"articles/{clean_filename(title)}.pdf"
    pdfkit.from_string(html_content, filename)
    print(f"Saved as PDF: {filename}")


# Entry point
def main():
    base_url = "https://www.shangyexinzhi.com/"
    articles = get_article_list(base_url)
    for _, link in articles:
        # urljoin handles both relative and absolute links
        full_url = urljoin(base_url, link)
        result = get_article_content(full_url)
        if result is None:
            print(f"Skipped (selectors did not match): {full_url}")
            continue
        article_title, article_content = result
        save_as_word(article_title, article_content)  # save as Word
        save_as_pdf(article_title, article_content)  # save as PDF


if __name__ == "__main__":
    main()
</code></pre> <h4>6. <strong>Run the code</strong></h4> <p>Before running the script, make sure that:</p> <ul> <li>All required Python libraries are installed (<code>requests</code>, <code>beautifulsoup4</code>, <code>python-docx</code>, <code>pdfkit</code>).</li> <li><code>wkhtmltopdf</code> is installed and its path is configured.</li> <li>The selectors have been adjusted to the site's actual HTML structure.</li> </ul> <p>After the script runs, each article is saved in both Word and PDF format in the <code>articles</code> folder. If something fails, adjust the code based on the error message, or check whether the site's HTML structure has changed.</p> </div> </div> </div> </div> </div> <!--PC和WAP自适应版--> <div id="SOHUCS" sid="1883060695470239744"></div> <script type="text/javascript" src="/views/front/js/chanyan.js"></script> <!-- 文章页-底部 动态广告位 --> <div class="youdao-fixed-ad" id="detail_ad_bottom"></div> </div> <div class="col-md-3"> <div class="row" id="ad"> <!-- 文章页-右侧1 动态广告位 --> <div id="right-1" class="col-lg-12 col-md-12 col-sm-4 col-xs-4 ad"> <div class="youdao-fixed-ad" id="detail_ad_1"> </div> </div> <!-- 文章页-右侧2 动态广告位 --> <div id="right-2" class="col-lg-12 col-md-12 col-sm-4 col-xs-4 ad"> <div class="youdao-fixed-ad" id="detail_ad_2"></div> </div> <!-- 文章页-右侧3 动态广告位 --> <div id="right-3" class="col-lg-12 col-md-12 col-sm-4 col-xs-4 ad"> <div class="youdao-fixed-ad" id="detail_ad_3"></div> </div> </div> </div> </div> </div> </div> <div class="container"> <h4 class="pt20 mb15 mt0 border-top">你可能感兴趣的:(深度学习,python,网络爬虫,自然语言处理)</h4> <div id="paradigm-article-related"> <div class="recommend-post mb30"> <ul class="widget-links"> <li><a href="/article/1950233451282100224.htm" title="python 读excel每行替换_Python脚本操作Excel实现批量替换功能" target="_blank">python 读excel每行替换_Python脚本操作Excel实现批量替换功能</a> <span class="text-muted">weixin_39646695</span> <a class="tag" taget="_blank" 
href="/search/python/1.htm">python</a> </li> <li><a href="/article/2287.htm" title="[原创]JWFD v0.96 工作流系统二次开发包 for Eclipse 简要说明" target="_blank">[原创]JWFD v0.96 
工作流系统二次开发包 for Eclipse 简要说明</a> <span class="text-muted">comsci</span> <a class="tag" taget="_blank" href="/search/eclipse/1.htm">eclipse</a><a class="tag" taget="_blank" href="/search/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/1.htm">设计模式</a><a class="tag" taget="_blank" href="/search/%E7%AE%97%E6%B3%95/1.htm">算法</a><a class="tag" taget="_blank" href="/search/%E5%B7%A5%E4%BD%9C/1.htm">工作</a><a class="tag" taget="_blank" href="/search/swing/1.htm">swing</a> <div>                       JWFD v0.96 工作流系统二次开发包 for Eclipse 简要说明     &nb</div> </li> <li><a href="/article/2414.htm" title="SecureCRT右键粘贴的设置" target="_blank">SecureCRT右键粘贴的设置</a> <span class="text-muted">daizj</span> <a class="tag" taget="_blank" href="/search/secureCRT/1.htm">secureCRT</a><a class="tag" taget="_blank" href="/search/%E5%8F%B3%E9%94%AE/1.htm">右键</a><a class="tag" taget="_blank" href="/search/%E7%B2%98%E8%B4%B4/1.htm">粘贴</a> <div>一般都习惯鼠标右键自动粘贴的功能,对于SecureCRT6.7.5 ,这个功能也已经是默认配置了。 老版本的SecureCRT其实也有这个功能,只是不是默认设置,很多人不知道罢了。 菜单: Options->Global Options ...->Terminal 右边有个Mouse的选项块。 Copy on Select Paste on Right/Middle</div> </li> <li><a href="/article/2541.htm" title="Linux 软链接和硬链接" target="_blank">Linux 软链接和硬链接</a> <span class="text-muted">dongwei_6688</span> <a class="tag" taget="_blank" href="/search/linux/1.htm">linux</a> <div>1.Linux链接概念Linux链接分两种,一种被称为硬链接(Hard Link),另一种被称为符号链接(Symbolic Link)。默认情况下,ln命令产生硬链接。 【硬连接】硬连接指通过索引节点来进行连接。在Linux的文件系统中,保存在磁盘分区中的文件不管是什么类型都给它分配一个编号,称为索引节点号(Inode Index)。在Linux中,多个文件名指向同一索引节点是存在的。一般这种连</div> </li> <li><a href="/article/2668.htm" title="DIV底部自适应" target="_blank">DIV底部自适应</a> <span class="text-muted">dcj3sjt126com</span> <a class="tag" taget="_blank" href="/search/JavaScript/1.htm">JavaScript</a> <div><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml&q</div> </li> <li><a href="/article/2795.htm" title="Centos6.5使用yum安装mysql——快速上手必备" 
target="_blank">Centos6.5使用yum安装mysql——快速上手必备</a> <span class="text-muted">dcj3sjt126com</span> <a class="tag" taget="_blank" href="/search/mysql/1.htm">mysql</a> <div>第1步、yum安装mysql [root@stonex ~]#  yum -y install mysql-server 安装结果: Installed:     mysql-server.x86_64 0:5.1.73-3.el6_5                   &nb</div> </li> <li><a href="/article/2922.htm" title="如何调试JDK源码" target="_blank">如何调试JDK源码</a> <span class="text-muted">frank1234</span> <a class="tag" taget="_blank" href="/search/jdk/1.htm">jdk</a> <div>相信各位小伙伴们跟我一样,想通过JDK源码来学习Java,比如collections包,java.util.concurrent包。 可惜的是sun提供的jdk并不能查看运行中的局部变量,需要重新编译一下rt.jar。 下面是编译jdk的具体步骤:         1.把C:\java\jdk1.6.0_26\sr</div> </li> <li><a href="/article/3049.htm" title="Maximal Rectangle" target="_blank">Maximal Rectangle</a> <span class="text-muted">hcx2013</span> <a class="tag" taget="_blank" href="/search/max/1.htm">max</a> <div>Given a 2D binary matrix filled with 0's and 1's, find the largest rectangle containing all ones and return its area.   
public class Solution { public int maximalRectangle(char[][] matrix)</div> </li> <li><a href="/article/3176.htm" title="Spring MVC测试框架详解——服务端测试" target="_blank">Spring MVC测试框架详解——服务端测试</a> <span class="text-muted">jinnianshilongnian</span> <a class="tag" taget="_blank" href="/search/spring+mvc+test/1.htm">spring mvc test</a> <div>随着RESTful Web Service的流行,测试对外的Service是否满足期望也变的必要的。从Spring 3.2开始Spring了Spring Web测试框架,如果版本低于3.2,请使用spring-test-mvc项目(合并到spring3.2中了)。   Spring MVC测试框架提供了对服务器端和客户端(基于RestTemplate的客户端)提供了支持。 &nbs</div> </li> <li><a href="/article/3303.htm" title="Linux64位操作系统(CentOS6.6)上如何编译hadoop2.4.0" target="_blank">Linux64位操作系统(CentOS6.6)上如何编译hadoop2.4.0</a> <span class="text-muted">liyong0802</span> <a class="tag" taget="_blank" href="/search/hadoop/1.htm">hadoop</a> <div>一、准备编译软件   1.在官网下载jdk1.7、maven3.2.1、ant1.9.4,解压设置好环境变量就可以用。     环境变量设置如下:   (1)执行vim /etc/profile (2)在文件尾部加入: export JAVA_HOME=/home/spark/jdk1.7 export MAVEN_HOME=/ho</div> </li> <li><a href="/article/3430.htm" title="StatusBar 字体白色" target="_blank">StatusBar 字体白色</a> <span class="text-muted">pangyulei</span> <a class="tag" taget="_blank" href="/search/status/1.htm">status</a> <div> [[UIApplication sharedApplication] setStatusBarStyle:UIStatusBarStyleLightContent]; /*you'll also need to set UIViewControllerBasedStatusBarAppearance to NO in the plist file if you use this method</div> </li> <li><a href="/article/3557.htm" title="如何分析Java虚拟机死锁" target="_blank">如何分析Java虚拟机死锁</a> <span class="text-muted">sesame</span> <a class="tag" taget="_blank" href="/search/java/1.htm">java</a><a class="tag" taget="_blank" href="/search/thread/1.htm">thread</a><a class="tag" taget="_blank" href="/search/oracle/1.htm">oracle</a><a class="tag" taget="_blank" href="/search/%E8%99%9A%E6%8B%9F%E6%9C%BA/1.htm">虚拟机</a><a class="tag" taget="_blank" href="/search/jdbc/1.htm">jdbc</a> <div>英文资料: Thread Dump and Concurrency Locks   Thread dumps are very useful for diagnosing synchronization related problems such 
as deadlocks on object monitors. Ctrl-\ on Solaris/Linux or Ctrl-B</div> </li> <li><a href="/article/3684.htm" title="位运算简介及实用技巧(一):基础篇" target="_blank">位运算简介及实用技巧(一):基础篇</a> <span class="text-muted">tw_wangzhengquan</span> <a class="tag" taget="_blank" href="/search/%E4%BD%8D%E8%BF%90%E7%AE%97/1.htm">位运算</a> <div>http://www.matrix67.com/blog/archives/263    去年年底写的关于位运算的日志是这个Blog里少数大受欢迎的文章之一,很多人都希望我能不断完善那篇文章。后来我看到了不少其它的资料,学习到了更多关于位运算的知识,有了重新整理位运算技巧的想法。从今天起我就开始写这一系列位运算讲解文章,与其说是原来那篇文章的follow-up,不如说是一个r</div> </li> <li><a href="/article/3811.htm" title="jsearch的索引文件结构" target="_blank">jsearch的索引文件结构</a> <span class="text-muted">yangshangchuan</span> <a class="tag" taget="_blank" href="/search/%E6%90%9C%E7%B4%A2%E5%BC%95%E6%93%8E/1.htm">搜索引擎</a><a class="tag" taget="_blank" href="/search/jsearch/1.htm">jsearch</a><a class="tag" taget="_blank" href="/search/%E5%85%A8%E6%96%87%E6%A3%80%E7%B4%A2/1.htm">全文检索</a><a class="tag" taget="_blank" href="/search/%E4%BF%A1%E6%81%AF%E6%A3%80%E7%B4%A2/1.htm">信息检索</a><a class="tag" taget="_blank" href="/search/word%E5%88%86%E8%AF%8D/1.htm">word分词</a> <div>jsearch是一个高性能的全文检索工具包,基于倒排索引,基于java8,类似于lucene,但更轻量级。   jsearch的索引文件结构定义如下:     1、一个词的索引由=分割的三部分组成:        第一部分是词        第二部分是这个词在多少</div> </li> </ul> </div> </div> </div> <div> <div class="container"> <div class="indexes"> <strong>按字母分类:</strong> <a href="/tags/A/1.htm" target="_blank">A</a><a href="/tags/B/1.htm" target="_blank">B</a><a href="/tags/C/1.htm" target="_blank">C</a><a href="/tags/D/1.htm" target="_blank">D</a><a href="/tags/E/1.htm" target="_blank">E</a><a href="/tags/F/1.htm" target="_blank">F</a><a href="/tags/G/1.htm" target="_blank">G</a><a href="/tags/H/1.htm" target="_blank">H</a><a href="/tags/I/1.htm" target="_blank">I</a><a href="/tags/J/1.htm" target="_blank">J</a><a href="/tags/K/1.htm" target="_blank">K</a><a href="/tags/L/1.htm" target="_blank">L</a><a href="/tags/M/1.htm" target="_blank">M</a><a href="/tags/N/1.htm" target="_blank">N</a><a 
href="/tags/O/1.htm" target="_blank">O</a><a href="/tags/P/1.htm" target="_blank">P</a><a href="/tags/Q/1.htm" target="_blank">Q</a><a href="/tags/R/1.htm" target="_blank">R</a><a href="/tags/S/1.htm" target="_blank">S</a><a href="/tags/T/1.htm" target="_blank">T</a><a href="/tags/U/1.htm" target="_blank">U</a><a href="/tags/V/1.htm" target="_blank">V</a><a href="/tags/W/1.htm" target="_blank">W</a><a href="/tags/X/1.htm" target="_blank">X</a><a href="/tags/Y/1.htm" target="_blank">Y</a><a href="/tags/Z/1.htm" target="_blank">Z</a><a href="/tags/0/1.htm" target="_blank">其他</a> </div> </div> </div> <footer id="footer" class="mb30 mt30"> <div class="container"> <div class="footBglm"> <a target="_blank" href="/">首页</a> - <a target="_blank" href="/custom/about.htm">关于我们</a> - <a target="_blank" href="/search/Java/1.htm">站内搜索</a> - <a target="_blank" href="/sitemap.txt">Sitemap</a> - <a target="_blank" href="/custom/delete.htm">侵权投诉</a> </div> <div class="copyright">版权所有 IT知识库 CopyRight © 2000-2050 E-COM-NET.COM , All Rights Reserved. <!-- <a href="https://beian.miit.gov.cn/" rel="nofollow" target="_blank">京ICP备09083238号</a><br>--> </div> </div> </footer> <!-- 代码高亮 --> <script type="text/javascript" src="/static/syntaxhighlighter/scripts/shCore.js"></script> <script type="text/javascript" src="/static/syntaxhighlighter/scripts/shLegacy.js"></script> <script type="text/javascript" src="/static/syntaxhighlighter/scripts/shAutoloader.js"></script> <link type="text/css" rel="stylesheet" href="/static/syntaxhighlighter/styles/shCoreDefault.css"/> <script type="text/javascript" src="/static/syntaxhighlighter/src/my_start_1.js"></script> </body> </html>