
Python crawler: a code example for batch-downloading the Zabbix documentation

2019-11-25 11:54:45
Source: reposted
Contributor: site user

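This article presents a Python crawler that batch-downloads the Zabbix documentation as PDF files. The complete script is given as example code and may be a useful reference for study or work.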

# -*- coding: UTF-8 -*-
import re
import time

import requests

# Index page of the Chinese Zabbix 3.4 manual, and the prefix for page routes.
url = 'https://www.zabbix.com/documentation/3.4/zh/manual'
base_url = 'https://www.zabbix.com/documentation/3.4/'
seconds = 1    # pause between downloads, in seconds
err_url = []   # export URLs that could not be downloaded


def get_urls():
    # Collect every manual route registered by the index menu script.
    # The menu id in the pattern is hard-coded and may differ on other pages.
    res = requests.get(url)
    content = res.text
    pattern = re.compile(r"indexmenu_4848130395ca30b274d8bd.add[(]'(zh/manual.*?)[']", re.S)
    routes = pattern.findall(content)
    urls = [base_url + item for item in routes]
    return urls


def download(url):
    download_url = url + "?do=export_pdf"
    print("Current download URL:")
    print(download_url)
    # Fetch the HTML page first; its <title> becomes the PDF file name.
    res = requests.get(url)
    if res.status_code == 200:
        pattern = re.compile(r"<title>(.*?)</title>", re.S)
        # Keep the title as str: calling .encode() here yields bytes on
        # Python 3, and bytes.replace() rejects str arguments.
        title = pattern.findall(res.text)[0]
        # Replace characters that are not allowed in file names.
        filename = title.replace('//', '-')
        for ch in '/"*?:<>|':
            filename = filename.replace(ch, '-')
        file = filename + '.pdf'
        res = requests.get(download_url)
        if res.status_code == 200:
            with open(file, "wb") as f:
                f.write(res.content)
            print('Download succeeded')
        else:
            print('Download failed')
            err_url.append(download_url)
    else:
        print('Could not get the file name; skipping this download')
        err_url.append(download_url)


def downloads(urls):
    for url in urls:
        download(url)
        time.sleep(seconds)
    if len(err_url):
        print("URLs that failed to download:")
        print(err_url)


def main():
    print("Download started")
    urls = get_urls()
    downloads(urls)
    print("Download finished")


if __name__ == '__main__':
    main()
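The two steps of the script that do not touch the network, extracting routes from the index page and cleaning the page title into a file name, can be tried offline. The following is a minimal sketch under stated assumptions, not the original author's code: the helper names `extract_urls` and `safe_filename` and the `SAMPLE_HTML` snippet are hypothetical, the pattern accepts any `indexmenu_...` id rather than the hard-coded one, and the backslash is added to the set of replaced characters.

```python
import re

BASE_URL = 'https://www.zabbix.com/documentation/3.4/'

# Assumption: the menu id suffix is not fixed, so accept any word characters.
ROUTE_PATTERN = re.compile(r"indexmenu_\w+\.add\('(zh/manual.*?)'", re.S)

# Characters replaced by the script, plus the backslash (added here).
UNSAFE_CHARS = re.compile(r'[\\/"*?:<>|]')


def extract_urls(html, base_url=BASE_URL):
    """Return an absolute URL for every manual route found in html."""
    return [base_url + route for route in ROUTE_PATTERN.findall(html)]


def safe_filename(title):
    """Replace each character that is invalid in a file name with '-'."""
    return UNSAFE_CHARS.sub('-', title.strip())


# Hypothetical snippet shaped like the menu script the regex targets.
SAMPLE_HTML = (
    "indexmenu_abc123.add('zh/manual/introduction', 1, 0);\n"
    "indexmenu_abc123.add('zh/manual/installation', 2, 0);\n"
)

if __name__ == '__main__':
    for page_url in extract_urls(SAMPLE_HTML):
        print(page_url)
    print(safe_filename('a/b:c?') + '.pdf')
```

Keeping these helpers free of `requests` calls makes it possible to adjust the pattern against a saved copy of the index page before running the full download loop.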

That is all for this article. We hope it helps with your studies, and we hope you will continue to support 武林网.
