Python crawler: download a whole page of beauty photos in one click

2020-06-28 13:52:28 Source: 易采站长站 Author: compiled by 易采站长站

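I'm a college senior stuck at home and bored lately, and I still can't think of a topic for my graduation project. While killing time on forums I realized I had never really built a crawler, so I decided to write a few for practice.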

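Since I'm writing a crawler anyway, I might as well crawl something interesting, so I picked a site more or less at random.
The site's structure is fairly simple: page URLs take the form index_12345.html, with a number after the underscore,
so a plain loop is enough to enumerate all of them.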
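
A minimal sketch of that loop. It assumes the first listing page is index.html and that page n (n >= 2) is index_n.html; the numbering scheme, the `build_page_urls` helper, and the example host are assumptions for illustration and are not taken from the site.

```python
def build_page_urls(base_url, last_page):
    """Return listing-page URLs for pages 1..last_page.

    Assumption: page 1 is index.html and page n (n >= 2) is index_n.html.
    """
    prefix = base_url.rsplit('/', 1)[0]  # drop the trailing 'index.html'
    urls = [base_url]
    for n in range(2, last_page + 1):
        urls.append(f'{prefix}/index_{n}.html')
    return urls


# Hypothetical host, used only for illustration
for url in build_page_urls('http://example.com/mnmm/index.html', 3):
    print(url)
```

If the real site numbers its pages differently, only the f-string inside the loop needs to change.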

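The plan

Visit the home page.
Collect the links to the photo-set detail pages and hand each one to a thread that downloads and saves it, then automatically follow each set's "next page" link and keep saving in a loop.
Before saving, check whether the image already exists and skip it if so; also name the folder after the detail page's title so the sets are easy to sort.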
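
The last step (skip images that are already on disk, and name the folder after the page title) can be sketched roughly as follows. The helpers `safe_dir_name` and `target_path` are hypothetical names introduced here, and replacing characters that Windows forbids in file names is an added assumption, since a raw page title may contain them.

```python
import os
import re
import tempfile


def safe_dir_name(title):
    """Turn a page title into a folder name usable on Windows and Unix."""
    # Replace the characters Windows does not allow in file names
    cleaned = re.sub(r'[\\/:*?"<>|]', '_', title).strip()
    return cleaned or 'untitled'


def target_path(base_dir, title, pic_url):
    """Return where an image should be saved, or None if it is already there."""
    folder = os.path.join(base_dir, safe_dir_name(title))
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, pic_url.split('/')[-1])
    return None if os.path.exists(path) else path


# Demonstration in a temporary directory with a hypothetical title and URL
with tempfile.TemporaryDirectory() as tmp:
    url = 'http://example.com/pics/001.jpg'
    first = target_path(tmp, 'Set 1: demo?', url)
    with open(first, 'wb') as fp:
        fp.write(b'placeholder bytes')
    second = target_path(tmp, 'Set 1: demo?', url)
    print(first is not None, second is None)
```

The second call returns None because the file written after the first call is found on disk, which is the "cancel if duplicate" behavior described in the plan.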
Result
First, run a thread to crawl the first page:

import threading  # thread support
from queue import Queue  # FIFO queue
import time
import requests
import os
from lxml import etree as et
# Request headers
headers = {
    # User agent
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36'
}
# Base URL of the pages to crawl
base_url = 'http://www.***.com/mnmm/index.html'
# Base directory for saved images
base_dir = 'E:/DateCenter/个人项目/图/'
# Save one image
def savePic(pic_url, img_dir):
    file_name = base_dir + img_dir + '/' + pic_url.split('/')[-1]
    # Create the directory if needed (exist_ok avoids an error when two threads
    # create it at the same time); return early if the file already exists
    os.makedirs(base_dir + img_dir, exist_ok=True)
    if os.path.exists(file_name):
        return
    # Fetch the image content
    response = requests.get(pic_url, headers=headers)
    # Write the image to disk in 1 KB chunks
    with open(file_name, 'wb') as fp:
        for data in response.iter_content(1024):
            fp.write(data)
# Walk through one photo set, collecting and saving its image URLs
def get_detail_queue(detail_url):
    # Queue.put() adds an element to the queue; because Queue is first-in,
    # first-out, the URL that was put first is also the first one get() returns.
    detail_url_list = Queue(maxsize=50)
    detail_url_list.put(detail_url)
    while not detail_url_list.empty():
        url = detail_url_list.get()  # Queue.get() removes and returns an element
        img_rq = requests.get(url=url, headers=headers)
        # HTTP status code of the response
        code = img_rq.status_code
        if code == 200:
            html = et.HTML(img_rq.text)
            # Collect every image URL on the page
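
The script imports threading and Queue, but this excerpt ends before any thread is started. As a rough illustration of the "hand each detail page to a thread" step from the plan, here is a sketch that uses a small fixed pool of worker threads instead of one thread per page, which is a substitution made here for simplicity. `process_detail`, `worker`, `run_all`, the worker count, and the example URLs are all hypothetical; `process_detail` only records the URL and stands in for the real download logic.

```python
import threading
from queue import Queue


def process_detail(detail_url, results, lock):
    """Stand-in for the real per-set download; it only records the URL."""
    with lock:
        results.append(detail_url)


def worker(tasks, results, lock):
    """Take detail-page URLs off the queue until a None sentinel arrives."""
    while True:
        detail_url = tasks.get()
        if detail_url is None:
            tasks.task_done()
            break
        try:
            process_detail(detail_url, results, lock)
        finally:
            tasks.task_done()


def run_all(detail_urls, num_workers=4):
    """Process every URL with a fixed number of worker threads."""
    tasks, results, lock = Queue(), [], threading.Lock()
    threads = [threading.Thread(target=worker, args=(tasks, results, lock))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for url in detail_urls:
        tasks.put(url)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so each one exits
    for t in threads:
        t.join()
    return results


# Hypothetical detail-page URLs, used only for illustration
urls = [f'http://example.com/mnmm/{n}.html' for n in range(1, 6)]
print(sorted(run_all(urls)))
```

A fixed pool keeps the number of simultaneous requests bounded, which tends to be gentler on the target site than starting one thread for every detail page.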