Python code for batch-scraping and downloading images from Baidu Images

Python tutorials 我的站长站 2022-07-27 58 reads
# @风清扬(fqy2022)
import requests
import time
import os
# Create the output folder if it does not exist yet
if os.path.isdir(r'./保存'):
    print('Folder already exists!')
else:
    os.mkdir('./保存')
    print('Folder created for you!')
 
class Image(object):
    def __init__(self):
        # URL
        self.url = 'https://image.baidu.com/search/acjson?'
        # Build the request headers
        self.headers = {
            'Cookie': 'BDqhfp=%E7%8B%97%26%260-10-1undefined%26%260%26%261; BIDUPSID=A063B6D6CC13957DA917CAA433A26251; PSTM=1583301079; MCITY=-315%3A; BDUSS=TBSSlRRQU9QbmR-MGt6NUFQa01iR3VQWHBUbnNacW9zMnJUN0N-QndGSzNkMkJnSVFBQUFBJCQAAAAAAAAAAAEAAADuVM9dw~vX1tPQybbIobXEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAALfqOGC36jhgS; BDUSS_BFESS=TBSSlRRQU9QbmR-MGt6NUFQa01iR3VQWHBUbnNacW9zMnJUN0N-QndGSzNkMkJnSVFBQUFBJCQAAAAAAAAAAAEAAADuVM9dw~vX1tPQybbIobXEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAALfqOGC36jhgS; BAIDUID=857FDC525D72D7899014BED3AB7A9EFF:FG=1; __yjs_duid=1_bd666ba46de51678e9fb98774eb68df71616750528301; BDORZ=FFFB88E999055A3F8A630C64834BD6D0; BDSFRCVID_BFESS=0X0OJeCmHlQJPareecEsuUw4D2KK0gOTHllnm4-TLeKNvakVJeC6EG0Ptf8g0KubFTPRogKK0gOTH6KF_2uxOjjg8UtVJeC6EG0Ptf8g0M5; H_BDCLCKID_SF_BFESS=fRkfoKPKfCv8qTrmbtOhq4tHePPLexRZ5mAqoJIXQCjvDR5eD4TD3J-0jhbhtPvLtnTnaIQhtqQnqnQTXPoYBpku5bOR2f743bRT2MKy5KJvfj6gjj7qhP-UyPkHWh37aGOlMKoaMp78jR093JO4y4Ldj4oxJpOJ5JbMonLafD_bhD-4Djt2eP00-xQja--XKKj2WROeajrjDnCrDhA2XUI8LUc72poZLI6H0R5J34OhSt0mQ55vyT8sXnO72P7XaRPL-pRHWhr-HJvKy4oTjxL1Db3JKjvMtg3t3qQmLUooepvoD-Jc3MvByPjdJJQOBKQB0KnGbUQkeq8CQft20b0EeMtjW6LEK5r2SCDMtC0b3D; indexPageSugList=%5B%22%E7%8B%97%22%2C%22%E4%BA%8C%E5%93%88%22%2C%22%E9%87%87%E8%80%B3%E5%9B%BE%E7%89%87%20%E5%94%AF%E7%BE%8E%22%2C%22%E9%87%87%E8%80%B3%E5%9B%BE%E7%89%87%E9%AB%98%E6%B8%85%22%2C%22%E9%87%87%E8%80%B3%E5%AE%A3%E4%BC%A0%E5%9B%BE%E7%89%87%22%2C%22%E9%87%87%E8%80%B3%22%2C%22%E5%96%9D%E5%80%92%E4%BA%86%E7%9A%84%E8%A1%A8%E6%83%85%E5%8C%85%22%2C%22%E8%A5%BF%E6%B8%B8%E8%AE%B0%20%E8%AF%8D%E4%BA%91%22%2C%22%E5%AD%99%E6%82%9F%E7%A9%BA%20%E8%AF%8D%E4%BA%91%22%5D; delPer=0; PSINO=7; BDRCVFR[dG2JNJb_ajR]=mk3SLVN4HKm; BDRCVFR[-pGxjrCMryR]=mk3SLVN4HKm; BDRCVFR[EJrvrN3l0S0]=pDgu-4B3j7tIZ-EIy7GQhPEUf; H_PS_PSSID=; BDRCVFR[X_XKQks0S63]=mk3SLVN4HKm; firstShowTip=1; ZD_ENTRY=baidu; cleanHistoryStatus=0; BA_HECTOR=a401010ka584240lm51g6r0320r; userFrom=www.baidu.com; '
                      'ab_sr=1.0.0_YjAxODJmMjA1MDU3YTUyZjIyMzk2MGQ4YjM3MTQ5OGNjNDI5NWFkNjkxOTA0YjkxMDBlYjY0Y2JmMDU5NzY5MDY1NDAxZDY0ZDhhYjUzZDhkNGY4ZDUwOWVhMzkwMGMxYzQ5OTA1MjE3OTViYzZmN2QxNzMyN2M2ZjYxMzBkYTE=',
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36'
        }
        self.params = {
            'tn': 'resultjson_com',
            'logid': '11625870838566749778',
            'ipn': 'rj',
            'ct': '201326592',
            'is': '',
            'fp': 'result',
            'queryWord': '',
            'cl': '2',
            'lm': '-1',
            'ie': 'utf-8',
            'oe': 'utf-8',
            'adpicid': '',
            'st': '-1',
            'z': '',
            'ic': '0',
            'hd': '',
            'latest': '',
            'copyright': '',
            'word': '',
            's': '',
            'se': '',
            'tab': '',
            'width': '',
            'height': '',
            'face': '0',
            'istype': '2',
            'qc': '',
            'nc': '1',
            'fr': '',
            'expermode': '',
            'force': '',
            'pn': '',
            'rn': '30',
            'gsm': '',
            'time': ''
        }
        self.image_list = []
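        # Ask for the search keyword and use it for both query parameters
        keyword = input('Enter the image keyword to search for: ')
        self.params['queryWord'] = keyword
        self.params['word'] = keyword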
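    # Fetch `num` result pages and collect the thumbnail URLs
    def get_image(self, num):
        for i in range(num):
            self.params['time'] = int(time.time() * 1000)  # millisecond timestamp
            self.params['pn'] = i * 30  # result offset; 30 results per page ('rn')
            response = requests.get(url=self.url, headers=self.headers, params=self.params)
            # Parse the JSON once and skip entries without a thumbnail
            # (the original loop dropped the last entry of each page instead)
            for item in response.json().get('data') or []:
                if item and item.get('thumbURL'):
                    self.image_list.append(item['thumbURL'])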
    # Save every collected image to the output folder
    def save_image(self):
        for n, url in enumerate(self.image_list, start=1):
            image = requests.get(url=url)
            print('Downloading image {}'.format(n))
            with open('./保存/{}.jpg'.format(n), 'wb') as f:
                f.write(image.content)
 
 
if __name__ == '__main__':
    c = int(input('Enter the number of pages to scrape (30 images per page): '))
    image = Image()
    image.get_image(c)
    image.save_image()
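
The paging and extraction steps in get_image can be tried without touching the network. The following is a minimal sketch, not the author's code: the helper names page_offsets and extract_thumb_urls and the sample payload are hypothetical, and the payload shape (a 'data' list whose last entry may be empty) is only inferred from how the script indexes the response.

```python
from typing import Any, Dict, List

PAGE_SIZE = 30  # same value as the 'rn' parameter in the script


def page_offsets(pages: int, page_size: int = PAGE_SIZE) -> List[int]:
    """Return the 'pn' offset the script would send for each page."""
    return [i * page_size for i in range(pages)]


def extract_thumb_urls(payload: Dict[str, Any]) -> List[str]:
    """Collect every non-empty 'thumbURL' from a parsed response dict."""
    urls = []
    for item in payload.get('data') or []:
        # Skip empty or malformed entries instead of assuming their position
        if isinstance(item, dict) and item.get('thumbURL'):
            urls.append(item['thumbURL'])
    return urls


# Hypothetical payload, shaped the way the script expects (assumption)
sample = {
    'data': [
        {'thumbURL': 'https://example.com/a.jpg'},
        {'thumbURL': 'https://example.com/b.jpg'},
        {},
    ]
}

print(page_offsets(3))
print(extract_thumb_urls(sample))
```

Filtering on the presence of 'thumbURL' rather than on list position keeps the loop from failing with a KeyError if an empty entry shows up somewhere other than the end of the list.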

