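Scraping Maoyan Movies, 房王, the Guba forum, Baidu Translate, Youdao Translate, Amap Weather, ChinaAMC funds, Shanbay vocabulary, and Qiushibaike with Python (Guba forum)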

xiaoxiao 2025-07-13 43

'''
Page through Guba data: http://guba.eastmoney.com/
Fetch 10 pages of results and save them to a specified folder.
'''
'''
Board to scrape: 国产芯片 (domestic chips)
Approach: find the URL pattern
Page 1: http://so.eastmoney.com/web/s?keyword=国产芯片
Page 2: http://so.eastmoney.com/web/s?keyword=国产芯片&pageindex=2
Page 3: http://so.eastmoney.com/web/s?keyword=国产芯片&pageindex=3
'''
import os

import requests


def guba(pageindex):
    # Note: despite its name, `pageindex` here is the output folder name.
    base_url = 'http://so.eastmoney.com/web/s?'
    # Full URL example: http://so.eastmoney.com/web/s?keyword=国产芯片&pageindex=4
    params = {
        'keyword': '国产芯片',
    }
    path = './guba/' + pageindex + '/'
    # Create the output folder if it does not exist yet.
    os.makedirs(path, exist_ok=True)
    for page in range(1, 11):
        print(f'---------- Downloading page {page} ----------')
        params['pageindex'] = str(page)
        file_path = path + str(page) + '.html'
        # Request each page once and reuse the response for both the URL and the body
        # (the original issued two identical requests per page).
        response = requests.get(base_url, params=params)
        print(response.url)
        with open(file_path, 'w', encoding='utf-8') as f:
            f.write(response.text)
        print('Download complete')


if __name__ == '__main__':
    pageindex = input('Enter the folder name: ')
    guba(pageindex)
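The URL pattern noted in the script's docstring can be checked without sending any requests. The following is a minimal sketch using only the standard library; `build_page_url` and `BASE_URL` are hypothetical names introduced here for illustration, and the sketch assumes the search endpoint still follows the `keyword` / `pageindex` pattern listed above.

```python
from urllib.parse import urlencode

# Search endpoint taken from the URL pattern in the script above.
BASE_URL = 'http://so.eastmoney.com/web/s'


def build_page_url(keyword, page):
    """Return the search URL for a 1-based page number (hypothetical helper)."""
    params = {'keyword': keyword}
    # In the observed pattern, page 1 carries no pageindex parameter.
    if page > 1:
        params['pageindex'] = page
    # urlencode percent-encodes non-ASCII keywords as UTF-8.
    return BASE_URL + '?' + urlencode(params)


if __name__ == '__main__':
    # Preview the first three page URLs before downloading anything.
    for page in range(1, 4):
        print(build_page_url('国产芯片', page))
```

Printing the URLs first makes it easy to compare them against what the browser shows when paging through the search results by hand. The script above always sends `pageindex`, including for page 1, which differs slightly from the pattern observed in the browser.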
