Downloading Bilibili videos with Python

It turns out I had never downloaded a video with Python before, so this time I decided to practice on Bilibili (B站).

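Inspecting the page elements with the browser's developer tools did not reveal the real video link, so I fell back to capturing the traffic with Charles. Downloading Charles and installing its certificate are not covered here.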

Setting up Charles

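1. Configure Proxy Settings.
2. Configure SSL Proxy Settings so that HTTPS links are captured; otherwise the captured content shows up garbled.
3. Enable macOS Proxy so that traffic from the desktop (PC) browser is captured.
4. Refresh the video page. A lot of links should show up; look through them carefully and you can find the video link.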

Downloading the video

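Once I had the video link I assumed I could simply request it, and that tripped me up: a few of the link's request parameters keep changing, and a global search did not turn up the code that generates them. The link does appear in the page source, though, and it is a plain GET request, so, well, that is still something to be happy about 😅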
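
To make the "find the link in the page source" step concrete, here is a minimal sketch that pulls the embedded JSON out of a page with a regular expression. The HTML string is a made-up stand-in (the example.com URLs and the sizes are placeholders, not real B站 data), and the `durl` layout is simply what the download code in this post reads; the real page may look different or change over time.

```python
import json
import re

# Hypothetical page source: a tiny stand-in for what the video page returns.
SAMPLE_HTML = (
    '<html><head><script>window.__playinfo__='
    '{"durl": [{"url": "https://example.com/seg0.flv", "size": 1048576},'
    ' {"url": "https://example.com/seg1.flv", "size": 2097152}]}'
    '</script></head></html>'
)

# Capture everything between "window.__playinfo__=" and the closing script tag.
PLAYINFO_RE = re.compile(r'<script>window\.__playinfo__=(\{.*?\})</script>', re.S)


def extract_segments(html):
    """Return the list of segment dicts embedded in the page source."""
    match = PLAYINFO_RE.search(html)
    if match is None:
        raise ValueError('window.__playinfo__ not found in page source')
    playinfo = json.loads(match.group(1))
    return playinfo['durl']


segments = extract_segments(SAMPLE_HTML)
total_bytes = sum(seg['size'] for seg in segments)
print(len(segments), 'segments,', total_bytes, 'bytes in total')
```

Raising an error when the pattern is missing is a small design choice on my part: it fails with a readable message instead of the IndexError that indexing `[0]` into an empty `re.findall` result would give.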

import json
import os
import re
import time

import requests


def download_video():
    headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
    }
    video_url = 'https://www.bilibili.com/video/av28518492'
    # verify=False disables TLS certificate verification
    res = requests.get(video_url, headers=headers, verify=False)
    # The segment list is embedded in the page source as window.__playinfo__
    origin_txt = re.findall(r'<script>window\.__playinfo__=(\{.*?\})</script>', res.text, re.S)[0]
    # json.loads no longer accepts an encoding argument (removed in Python 3.9)
    origin_json = json.loads(origin_txt)
    urls = origin_json['durl']

    size = 0
    chunk = 1024
    content_size = sum(i['size'] for i in urls)
    print('file size is %0.2f MB' % (content_size / chunk / 1024))

    # Create a temporary folder for the video segments
    os.makedirs('temp_video', exist_ok=True)

    # Add the Origin and Referer headers used for the segment requests
    headers.update({
        'Origin': 'https://www.bilibili.com',
        'Referer': video_url,
    })

    start = time.time()
    # Download the segments one by one
    for i, data in enumerate(urls):
        url = data['url']
        try:
            response = requests.get(url, headers=headers, verify=False, stream=True)
            video_path = 'temp_video/' + '{}.mp4'.format(i)
            # Stream the segment to disk chunk by chunk
            with open(video_path, 'wb') as file:
                for item in response.iter_content(chunk):
                    file.write(item)
                    size += len(item)
                    # end='' keeps the progress bar on a single line
                    print('\r' + '[progress]: %s %0.2f%%' % ('>' * int(size * 50 / content_size), size / content_size * 100), end='')
        except Exception as e:
            print(e)
    stop = time.time()
    print('\n' + 'Download finished in %.2f seconds' % (stop - start))
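
The chunked write and the progress bar are easy to try without touching the network. The sketch below is only an illustration under my own assumptions: `progress_line` and `save_stream` are hypothetical helper names, and an in-memory buffer stands in for `response.iter_content(1024)`.

```python
import io
import os
import tempfile


def progress_line(done, total, width=50):
    """Build a one-line progress bar such as '[progress] >> 25.00%'."""
    ratio = done / total if total else 0.0
    return '[progress] %s %0.2f%%' % ('>' * int(ratio * width), ratio * 100)


def save_stream(chunks, path, total):
    """Write an iterable of byte chunks to path and return the bytes written."""
    written = 0
    with open(path, 'wb') as f:
        for chunk in chunks:
            f.write(chunk)
            written += len(chunk)
            # '\r' moves back to the start of the line so the bar redraws in place
            print('\r' + progress_line(written, total), end='')
    print()
    return written


# Stand-in for response.iter_content(1024): 4 KiB of zero bytes, read 1 KiB at a time.
fake_body = io.BytesIO(b'\0' * 4096)
chunks = iter(lambda: fake_body.read(1024), b'')

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, '0.mp4')
    written = save_stream(chunks, target, total=4096)
    size_on_disk = os.path.getsize(target)
```

Keeping the bar formatting in its own function means it can be checked separately from the file writing, and the same writer works for any iterable of bytes.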

Merging the video

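The downloaded video actually comes in segments, which is inconvenient to watch, so the segments still need to be merged. I use the ffmpeg command below; the merged video is saved as output.mp4 in the current directory.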

ffmpeg -f concat -safe 0 -i file.txt -c copy output.mp4

Here file.txt lists the paths of all the video segments, in the following format:

file 'video/v_1.mp4'
file 'video/v_2.mp4'
file 'video/v_3.mp4'
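
The order of the lines in file.txt is the order in which ffmpeg joins the segments, so the list should be written in playback order. Below is a minimal sketch of generating it; `segment_index` and `write_concat_list` are hypothetical helpers, and I am assuming the segment number is the first integer in each file name (as in v_1.mp4). A plain alphabetical sort would put v_10.mp4 before v_2.mp4, which is why the sort key is numeric.

```python
import os
import re
import tempfile


def segment_index(name):
    """Sort key: the first integer in the file name, e.g. 'v_10.mp4' -> 10."""
    match = re.search(r'\d+', name)
    return int(match.group()) if match else -1


def write_concat_list(folder, list_path, exts=('.flv', '.mkv', '.mp4')):
    """Write an ffmpeg concat list for the videos in folder, in numeric order."""
    names = [n for n in os.listdir(folder) if os.path.splitext(n)[1] in exts]
    names.sort(key=segment_index)
    with open(list_path, 'w', encoding='utf-8') as f:
        for name in names:
            f.write("file '{}'\n".format(os.path.join(folder, name)))
    return names


# Demo with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for n in (10, 2, 1):
        open(os.path.join(tmp, 'v_{}.mp4'.format(n)), 'wb').close()
    open(os.path.join(tmp, 'notes.txt'), 'w').close()  # ignored: not a video
    list_path = os.path.join(tmp, 'file.txt')
    ordered = write_concat_list(tmp, list_path)
    with open(list_path, encoding='utf-8') as f:
        lines = f.read().splitlines()
```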

The merge runs the ffmpeg command through the subprocess module. The full code is as follows:

import os
import shutil
import subprocess


def concatenate(path, dest='video'):
    """
    Merge the videos under the given path, then delete the original segments.
    :param path: name of the folder holding the segments, usually the video's name
    :param dest: folder where the merged video is stored, defaults to 'video'
    :return:
    """
    # Open with 'w' so entries left over from an earlier run are not kept
    with open('file.txt', 'w', encoding='utf-8') as f:
        for root, dirs, files in os.walk(path):
            for file in files:
                # Record every video found under the given path in the list file
                if os.path.splitext(file)[1] in ['.flv', '.mkv', '.mp4']:
                    video_path = os.path.join(root, file)
                    f.write("file '{}'\n".format(video_path))

    # Merge the videos
    os.makedirs(dest, exist_ok=True)
    video_save_path = os.path.join(dest, path)
    try:
        print('Merging videos...')
        print(path)
        ffmpeg_command = ["ffmpeg", "-f", "concat", "-safe", "0", "-i", "file.txt", "-c", "copy", video_save_path + ".mp4"]
        # check=True raises CalledProcessError if ffmpeg fails,
        # so the segments are only deleted after a successful merge
        subprocess.run(ffmpeg_command, check=True)
        os.remove('file.txt')
        shutil.rmtree(path)
        print('Videos merged!')
    except Exception as e:
        print('Failed to merge videos')
        print(e)
