Cracking and Downloading Paid Videos from 人人讲

2020-04-15 杨赟快跑

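人人讲 is an education app with a large library of instructional videos covering music, calligraphy, clothing, yoga, and more. Some of the videos are free, but most are paid. Here we capture the app's traffic to analyze the 人人讲 API, then crack and download these videos.

Statement: this tutorial is for learning purposes only. The crawled videos belong to 人人讲, and using them commercially is strictly prohibited.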

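1. Analyzing the 人人讲 API

First, open the 人人讲 app, pick a video you are interested in, copy its link, and open it on a computer (the link below is used as the example).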

http://ke.renrenjiang.cn/#/video?activityId=1147066&su=0

Once opened, the page looks like this:

charles-ssl-proxying-certificate.png

We use the Charles packet-capture tool to see which requests are made when the page loads.

image.png

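There are two requests, shown below.

# Get the details of the video
https://api.renrenjiang.cn/api/v3/activities/1147066/show?include=creator,columns,service
# Get the details of all videos in the column that contains the video
https://api.renrenjiang.cn/api/v2/columns/20890/activities

Here we only need the second endpoint, the one that fetches the video column; its response includes the password required to watch the video.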

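Brief description:

Get the details of all videos in the column that contains the video

Request URL:

https://api.renrenjiang.cn/api/v2/columns/20890/activities
Request method:
GET

Request headers:

head = {
    "Referer": "http://ke.renrenjiang.cn/",
    "Authorization": "see below; take the value from your own packet capture"
}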

image.png
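Parameters:

Parameter	Required	Type	Description	Example
u	Yes	int	User ID	1022949
activity_sort	Yes	string	Video sort order	ASC or DESC
page	No	int	Page number, for paginated queries when the column has many videos	1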
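
As a minimal sketch of how these parameters end up in the request URL, the query string can be assembled with the standard library's `urllib.parse.urlencode`. The helper name `build_column_url` is hypothetical and not part of the original code; the base URL and the sample values are the ones listed in the table above.

```python
from urllib.parse import urlencode

# Endpoint taken from the request URL above; {0} is the column ID.
BASE = "https://api.renrenjiang.cn/api/v2/columns/{0}/activities"


def build_column_url(column_id, user_id, activity_sort="ASC", page=None):
    """Hypothetical helper: build the column-listing URL from the table's parameters."""
    params = {"u": user_id, "activity_sort": activity_sort}
    if page is not None:  # "page" is optional, so only add it when given
        params["page"] = page
    return BASE.format(column_id) + "?" + urlencode(params)


print(build_column_url(20890, 1022949, "ASC", 1))
```

Letting `urlencode` do the escaping avoids hand-formatting mistakes when a value contains characters that need to be percent-encoded.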

Sample response

{
    "activities": [{
        "id": 1147066,
        "title": "国画技法课——撞水撞粉（第四讲）",
        "status": "结束",
        "video_status": 2,
        "background": "http://image.renrenjiang.cn/uploads/activity/background/1147066/2020_af9598e950780754cdee6956684f9524.jpeg@640w",
        "password": "7939",
        "started_at": 1550883600,
        "charge": true,
        "price": 29.90,
        "reservation_count": 6,
        "reservation": null,
        "user_id": 5011557,
        "creator": {
            "user_id": 5011557,
            "uid": "29269207",
            "nickname": "麦芽老师的艺术课堂",
            "displayname": null,
            "description": "       麦芽老师有着近十年的一线教学经验，所开设课程秉着“艺术美化生活，生活滋养艺术”的课程理念。直播间主要开设课程有儿童趣味水墨画、初级国画、线描、色彩等课程，在这里有专业老师的讲解，课题解答，课后作业辅导。\n      麦芽直播课堂诚邀每一位喜欢画画的朋友一起分享，这里没有年龄界限，只有您对生活、对艺术满满的热爱和期待。老师喜欢与学员交流互动，在轻松愉悦的课堂中，\n感受传统绘画艺术的魅力。\n咨询课程，请扫文末二维码，加微信，老师会耐心解答。麦芽老师的艺术课堂诚邀您随时加入我们！",
            "avatar": "https://image.renrenjiang.cn/uploads/user/avatar_url/5011557/2019_db0d6a4906c039fdc9d9b4b5aea3c880.jpg",
            "background": "https://image.renrenjiang.cn/uploads/user/background/5011557/2019_4f69866d6d9825fc127827cdcfe28098.jpg",
            "channel_name": "无",
            "user_level": 2,
            "proposal_status": 2,
            "fans_count": 26
        },
        "column_id": 20890,
        "column": {
            "column_id": 20890,
            "title": "试听课系列（不定时更新）",
            "price": 20.00,
            "background": "https://image.renrenjiang.cn/uploads/column/background/20890/2019_117d5b509ad52b726bf58089f002dbc4.jpg@640w",
            "activities_count": 5,
            "ctype": 1,
            "max_subscription": 0,
            "subscriptions": 0,
            "activity_allow_buy": true,
            "activity_sort": "DESC"
        },
        "isinvited": false,
        "locked": true,
        "share_url": "https://h5.renrenjiang.cn/#/activity?aid=1147066&su=14134251",
        "description": "课程简介\n本节课衔接上节课程，首先，将花头部分处理完整，莲蓬可以和叶子一起处理。其次，本节课将学习撞水撞粉系列课程荷花叶子的画法，调色调墨技巧，其中将色、墨、水的用法在画面中展现出来。<img src=\"http://image.renrenjiang.cn/uploads/files/2019_0a131051936bd7227843b52a5e8707ab.jpg\"/>本节课适合人群：\n1、零基础国画爱好者；2、少儿美术培训机构教师；3、有绘画基础且能独立上课的小朋友；\n\n如需咨询课程请扫码入群\n<img src=\"http://image.renrenjiang.cn/uploads/files/2019_37d369479071f67b1bd1f2d617431c57.jpg\"/>",
        "popularity": 22,
        "replay": null,
        "reprinted_switch": null,
        "reprint_user_id": null,
        "media_type": null,
        "detail_name": null,
        "detail_nickname": null,
        "rtype": null,
        "wxtype": null,
        "group": null,
        "share_scale": 0.0000,
        "share_amount": 0.00,
        "visible": false,
        "acm_id": null,
        "position": null,
        "task": null,
        "pt_id": null
    },
  ...
  ],
    "total": null
}

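From this endpoint's response we get each video's id, title, description, and password (if there is no password, it has to be brute-forced; more on that later).

Next, we enter the password 7939 to start watching the video.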

image.png

Since the video can now be played, the front end must have obtained the video's address, so we capture the traffic with Charles again.

image.png

From entering the password to fetching the video, four endpoints are involved in total, shown below.

# Check whether the password is correct
https://api.renrenjiang.cn/api/v3/activities/1147066/reservation
# Get the video's m3u8 address
https://api.renrenjiang.cn/api/v3/activities/1147066/stream_url?user_id=14264889&timestamp=1586920041105
# Get the m3u8 file
http://video.renrenjiang.cn/record/alilive/2726981393-1550845168.m3u8
# Fetch the short video segments listed in the m3u8 file, one by one
http://video.renrenjiang.cn/record/alilive/2726981393/1550841839_1.ts
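
The last two requests follow the usual HLS pattern: the m3u8 file is a text playlist, and every line that is not blank and does not start with `#` names one media segment. A minimal sketch of turning such a playlist into absolute segment URLs is shown here. The function name `segment_urls`, the `example.com` host, and the sample playlist text are made up for illustration and are not taken from a real capture.

```python
from urllib.parse import urljoin


def segment_urls(playlist_url, playlist_text):
    """Return an absolute URL for every media segment listed in an HLS playlist."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Lines starting with "#" are tags or comments; blank lines carry nothing.
        if not line or line.startswith("#"):
            continue
        # Segment entries may be relative, so resolve them against the playlist URL.
        urls.append(urljoin(playlist_url, line))
    return urls


# Made-up playlist, only to show the shape of the data.
SAMPLE = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
1550841839_1.ts
#EXTINF:10.0,
1550841839_2.ts
#EXT-X-ENDLIST
"""

for url in segment_urls("http://video.example.com/record/a.m3u8", SAMPLE):
    print(url)
```

Resolving with `urljoin` also copes with playlists that list absolute URLs, since an absolute entry simply replaces the base.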

I won't write out the request parameters and responses for each endpoint here; let's go straight to the code.

2. Writing the code

config.py file

import platform
import requests
import time

def is_window():
    system = platform.system()
    if system == "Windows":
        return True
    else:
        return False

user_id = "fill in your own value"
authorization = "fill in your own value"

root_path = "F:\\人人讲视频" if is_window() else "/Users/yy/Documents/照片/renrenjiang"

head = {
    "Referer": "http://ke.renrenjiang.cn/",
    "Authorization": authorization
}
session = requests.session()
current_milli_time = lambda: int(round(time.time() * 1000))

util.py file

import os
import platform
import sys
from config import head


def is_window():
    system = platform.system()
    if system == "Windows":
        return True
    else:
        return False


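def download_by_key():
    # NOTE: leftover stub; none of the other modules call it. As written it calls
    # os.rmdir on ../renrenjiang and exits before the status check below is reached.
    url = "https://api.renrenjiang.cn/api/v3/activities/{0}/stream_url?user_id={1}&timestamp={2}"
    res = head
    res = res
    os.rmdir("../renrenjiang")
    exit(1)
    if "status" in res.keys() and res["status"] == 2:
        hls_url = res["hls_url"]
        return hls_url
    return None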


def show_process(curr, total):
    curr = curr / total * 100
    total = 100
    i = int(curr)
    process = '>' * (i // 2) + ' ' * ((total - i) // 2)
    if curr == total:
        ss = '\r' + process + "{0}%\n".format(i)
    else:
        ss = '\r' + process + "{0}%".format(i)
    sys.stdout.write(ss)
    sys.stdout.flush()


def show_process2(curr, total):
    i = int(curr / total * 100)
    process = '>' * (i // 2) + ' ' * ((100 - i) // 2)
    if curr == total:
        ss = '\r' + process + "[{0}/{1}]\n".format(curr, total)
    else:
        ss = '\r' + process + "[{0}/{1}]".format(curr, total)
    sys.stdout.write(ss)
    sys.stdout.flush()
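
Both helpers above build the bar as `'>' * (i // 2) + ' ' * ((100 - i) // 2)`, which is 50 characters wide for even percentages but 49 for odd ones, so the counter after the bar shifts slightly as progress advances. A minimal sketch of a fixed-width variant follows. The names `render_bar` and `show` are hypothetical and are not part of the original util.py.

```python
import sys


def render_bar(curr, total, width=50):
    """Build a fixed-width progress string: '>' marks, padding, then [curr/total]."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Integer arithmetic keeps the filled part between 0 and width.
    filled = min(width, curr * width // total)
    return ">" * filled + " " * (width - filled) + "[{0}/{1}]".format(curr, total)


def show(curr, total):
    """Redraw the bar on the current line; end the line once the work is done."""
    end = "\n" if curr >= total else ""
    sys.stdout.write("\r" + render_bar(curr, total) + end)
    sys.stdout.flush()


for done in range(0, 11):
    show(done, 10)
```

Keeping the string construction in a pure function separate from the terminal output makes the formatting easy to check without capturing stdout.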

download.py file

import json
import os
from m3u8 import m3u8
import util
from config import *


class download:
    def __init__(self, cid):
        self.cid = cid
        self.free_m3u8_url_list = []
        self.is_can_pojie = True
        self.free_videos = []
        self.vip_videos = []

    def _list_video(self):
        """
        List all course videos in a column (the column ID is self.cid)
        :return: list of videos
        """
        video_list = []
        page = 0
        url_format = "https://h5.renrenjiang.cn/api/v2/columns/{0}/activities?u=1052944&activity_sort=ASC&page={1}"
        while True:
            page += 1
            url = url_format.format(self.cid, page)
            res = session.get(url, headers=head)
            res = json.loads(res.content)
            if "activities" in res.keys() and len(res["activities"]) > 0:
                activities = res["activities"]
                for activity in activities:
                    activity_id = activity["id"]
                    title = activity["title"]
                    password = activity["password"]
                    start_at = activity["started_at"]
                    description = activity["creator"]["description"]
                    video_list.append({
                        "id": activity_id,
                        "title": title,
                        "password": password,
                        "start_at": start_at,
                        "description": description
                    })
            else:
                break
        return video_list

    def _get_ts_list(self, index, video):
        """
        Fetch the m3u8 file and parse the ts paths out of it
        :param index: position of the video in the column (used in log messages)
        :param video: video info
        :return: (m3u8 URL, ts list)
        """
        obj = m3u8(video, index, self.cid)
        hls_url = obj.get_m3u8()
        if hls_url is None:
            return None, None
        res = session.get(hls_url)
        ts_list = []
        # Read the playlist as text and keep only segment lines (skip blanks and "#" tags)
        for line in res.text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            ts_list.append(line)
        return hls_url, ts_list

    def _download_by_ts_list(self, video, ts_list, m3u8):
        """
        Download the video from the list of ts files and merge them
        :param video: video info
        :param ts_list: list of ts files
        :param m3u8: URL of the m3u8 file
        :return: file path of the video
        """
        # Create the column folder
        path = root_path + os.sep + str(self.cid)
        is_exists = os.path.exists(path)
        if not is_exists:
            os.makedirs(path)

        # Create the video folder under the column
        path = path + os.sep + str(video["id"])
        is_exists = os.path.exists(path)
        if not is_exists:
            os.makedirs(path)

        # Download the ts files according to the ts list
        url_format = m3u8[0: m3u8.rfind("/") + 1] + "{0}"
        curr = 0
        for ts in ts_list:
            curr += 1
            filename = path + os.sep + str(curr).zfill(6) + ".ts"
            is_exists = os.path.exists(filename)
            if is_exists:
                continue
            url = url_format.format(ts)
            res = requests.get(url, headers=head)
            if res.status_code != 200:
                print("Failed to download ts file: {0}".format(url))
                continue
            with open(filename, "wb") as file:
                file.write(res.content)
            util.show_process(curr, len(ts_list))

        # Merge the ts files into a single mp4 file, then delete the ts files
        # On Windows
        if util.is_window():
            exec_str = r'copy /b  "' + path + os.sep + r'*.ts" "' + path + os.sep + '{0}.mp4"'.format(video["title"])
            os.system(exec_str)  # merge the segments with the cmd copy command
            exec_str = r'del  "' + path + os.sep + r'*.ts"'
            os.system(exec_str)  # delete the original segment files
        # On Linux or macOS
        else:
            # Quote the paths so titles containing spaces do not break the command
            exec_str = 'cat "{0}"*.ts > "{1}{2}.mp4"'.format(path + os.sep, path + os.sep, video["title"])
            os.system(exec_str)  # merge the segments with cat
            exec_str = 'rm -rf "{0}"*.ts'.format(path + os.sep)
            os.system(exec_str)  # delete the original segment files
        return path + os.sep + '{0}.mp4'.format(video["title"])

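    def _is_downloaded(self, column_id, video):
        """
        Check whether the video has already been downloaded, to avoid downloading it again
        :param column_id: column ID
        :param video: video info
        :return: whether it has been downloaded
        """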
        path = root_path + os.sep + str(column_id)
        is_exists = os.path.exists(path)
        if not is_exists:
            return False
        path = path + os.sep + str(video["id"])
        is_exists = os.path.exists(path)
        if not is_exists:
            return False
        path = path + os.sep + '{0}.mp4'.format(video["title"])
        is_exists = os.path.exists(path)
        if not is_exists:
            return False
        return True

    def download(self):
        """
        Download every video in the column identified by the column ID (self.cid)
        Valid cid values lie in the range [20002, 49999]
        :return: None
        """
        if not self.before_download():
            return
        count = 0
        for video in self.free_videos:
            count += 1
            if self._is_downloaded(self.cid, video):
                print("Video {0} already downloaded: {1}, skipping".format(count, str(video["title"])))
                continue
            m3u8_url, ts_list = self._get_ts_list(count, video)
            while ts_list is None:
                m3u8_url, ts_list = self._get_ts_list(count, video)
            print("Downloading video {0}: {1}".format(count, str(video["title"])))
            self._download_by_ts_list(video, ts_list, m3u8_url)
        for video in self.vip_videos:
            count += 1
            if self._is_downloaded(self.cid, video):
                print("Video {0} already downloaded: {1}, skipping".format(count, str(video["title"])))
                continue
            if self.is_can_pojie:
                m3u8_url, ts_list = self._get_ts_list(count, video)
                if ts_list is None:
                    print("Failed to get the ts list for video {0}".format(video["title"]))
                    continue
                print("Downloading video {0}: {1}".format(count, str(video["title"])))
                self._download_by_ts_list(video, ts_list, m3u8_url)
            else:
                print("Video {0} is paid and cannot be cracked: {1}, skipping".format(count, str(video["title"])))

    def before_download(self):
        print("Checking whether the videos can be downloaded or cracked")
        # List all videos and split them into free and paid
        res = self._list_video()
        if type(res) == dict:
            print("Failed to download column {0}, reason: {1}".format(self.cid, res))
            exit(1)
        self._divide_videos(res)
        self._get_is_can_pojie()
        if self.is_can_pojie:
            print("Column {0} has {1} videos in total: {2} can be downloaded directly, {3} need to be cracked".
                  format(self.cid, len(res), len(self.free_videos), len(self.vip_videos)))
            return True
        else:
            if len(self.free_videos) == 0:
                print("Column {0} has {1} videos in total; none can be downloaded or cracked".format(self.cid, len(res)))
                return False
            else:
                print("Column {0} has {1} videos in total: {2} can be downloaded, the rest cannot be downloaded or cracked".
                      format(self.cid, len(res), len(self.free_videos)))
                yes_no = input('Download only the available videos? (y|n): ')
                if yes_no == "y" or yes_no == "Y":
                    return True
                else:
                    return False

    def _divide_videos(self, videos):
        count = 0
        for video in videos:
            count += 1
            obj = m3u8(video, count, self.cid)
            obj.pay_for_video()
            m3u8_url = obj.get_m3u8_by_pay()
            if m3u8_url is not None:
                self.free_videos.append(video)
                self.free_m3u8_url_list.append(m3u8_url)
            else:
                self.vip_videos.append(video)

    def _get_is_can_pojie(self):
        if len(self.free_m3u8_url_list) == 0:
            self.is_can_pojie = False
        for u in self.free_m3u8_url_list:
            if u.find("videocdn.renrenjiang.cn") < 0:
                self.is_can_pojie = False
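
The merge step in `_download_by_ts_list` shells out to `copy /b` on Windows and `cat` elsewhere. As a sketch of a portable alternative, the same byte-level concatenation can be done in Python itself, which removes the platform check and any shell quoting of the title. The function name `merge_segments` is hypothetical; the sketch assumes the segments carry zero-padded names such as `000001.ts`, as the download loop above produces, so that sorting by name gives playback order.

```python
import shutil
from pathlib import Path


def merge_segments(folder, output_name):
    """Concatenate every *.ts file in folder into output_name, then delete the parts."""
    folder = Path(folder)
    # Zero-padded names (000001.ts, 000002.ts, ...) sort into playback order.
    parts = sorted(folder.glob("*.ts"))
    target = folder / output_name
    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so large segments are never loaded whole.
                shutil.copyfileobj(src, out)
    # Remove the segments only after the merged file has been written and closed.
    for part in parts:
        part.unlink()
    return target
```

Like the shell commands it stands in for, this only joins the bytes of the segments; it does not remux them into a different container.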

m3u8.py file

import json
import os
import threading
import math
from time import sleep
import util
from config import *


class m3u8:
    def __init__(self, video, index, cid):
        self.index = index
        self.video = video
        self.cid = cid
        self.vid = video["id"]
        self.start_at = int(str(video["start_at"])[0: 6])
        self.min = 0
        self.max = 10000000
        self.thread_num = 400
        self.step = math.floor((self.max - self.min) / self.thread_num)
        self.threads = []
        self.success = False
        self.result = None
        self.lock = threading.Lock()
        self.try_count = 0
        self.total_count = self.max - self.min

    def _func(self, a, b):
        for pos in range(a, b):
            if self.success:
                return None
            self.try_count += 1
            stk_code = str(pos).zfill(7)
            ss = "{0}_{1}{2}".format(self.vid, self.start_at, stk_code)
            url_ff = "http://videocdn.renrenjiang.cn/Act-ss-m3u8-sd/{0}/{1}.m3u8".format(ss, ss)
            try:
                res = session.get(url_ff, headers=head)
                if res.status_code == 200:
                    self.lock.acquire()
                    self.success = True
                    self.lock.release()
                    self.write_m3u8_to_file(url_ff)
                    return url_ff
            except requests.exceptions.ReadTimeout:
                pos -= 1
            except requests.exceptions.ConnectionError:
                pos -= 1
            except ConnectionResetError:
                pos -= 1

    def get_m3u8_by_force(self):
        start = time.time()
        for i in range(self.thread_num):
            t = threading.Thread(target=self._func, args=(self.min + self.step * i, self.min + self.step * (i + 1)))
            self.threads.append(t)
            t.start()
        while True:
            sleep(1)
            util.show_process2(self.try_count, self.total_count)
            for t in self.threads:
                if not t.is_alive():
                    self.threads.remove(t)
            if len(self.threads) == 0:
                break
        end = time.time()
        print("Result: {0}, total time: {1}s".format(self.result, end - start))
        return self.result

    def pay_for_video(self):
        """
        Purchase the video
        :return: whether it succeeded
        """
        url = "https://api.renrenjiang.cn/api/v3/activities/{0}/reservation".format(self.vid)
        res = session.post(url, headers=head, data={
            "type": "password",
            "password": self.video["password"],
            "shareId": 0
        })
        res = json.loads(res.content)
        if "result" in res and res["result"] == "ok":
            return True
        else:
            return False

    def get_m3u8_by_pay(self):
        url = "https://api.renrenjiang.cn/api/v3/activities/{0}/stream_url?user_id={1}&timestamp={2}"
        url = url.format(self.vid, user_id, current_milli_time())
        res = session.get(url, headers=head)
        res = json.loads(res.content)
        if "status" in res.keys() and res["status"] == 2:
            hls_url = res["hls_url"]
            return hls_url
        return None

    def is_m3u8_exist(self):
        # Create the column folder
        path = root_path + os.sep + str(self.cid)
        is_exists = os.path.exists(path)
        if not is_exists:
            os.makedirs(path)
        # Create the video folder under the column
        path = path + os.sep + str(self.vid)
        is_exists = os.path.exists(path)
        if not is_exists:
            os.makedirs(path)

        path = path + os.sep + "m3u8.txt"
        is_exists = os.path.exists(path)
        if is_exists:
            return True
        return False

    def read_m3u8_from_file(self):
        path = root_path + os.sep + str(self.cid)
        path = path + os.sep + str(self.vid)
        path = path + os.sep + "m3u8.txt"
        with open(path, "r") as file:
            # strip() removes the trailing newline, whether it is \n or \r\n
            res = file.readline().strip()
            return res

    def write_m3u8_to_file(self, m3u8_value):
        path = root_path + os.sep + str(self.cid)
        path = path + os.sep + str(self.vid)
        path = path + os.sep + "m3u8.txt"
        with open(path, "w") as file:
            file.write(m3u8_value)
            file.close()

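    def get_m3u8(self):
        if self.is_m3u8_exist():
            print("The m3u8 for video {0} already exists, downloading directly".format(self.index))
            return self.read_m3u8_from_file()
        if self.pay_for_video():
            print("Video {0} purchased successfully, downloading directly".format(self.index))
            hls_url = self.get_m3u8_by_pay()
            # Cache the address only when one was returned, and hand it back to the caller
            if hls_url is not None:
                self.write_m3u8_to_file(hls_url)
            return hls_url
        else:
            print("Purchase of video {0} failed, brute-forcing...".format(self.index))
            return self.get_m3u8_by_force()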
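
The three file helpers in this class (`is_m3u8_exist`, `read_m3u8_from_file`, `write_m3u8_to_file`) each rebuild the same `root/cid/vid/m3u8.txt` path by string concatenation. A minimal sketch of the same one-line cache using `pathlib` is shown here. The function names are hypothetical and not part of the original m3u8.py; `root` stands in for `root_path` from config.py.

```python
from pathlib import Path


def cache_path(root, cid, vid):
    """Location of the cached address: <root>/<cid>/<vid>/m3u8.txt."""
    return Path(root) / str(cid) / str(vid) / "m3u8.txt"


def write_cached_url(root, cid, vid, url):
    """Store the address, creating the column and video folders if needed."""
    path = cache_path(root, cid, vid)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(url, encoding="utf-8")


def read_cached_url(root, cid, vid):
    """Return the cached address, or None if nothing usable is stored."""
    path = cache_path(root, cid, vid)
    if not path.exists():
        return None
    text = path.read_text(encoding="utf-8").strip()
    return text or None
```

Returning `None` for a missing or empty file lets a caller treat "not cached" and "cached but blank" the same way.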

main.py file

import download

if __name__ == '__main__':
    cid = input('Enter the ID (cid) of the 人人讲 video column: ')
    print("The column ID you entered is: {0}".format(cid))
    obj = download.download(int(cid))
    obj.download()

With this code, a column ID is all we need to download every video in that column.

3. Getting the code

The complete code is available on my GitHub:

https://github.com/15207135348/renrenjiang

Finally, I hope you will follow my official account (公众号), where I regularly share learning materials on big data, Java, and related topics.

大数据学堂
