OWT (Open WebRTC Toolkit) video mixing module code

2020-10-09, by 我是榜样

owt

Open WebRTC Toolkit Media Server
The media server for OWT provides an efficient video conference and streaming service that is based on WebRTC

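Basic concepts

In OWT, video mixing is implemented in the class VideoMixer; see owt-server/source/agent/video/videoMixer/VideoMixer.h

VideoMixer generates a mixed video according to a specified layout, the layoutSolution. It involves a few basic concepts: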

input

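An input is an incoming video stream; each input has an integer inputIndex.

Inputs are added with addInput. The FrameSource supplies the data to be decoded, which comes from WebRTC's videoReceiveStream and is decoded inside VideoMixer: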

bool VideoMixer::addInput(const int inputIndex, const std::string& codec, owt_base::FrameSource* source, const std::string& avatar)

output

An output is an outgoing stream that carries the mixed video. Each output has a string outStreamID, which is mapped internally to a unique int outputIndex.

bool VideoMixer::addOutput(
    const std::string& outStreamID
    , const std::string& codec
    , const owt_base::VideoCodecProfile profile
    , const std::string& resolution
    , const unsigned int framerateFPS
    , const unsigned int bitrateKbps
    , const unsigned int keyFrameIntervalSeconds
    , owt_base::FrameDestination* dest)

The FrameDestination feeds VideoSendStream, where the data is encoded and packetized into RTP packets.

layoutSolution

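A layoutSolution describes the layout of the mixed image, including the coordinates and size of each region. Most importantly, every region carries an inputIndex, which the mixing thread uses to look up the last frame of that stream.

Every layout change is expressed through the layoutSolution.
A VideoMixer supports only one layout, so all of its outputs share the same layout.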

void VideoMixer::updateLayoutSolution(LayoutSolution& solution);
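To make the role of inputIndex concrete, here is a minimal sketch of a layout description and the per-region lookup of the last frame. The types Region, LayoutEntry, LayoutSketch, and FrameCache and the function composeOrder are hypothetical simplifications for illustration, not OWT's actual LayoutSolution definition; frames are represented as plain strings.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified region: position and size as fractions of the canvas.
struct Region {
    double left, top, width, height;
};

// Hypothetical layout entry: one region plus the input that fills it.
struct LayoutEntry {
    int inputIndex;
    Region region;
};

using LayoutSketch = std::vector<LayoutEntry>;

// Stand-in for the per-input cache of the last decoded frame (a string here).
using FrameCache = std::map<int, std::string>;

// For each region, look up the last frame of its input. Regions whose input
// has not produced a frame yet are skipped.
std::vector<std::string> composeOrder(const LayoutSketch& layout, const FrameCache& cache)
{
    std::vector<std::string> drawn;
    for (const auto& entry : layout) {
        assert(entry.inputIndex >= 0);
        auto it = cache.find(entry.inputIndex);
        if (it != cache.end())
            drawn.push_back(it->second);
    }
    return drawn;
}
```

Because the layout only stores indices, swapping which participant appears in a region amounts to changing one inputIndex; the cached frames themselves are untouched.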

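VideoMixer internals

I. VideoMixer

VideoMixer contains a VideoFrameMixer, m_frameMixer. VideoMixer itself only exposes a simple API to the outside and does no real work; everything is delegated to VideoFrameMixer.

What VideoMixer is for:

1. Exposing a simple API to callers

2. Converting the external string outputStreamId into the integer outputIndex that VideoFrameMixer needs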
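The string-to-index conversion can be sketched as a small slot allocator. The class OutputIndexMap and its methods are hypothetical and only illustrate the idea; the actual bookkeeping in VideoMixer may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: maps string stream IDs to small integer slots.
class OutputIndexMap {
public:
    explicit OutputIndexMap(int maxOutputs)
    {
        assert(maxOutputs > 0);
        m_used.assign(static_cast<std::size_t>(maxOutputs), false);
    }

    // Returns the assigned index, or -1 if the ID already exists or no slot is free.
    int add(const std::string& outStreamID)
    {
        if (m_ids.count(outStreamID))
            return -1;
        for (std::size_t i = 0; i < m_used.size(); ++i) {
            if (!m_used[i]) {
                m_used[i] = true;
                m_ids[outStreamID] = static_cast<int>(i);
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Frees the slot held by the ID; returns false if the ID is unknown.
    bool remove(const std::string& outStreamID)
    {
        auto it = m_ids.find(outStreamID);
        if (it == m_ids.end())
            return false;
        m_used[static_cast<std::size_t>(it->second)] = false;
        m_ids.erase(it);
        return true;
    }

private:
    std::vector<bool> m_used;          // which integer slots are taken
    std::map<std::string, int> m_ids;  // stream ID -> slot index
};
```

Reusing freed slots keeps the indices small and dense, which suits a lower layer that addresses outputs by integer.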

II. VideoFrameMixer

VideoFrameMixer decodes the input data. The decoder is created in VideoFrameMixerImpl::addInput, which also wires the stages together into the pipeline inputsource -> decoder -> compositorIn -> m_compositor:

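bool VideoFrameMixerImpl::addInput(int input, owt_base::FrameFormat format, owt_base::FrameSource* source, const std::string& avatar) {
    // Excerpt: creation of `decoder`, error handling, and the return value are omitted.
    boost::shared_ptr<CompositeIn> compositorIn(new CompositeIn(input, avatar, m_compositor));
    // source -> decoder
    source->addVideoDestination(decoder.get());
    // decoder -> compositorIn (which forwards to m_compositor)
    decoder->addVideoDestination(compositorIn.get());
}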

What VideoFrameMixer is for:

1. Decoding the input data

2. Wiring the processing pipeline together

3. Supplying input images to the internal SoftVideoCompositor, m_compositor
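The source/destination chaining can be illustrated with a self-contained sketch. All types here (Destination, Source, FakeDecoder, FakeCompositeIn, FakeCompositor, FakeReceiveStream) are hypothetical stand-ins for the OWT classes, and frames are plain strings; only the wiring pattern is the point.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for owt_base::FrameDestination.
struct Destination {
    virtual ~Destination() = default;
    virtual void onFrame(const std::string& frame) = 0;
};

// Hypothetical stand-in for owt_base::FrameSource.
class Source {
public:
    void addVideoDestination(Destination* dest)
    {
        assert(dest != nullptr);
        m_dests.push_back(dest);
    }

protected:
    // Forward a frame to every registered destination.
    void deliver(const std::string& frame)
    {
        for (Destination* d : m_dests)
            d->onFrame(frame);
    }

private:
    std::vector<Destination*> m_dests;
};

// A decoder is both a destination (encoded in) and a source (decoded out).
class FakeDecoder : public Destination, public Source {
public:
    void onFrame(const std::string& encoded) override { deliver("decoded(" + encoded + ")"); }
};

// Final stage: records what it was given.
class FakeCompositor {
public:
    void pushInput(int input, const std::string& frame)
    {
        received.push_back(std::to_string(input) + ":" + frame);
    }
    std::vector<std::string> received;
};

// Adapter that tags frames with the input index before handing them on.
class FakeCompositeIn : public Destination {
public:
    FakeCompositeIn(int input, FakeCompositor& compositor)
        : m_input(input), m_compositor(compositor) {}
    void onFrame(const std::string& frame) override { m_compositor.pushInput(m_input, frame); }

private:
    int m_input;
    FakeCompositor& m_compositor;
};

// Plays the role of the receive stream that feeds encoded data.
class FakeReceiveStream : public Source {
public:
    void emit(const std::string& packet) { deliver(packet); }
};
```

Once the two addVideoDestination calls are made, a frame emitted by the source reaches the compositor without the source knowing anything about decoding or composition.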

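III. SoftVideoCompositor

SoftVideoCompositor mixes outputs that share the same layout but have different frame rates.

To handle the different frame rates, it uses two internal SoftFrameGenerator instances, each producing mixed frames for its own set of frame rates.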

SoftVideoCompositor::SoftVideoCompositor(uint32_t maxInput, VideoSize rootSize, YUVColor bgColor, bool crop)
    : m_maxInput(maxInput)
{
    m_inputs.resize(m_maxInput);
    for (auto& input : m_inputs) {
        input.reset(new SoftInput());
    }
    m_avatarManager.reset(new AvatarManager(maxInput));
    m_generators.resize(2);
    m_generators[0].reset(new SoftFrameGenerator(this, rootSize, bgColor, crop, 60, 15));
    m_generators[1].reset(new SoftFrameGenerator(this, rootSize, bgColor, crop, 48, 6));
}

As the constructor shows, SoftVideoCompositor holds the inputs. Input data is pushed in by compositorIn from the upstream pipeline through pushInput and is cached:

void SoftVideoCompositor::pushInput(int input, const Frame& frame)
{
    assert(frame.format == owt_base::FRAME_FORMAT_I420);
    webrtc::VideoFrame* i420Frame = reinterpret_cast<webrtc::VideoFrame*>(frame.payload);
    m_inputs[input]->pushInput(i420Frame);
}
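A minimal sketch of the caching idea follows, assuming each input only needs to remember its most recent frame. LatestFrameSlot is a hypothetical class, not OWT's SoftInput, which may do more work; frames are plain strings here.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical per-input cache that keeps only the most recent frame.
class LatestFrameSlot {
public:
    // Called from the decoding side: replaces any previously cached frame.
    void push(std::shared_ptr<const std::string> frame)
    {
        assert(frame != nullptr);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_frame = std::move(frame);
    }

    // Called from the mixing side: returns the cached frame, or null if none yet.
    std::shared_ptr<const std::string> latest() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_frame;
    }

private:
    mutable std::mutex m_mutex;
    std::shared_ptr<const std::string> m_frame;
};
```

Holding the frame through a shared pointer lets the mixing side keep using a frame it has already fetched even if a newer one replaces it in the slot.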

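The two SoftFrameGenerator instances each handle mixing for a family of related frame rates, for example:

15, 30, and 60 FPS are mixed in m_generators[0]

6, 12, 24, and 48 FPS are mixed in m_generators[1]

What SoftVideoCompositor is for:

1. Caching the input images shared by all outputs

2. Routing each frame rate to the SoftFrameGenerator that handles its family, which reduces the number of mixing passes and therefore the CPU cost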
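The routing of a requested frame rate to a generator can be sketched as follows. The FpsFamily struct and the functions isSupported and pickGenerator are hypothetical names; the sketch assumes a family consists of minFps doubled repeatedly up to maxFps, which matches the 15/30/60 and 6/12/24/48 examples above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one generator's frame-rate family.
struct FpsFamily {
    uint32_t maxFps;
    uint32_t minFps;
};

// True if fps equals minFps doubled zero or more times without exceeding maxFps.
bool isSupported(const FpsFamily& family, uint32_t fps)
{
    assert(family.minFps > 0);
    for (uint32_t n = family.minFps; n <= family.maxFps; n *= 2) {
        if (n == fps)
            return true;
    }
    return false;
}

// Returns the index of the first family that supports fps, or -1 if none does.
int pickGenerator(const std::vector<FpsFamily>& families, uint32_t fps)
{
    for (std::size_t i = 0; i < families.size(); ++i) {
        if (isSupported(families[i], fps))
            return static_cast<int>(i);
    }
    return -1;
}
```

With families {60, 15} and {48, 6}, a request for 30 FPS lands in the first generator and 24 FPS in the second, while a rate such as 25 FPS matches neither.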

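IV. SoftFrameGenerator

SoftFrameGenerator is the class that actually does the mixing. It handles a batch of outputs whose frame rates are related by factors of two.

Mixing is triggered by a timer thread and ends up in SoftFrameGenerator::onTimeout(), which performs the actual mixing work.

Inside this function, the m_counter variable ensures that outputs with different frame rates share a single mixing pass per tick.

onTimeout fires at the maximum frame rate. The core code below decides whether a frame needs to be mixed on this tick; the same m_counter test determines whether callbacks fire after mixing and which frame rates' outputs receive them: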

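    bool hasValidOutput = false;
    {
        boost::unique_lock<boost::shared_mutex> lock(m_outputMutex);
        for (uint32_t i = 0; i < m_outputs.size(); i++) {
            // Slot i is skipped on ticks where m_counter is not a multiple of (i + 1).
            if (m_counter % (i + 1))
                continue;
            // At least one output is due on this tick, so a frame must be mixed.
            if (m_outputs[i].size() > 0) {
                hasValidOutput = true;
                break;
            }
        }
    }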

This logic is clever but a little hard to follow; the figure below helps illustrate it:

[Figure: image.png]
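The counter logic can also be traced with a small simulation. CounterSketch and TickResult are hypothetical; the sketch assumes that slot i holds the outputs running at maxFps / (i + 1) and that the counter wraps at maxFps / minFps, neither of which is shown in the excerpt above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct TickResult {
    bool mixed;                      // was a composite frame needed on this tick?
    std::vector<uint32_t> delivered; // slots whose outputs would receive the frame
};

// Hypothetical model of the per-tick decision; it only counts outputs per slot.
class CounterSketch {
public:
    CounterSketch(uint32_t maxFps, uint32_t minFps)
        : m_maxFps(maxFps), m_counterMax(maxFps / minFps), m_counter(0)
    {
        assert(minFps > 0 && maxFps >= minFps);
        m_outputCount.assign(m_counterMax, 0);
    }

    // Assumption: an output at `fps` is stored in slot maxFps / fps - 1.
    void addOutput(uint32_t fps)
    {
        assert(fps > 0 && m_maxFps % fps == 0 && m_maxFps / fps <= m_counterMax);
        m_outputCount[m_maxFps / fps - 1]++;
    }

    // One timer tick at the maximum frame rate.
    TickResult onTimeout()
    {
        TickResult result{false, {}};
        for (uint32_t i = 0; i < m_outputCount.size(); i++) {
            if (m_counter % (i + 1))
                continue; // slot i is not due on this tick
            if (m_outputCount[i] > 0) {
                result.mixed = true;
                result.delivered.push_back(i);
            }
        }
        m_counter = (m_counter + 1) % m_counterMax; // assumed wrap-around
        return result;
    }

private:
    uint32_t m_maxFps;
    uint32_t m_counterMax;
    uint32_t m_counter;
    std::vector<uint32_t> m_outputCount;
};
```

For a 60/15 generator with one 30 FPS output and one 15 FPS output, the four ticks of a cycle behave as follows under these assumptions: on tick 0 a frame is mixed once and goes to both outputs, on tick 2 a frame is mixed for the 30 FPS output only, and ticks 1 and 3 do no mixing at all.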