Metal Study Notes (6) -- Rendering Video with Metal

2020-09-01 iOSer_jia

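Besides rendering frames captured from the camera, Metal can also render video files. The difference is that a video file is encoded and stores its pixels in the YUV color space, so in addition to decoding it we need a matrix that converts YUV to RGB.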

Basic idea

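Use AVFoundation's AVAssetReader to decode the video file into CMSampleBufferRef objects, use CoreVideo to turn each buffer into MTLTexture objects (still YUV), and finally pass the textures together with a YUV-to-RGB matrix to Metal to render the frame.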

[Figure: mind map of the steps above (思维导图.png)]

Video decoding

According to Apple's documentation, AVAssetReader is a utility class for reading media data from an asset:

AVAssetReader lets you:

Read raw un-decoded media samples directly from storage, obtain samples decoded into renderable forms.

Mix multiple audio tracks of the asset and compose multiple video tracks by using AVAssetReaderAudioMixOutput and AVAssetReaderVideoCompositionOutput.

The AVAssetReader pipelines are multithreaded internally. After you initiate reading with initWithAsset:error:, a reader loads and processes a reasonable amount of sample data ahead of use so that retrieval operations such as copyNextSampleBuffer (AVAssetReaderOutput) can have very low latency. AVAssetReader is not intended for use with real-time sources, and its performance is not guaranteed for real-time operations.

This article does not cover audio, so only the video track is read.

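#import <AVFoundation/AVFoundation.h>

@implementation LJAssetReader {
    AVAssetReaderTrackOutput *readerVideoTrackOutput; // delivers decoded video frames
    AVAssetReader *assetReader;                       // drives decoding of the asset
    NSURL *videoUrl;                                  // location of the video file
    NSLock *lock;                                     // guards the reader and its output
}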

- (instancetype)initWithUrl:(NSURL *)url {
    if (self = [super init]) {
        videoUrl = url;
        lock = [[NSLock alloc] init];
        [self setupAsset];
    }
    return self;
}

- (void)setupAsset {
    // Ask for precise duration and timing information.
    NSDictionary *inputOption = @{AVURLAssetPreferPreciseDurationAndTimingKey: @(YES)};
    
    AVURLAsset *inputAsset = [[AVURLAsset alloc] initWithURL:videoUrl options:inputOption];
    __weak typeof(self) weakSelf = self;
    
    NSString *tracks = @"tracks";
    
    // Load the "tracks" key asynchronously so the caller is not blocked.
    [inputAsset loadValuesAsynchronouslyForKeys:@[tracks] completionHandler:^{
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
            NSError *error = nil;
            AVKeyValueStatus trackStatus = [inputAsset statusOfValueForKey:tracks error:&error];
            if (trackStatus != AVKeyValueStatusLoaded) {
                NSLog(@"error:%@", error);
                return;
            }
            
            // Messaging nil is a no-op, so this is safe if self has been deallocated.
            [weakSelf processWithAsset:inputAsset];
        });
    }];
}

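- (void)processWithAsset:(AVAsset *)asset {
    [lock lock];
    NSLog(@"processWithAsset");
    
    NSError *error = nil;
    
    assetReader = [AVAssetReader assetReaderWithAsset:asset error:&error];
    AVAssetTrack *videoTrack = [[asset tracksWithMediaType:AVMediaTypeVideo] firstObject];
    
    if (!assetReader || !videoTrack) {
        // Without a reader or a video track there is nothing to decode.
        NSLog(@"cannot create reader: %@", error);
        assetReader = nil;
        [lock unlock];
        return;
    }
    
    // Decode to bi-planar 4:2:0 YCbCr, full range.
    NSMutableDictionary *outputSettings = [NSMutableDictionary dictionary];
    [outputSettings setObject:@(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange) forKey:(id)kCVPixelBufferPixelFormatTypeKey];
    
    readerVideoTrackOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:videoTrack outputSettings:outputSettings];
    
    // The buffers are only read, never modified, so skip the extra copy.
    readerVideoTrackOutput.alwaysCopiesSampleData = NO;
    
    [assetReader addOutput:readerVideoTrackOutput];
    
    if ([assetReader startReading] == NO) {
        NSLog(@"error reading: %@", assetReader.error);
    }
    
    [lock unlock];
}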

- (CMSampleBufferRef)readBuffer {
    [lock lock];
    
    // The caller owns the returned buffer and must CFRelease it.
    CMSampleBufferRef sampleBuffer = NULL;
    
    if (readerVideoTrackOutput) {
        sampleBuffer = [readerVideoTrackOutput copyNextSampleBuffer];
    }
    
    // Once the whole file has been read, rebuild the reader so playback loops.
    if (assetReader && assetReader.status == AVAssetReaderStatusCompleted) {
        NSLog(@"reader completed, restarting");
        
        readerVideoTrackOutput = nil;
        assetReader = nil;
        
        [self setupAsset];
    }
    
    [lock unlock];
    
    return sampleBuffer;
}

@end

The steps for using AVAssetReader are: create the AVAssetReader with an AVURLAsset as its input source, attach an AVAssetReaderTrackOutput as its output, and call copyNextSampleBuffer on that output to obtain each CMSampleBufferRef.

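Note that the output settings of AVAssetReaderTrackOutput set the pixel format to kCVPixelFormatType_420YpCbCr8BiPlanarFullRange. This means the output uses 4:2:0 YUV stored in two planes (one plane for the Y channel and one for the interleaved UV channels), with the wider full-range encoding (luma spans 0-255 instead of the 16-235 of video range). This setting is critical because it determines how the Metal shader has to read and convert the texels.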
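To make the two-plane layout concrete, here is a minimal sketch in plain C (which also compiles as Objective-C) that computes the size of each plane for a given frame size. The names PlaneLayout and plane_layout_420 are hypothetical helpers, not part of CoreVideo. The sketch ignores any row padding a real CVPixelBuffer may add, and rounding the chroma size up for odd dimensions is an assumption.

```c
#include <stddef.h>

/* Sizes of the two planes of a bi-planar 4:2:0 frame (hypothetical helper type). */
typedef struct {
    size_t lumaWidth, lumaHeight;     /* Y plane: one byte per pixel */
    size_t chromaWidth, chromaHeight; /* CbCr plane: one (Cb, Cr) byte pair per 2x2 pixels */
    size_t lumaBytes, chromaBytes;    /* tightly packed sizes, row padding ignored */
} PlaneLayout;

static PlaneLayout plane_layout_420(size_t width, size_t height) {
    PlaneLayout p;
    p.lumaWidth = width;
    p.lumaHeight = height;
    /* Chroma is subsampled by 2 in both directions; round up for odd sizes (assumption). */
    p.chromaWidth = (width + 1) / 2;
    p.chromaHeight = (height + 1) / 2;
    p.lumaBytes = p.lumaWidth * p.lumaHeight;
    p.chromaBytes = p.chromaWidth * p.chromaHeight * 2; /* two bytes per chroma sample */
    return p;
}
```

For a 1920x1080 frame this gives a 1920x1080 Y plane of 2,073,600 bytes and a 960x540 CbCr plane of 1,036,800 bytes, which is 1.5 bytes per pixel in total.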

Metal setup

The Metal setup has been covered before, so here is the code without further explanation.

- (void)setupMetal {
    _mtkView = [[MTKView alloc] initWithFrame:self.view.bounds device:MTLCreateSystemDefaultDevice()];
    
    if (!_mtkView.device) {
        // MTLCreateSystemDefaultDevice() returns nil when Metal is unavailable.
        NSLog(@"no Metal device available");
        return;
    }
    
    [self.view addSubview:_mtkView];
    
    _mtkView.delegate = self;
    
    self.viewportSize = (vector_uint2){self.mtkView.drawableSize.width, self.mtkView.drawableSize.height};
}

- (void)setupPipeline {
    id<MTLLibrary> defaultLibrary = [self.mtkView.device newDefaultLibrary];
    
    id<MTLFunction> vertexFunction = [defaultLibrary newFunctionWithName:@"vertexShader"];
    id<MTLFunction> fragmentFunction = [defaultLibrary newFunctionWithName:@"fragmentShader"];
    
    MTLRenderPipelineDescriptor *pipelineDesc = [[MTLRenderPipelineDescriptor alloc] init];
    pipelineDesc.label = @"my pipeline desc";
    pipelineDesc.vertexFunction = vertexFunction;
    pipelineDesc.fragmentFunction = fragmentFunction;
    pipelineDesc.colorAttachments[0].pixelFormat = self.mtkView.colorPixelFormat;
    
    NSError *error = nil;
    _pipeline = [self.mtkView.device newRenderPipelineStateWithDescriptor:pipelineDesc error:&error];
    
    // Check the returned object; it is nil when creation fails.
    if (!_pipeline) {
        NSLog(@"pipeline create error: %@", error.localizedDescription);
        return;
    }
    
    _commandQueue = [self.mtkView.device newCommandQueue];
}

The Metal fragment function takes two textures (the Y-plane texture and the UV-plane texture) and a conversion matrix. The code is as follows:

#include <metal_stdlib>
#import "LJShaderTypes.h"
using namespace metal;

typedef struct
{
    float4 clipSpacePosition [[position]];
    float2 textureCoord;
} RasterizerData;

// Pass the position and texture coordinate of each vertex through unchanged.
vertex RasterizerData
vertexShader(uint vertexID [[vertex_id]],
             constant LJVertex *vertexArray [[buffer(LJVertexInputIndexVertices)]])
{
    RasterizerData out;
    out.clipSpacePosition = vertexArray[vertexID].position;
    out.textureCoord = vertexArray[vertexID].textureCoord;
    return out;
}

fragment float4 fragmentShader(RasterizerData input [[stage_in]],
                               texture2d<float> textureY [[texture(LJFragmentTextureIndexTextureY)]],
                               texture2d<float> textureUV [[texture(LJFragmentTextureIndexTextureUV)]],
                               constant LJConvertMatrix *convertMatrix [[buffer(LJFragmentBufferIndexMatrix)]])
{
    
    constexpr sampler textureSampler(mag_filter::linear, min_filter::linear);
    // Luma comes from the Y texture, chroma from the UV texture; both use the same normalized coordinate.
    float3 yuv = float3(textureY.sample(textureSampler, input.textureCoord).r, textureUV.sample(textureSampler, input.textureCoord).rg);
    // Apply the offset first, then the 3x3 conversion matrix.
    float3 rgb = convertMatrix->matrix * (yuv + convertMatrix->offset);
    
    return float4(rgb, 1.0);
}

Here is the header shared between the Metal shaders and the app:

#ifndef LJShaderTypes_h
#define LJShaderTypes_h

#include <simd/simd.h>

typedef struct {
    vector_float4 position;
    vector_float2 textureCoord;
}LJVertex;

typedef struct {
    matrix_float3x3  matrix;
    vector_float3 offset;
}LJConvertMatrix;

typedef enum {
    LJVertexInputIndexVertices = 0,
}LJVertexInputIndex;

typedef enum {
    LJFragmentBufferIndexMatrix = 0,
}LJFragmentBufferIndex;

typedef enum {
    LJFragmentTextureIndexTextureY = 0,
    LJFragmentTextureIndexTextureUV = 1,
}LJFragmentTextureIndex;

#endif /* LJShaderTypes_h */

Preparing the vertices and the conversion matrix

- (void)setupVertices {
    // Two triangles covering the whole view: { clip-space position }, { texture coordinate }.
    static const LJVertex quadVertices[] = {
        { {  1.0, -1.0, 0.0, 1.0 },  { 1.f, 1.f } },
        { { -1.0, -1.0, 0.0, 1.0 },  { 0.f, 1.f } },
        { { -1.0,  1.0, 0.0, 1.0 },  { 0.f, 0.f } },
        
        { {  1.0, -1.0, 0.0, 1.0 },  { 1.f, 1.f } },
        { { -1.0,  1.0, 0.0, 1.0 },  { 0.f, 0.f } },
        { {  1.0,  1.0, 0.0, 1.0 },  { 1.f, 0.f } },
    };
    
    _vertices = [self.mtkView.device newBufferWithBytes:quadVertices length:sizeof(quadVertices) options:MTLResourceStorageModeShared];
    
    _numVertices = sizeof(quadVertices) / sizeof(LJVertex);
}

- (void)setupMatrix {
    // 1. Conversion matrices (column-major). Only the full-range one is used below;
    //    the other two are kept for reference.
    // BT.601 video range, which is the standard for SDTV.
    matrix_float3x3 kColorConversion601DefaultMatrix = (matrix_float3x3){
        (simd_float3){1.164,  1.164, 1.164},
        (simd_float3){0.0, -0.392, 2.017},
        (simd_float3){1.596, -0.813,   0.0},
    };
    
    // BT.601 full range
    matrix_float3x3 kColorConversion601FullRangeMatrix = (matrix_float3x3){
        (simd_float3){1.0,    1.0,    1.0},
        (simd_float3){0.0,    -0.343, 1.765},
        (simd_float3){1.4,    -0.711, 0.0},
    };
    
    // BT.709 video range, which is the standard for HDTV.
    matrix_float3x3 kColorConversion709DefaultMatrix = (matrix_float3x3){
        (simd_float3){1.164,  1.164, 1.164},
        (simd_float3){0.0, -0.213, 2.112},
        (simd_float3){1.793, -0.533,   0.0},
    };
    
    // 2. Offset. Full-range luma already spans 0-255, so it needs no offset;
    //    the -(16.0/255.0) luma offset belongs with the video-range matrices.
    vector_float3 kColorConversion601FullRangeOffset = (vector_float3){ 0.0, -0.5, -0.5 };
    
    LJConvertMatrix matrix;
    
    matrix.matrix = kColorConversion601FullRangeMatrix;
    
    matrix.offset = kColorConversion601FullRangeOffset;
    
    _convertMatrix = [self.mtkView.device newBufferWithBytes:&matrix length:sizeof(matrix) options:MTLResourceStorageModeShared];
}

Three YUV-to-RGB matrices are listed; this code uses BT.601 full range, matching the full-range pixel format requested from the reader.
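The same conversion can be checked on the CPU. The following is a minimal sketch in plain C that mirrors the shader's arithmetic (add the offset, then multiply by the column-major matrix) using the BT.601 full-range coefficients listed above. Mat3, RGB, clamp01 and yuv_to_rgb_full_range are hypothetical names for illustration. The sketch assumes full-range data, so luma gets no offset and only the chroma channels are re-centered by 0.5. The explicit clamp stands in for the clamping the render target applies.

```c
/* Column-major 3x3 matrix: columns[c][r], the same order as matrix_float3x3. */
typedef struct { float columns[3][3]; } Mat3;
typedef struct { float r, g, b; } RGB;

/* BT.601 full-range coefficients, copied from the matrix in the article. */
static const Mat3 kBT601FullRange = {{
    {1.0f,  1.0f,   1.0f},   /* column 0: multiplies Y  */
    {0.0f, -0.343f, 1.765f}, /* column 1: multiplies Cb */
    {1.4f, -0.711f, 0.0f},   /* column 2: multiplies Cr */
}};

static float clamp01(float v) {
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* y, cb, cr are normalized to [0, 1], the way a Unorm texture is sampled. */
static RGB yuv_to_rgb_full_range(float y, float cb, float cr) {
    /* Assumption: full range, so no luma offset; chroma is centered on 0.5. */
    const float v[3] = { y, cb - 0.5f, cr - 0.5f };
    float out[3];
    for (int row = 0; row < 3; ++row) {
        out[row] = 0.0f;
        for (int col = 0; col < 3; ++col) {
            out[row] += kBT601FullRange.columns[col][row] * v[col];
        }
    }
    RGB rgb = { clamp01(out[0]), clamp01(out[1]), clamp01(out[2]) };
    return rgb;
}
```

With neutral chroma (Cb = Cr = 0.5), Y = 1.0 maps to white (1, 1, 1) and Y = 0.0 maps to black (0, 0, 0), which is a quick sanity check for any matrix and offset pair.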

Rendering

- (void)mtkView:(MTKView *)view drawableSizeWillChange:(CGSize)size {
    _viewportSize = (vector_uint2){size.width, size.height};
}

- (void)drawInMTKView:(MTKView *)view {
    id<MTLCommandBuffer> commandBuffer = [self.commandQueue commandBuffer];
    commandBuffer.label = @"my command buffer";
    
    MTLRenderPassDescriptor *renderPassDesc = view.currentRenderPassDescriptor;
    
    CMSampleBufferRef sampleBuffer = [self.reader readBuffer];
    
    if (renderPassDesc && sampleBuffer) {
        renderPassDesc.colorAttachments[0].clearColor = MTLClearColorMake(0.5, 0.5, 0.5, 1.0);
        
        id<MTLRenderCommandEncoder> commandEncoder = [commandBuffer renderCommandEncoderWithDescriptor:renderPassDesc];
        
        [commandEncoder setRenderPipelineState:self.pipeline];
        
        [commandEncoder setViewport:(MTLViewport){0.0, 0.0, self.viewportSize.x, self.viewportSize.y, -1.0, 1.0}];
        
        [commandEncoder setVertexBuffer:self.vertices offset:0 atIndex:LJVertexInputIndexVertices];
        
        [self setupTextureWithEncoder:commandEncoder buffer:sampleBuffer];
        
        [commandEncoder setFragmentBuffer:self.convertMatrix offset:0 atIndex:LJFragmentBufferIndexMatrix];
        
        [commandEncoder drawPrimitives:MTLPrimitiveTypeTriangle vertexStart:0 vertexCount:self.numVertices];
        
        [commandEncoder endEncoding];
        
        [commandBuffer presentDrawable:view.currentDrawable];
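    } else if (sampleBuffer) {
        // setupTextureWithEncoder:buffer: releases the buffer; release it here when that call is skipped.
        CFRelease(sampleBuffer);
    }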
    
    [commandBuffer commit];
}

This part is ordinary rendering code; the key piece is setupTextureWithEncoder:buffer:, shown below.

- (void)setupTextureWithEncoder:(id<MTLRenderCommandEncoder>)encoder buffer:(CMSampleBufferRef)sampleBuffer {
    // self.textureCache is assumed to have been created beforehand with CVMetalTextureCacheCreate.
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    
    if (pixelBuffer == NULL) {
        // No image data in this sample: release it and bind nothing.
        CFRelease(sampleBuffer);
        return;
    }
    
    id<MTLTexture> textureY = nil;
    id<MTLTexture> textureUV = nil;
    
    // Plane 0: luma (Y), one 8-bit channel per texel.
    {
        size_t width  = CVPixelBufferGetWidthOfPlane(pixelBuffer, 0);
        size_t height = CVPixelBufferGetHeightOfPlane(pixelBuffer, 0);
        
        MTLPixelFormat pixelFormat = MTLPixelFormatR8Unorm;
        
        CVMetalTextureRef tmpTexture = NULL;
        
        CVReturn status = CVMetalTextureCacheCreateTextureFromImage(NULL, self.textureCache, pixelBuffer, NULL, pixelFormat, width, height, 0, &tmpTexture);
        
        if (status == kCVReturnSuccess) {
            textureY = CVMetalTextureGetTexture(tmpTexture);
            
            // Released immediately, as in the original; a more conservative approach
            // keeps the CVMetalTextureRef alive until the command buffer completes.
            CFRelease(tmpTexture);
        }
    }
    
    // Plane 1: chroma (CbCr), two interleaved 8-bit channels per texel.
    {
        size_t width = CVPixelBufferGetWidthOfPlane(pixelBuffer, 1);
        size_t height = CVPixelBufferGetHeightOfPlane(pixelBuffer, 1);
        MTLPixelFormat pixelFormat = MTLPixelFormatRG8Unorm;
        CVMetalTextureRef tmpTexture = NULL;
        CVReturn status = CVMetalTextureCacheCreateTextureFromImage(NULL, self.textureCache, pixelBuffer, NULL, pixelFormat, width, height, 1, &tmpTexture);
        if (status == kCVReturnSuccess) {
            textureUV = CVMetalTextureGetTexture(tmpTexture);
            CFRelease(tmpTexture);
        }
    }
    
    if (textureY != nil && textureUV != nil) {
        [encoder setFragmentTexture:textureY atIndex:LJFragmentTextureIndexTextureY];
        [encoder setFragmentTexture:textureUV atIndex:LJFragmentTextureIndexTextureUV];
    }
    
    // This method takes over ownership of the buffer returned by readBuffer.
    CFRelease(sampleBuffer);
    
}

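Because the output format was set earlier to kCVPixelFormatType_420YpCbCr8BiPlanarFullRange, the CVPixelBufferRef contains two planes. Calling CVMetalTextureCacheCreateTextureFromImage with the planeIndex parameter set to 0 or 1 yields the texture for the corresponding plane. In addition, because the YUV data is 4:2:0, the two planes do not have the same dimensions (the width and height of the Y plane are twice those of the UV plane), so the size of each plane has to be queried with CVPixelBufferGetWidthOfPlane and CVPixelBufferGetHeightOfPlane.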
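Although the planes differ in size, the fragment shader can sample both with the same normalized texture coordinate. The sketch below, in plain C, illustrates why: a coordinate in [0, 1] is scaled by each plane's own size, so it lands on the matching texel in both. Texel and texel_for_coord are hypothetical names, and the sketch picks a single texel, whereas the shader's sampler uses linear filtering.

```c
#include <stddef.h>

typedef struct { size_t x, y; } Texel;

/* Map a normalized coordinate (u, v in [0, 1]) to a texel index in a plane
   of the given size, clamped to the plane bounds. */
static Texel texel_for_coord(float u, float v, size_t width, size_t height) {
    size_t x = (size_t)(u * (float)width);
    size_t y = (size_t)(v * (float)height);
    if (x >= width)  x = width - 1;   /* u == 1.0 maps to the last column */
    if (y >= height) y = height - 1;  /* v == 1.0 maps to the last row */
    Texel t = { x, y };
    return t;
}
```

For a 1920x1080 frame, the coordinate (0.5, 0.5) selects texel (960, 540) in the Y plane and texel (480, 270) in the 960x540 UV plane, which is the chroma sample covering that luma pixel.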

Finally, the demo code is attached.