Metal and Graphics Rendering III: Alpha-Channel Video
0. Preface
This is the third article in my series exploring Metal, and it covers how to render a video with an alpha channel to the screen. In Metal and Graphics Rendering II: Rendering Transparent Images we already rendered a static image with an alpha channel to the screen. So what if it's not an image, but a video?
Our source material looks like this:
Following the previous chapter's logic, we need to render it to the screen and show the transparency effect:
Video needs some extra processing on top of that, so let's see how to implement it!
I. Video Formats vs. Image Formats
As carriers presented to the user, video and images each have their own base format: images use RGB, video uses YUV. The image texture we sampled earlier was sampled in RGB. For video, we likewise treat it as a series of images to sample and process, which means we first have to convert YUV to RGB.
1. RGB Data
RGB stands for red, green, and blue, the three primary colors of light; blending them in different proportions produces different colors. A screen is made up of tiny red, green, and blue light-emitting dots.
For example, a 1080p image has 1920 × 1080 pixels. With RGB encoding, each pixel has a red, a green, and a blue component, each taking 1 byte, so each pixel takes 3 bytes, and a 1080p image occupies 1920 × 1080 × 3 / 1024 / 1024 ≈ 5.93 MB of storage. The well-known BMP bitmap stores images this way (the so-called RGB888 format, or 24-bit bitmap).
Of course, the images we see on our phones are nowhere near that large, thanks to the image compression algorithms developed over the years; otherwise a 100-megapixel photo would take hundreds of MB.
2. YUV Data
YUV encoding represents each pixel's color with luminance and chrominance. Y stands for luminance (Luma), i.e. the grayscale value, while U and V stand for chrominance (Chroma). Y'UV was invented when engineers wanted color television on top of the existing black-and-white infrastructure: they needed a transmission method compatible with black-and-white (B&W) TV while adding color. The luminance component already existed as the black-and-white signal, so they added the UV signals as the solution.
Y' denotes the luma component, and U and V the chroma (color) components. The scopes of the terms Y'UV, YUV, YCbCr, YPbPr, and so on are sometimes ambiguous and overlapping. Historically, YUV and Y'UV referred to specific analog encodings of color information in television systems, while YCbCr referred to digital encodings suited to compression and transmission of video and still images, such as MPEG and JPEG. Today the term YUV is commonly used in the computer industry to describe file formats encoded with YCbCr, and this article likewise treats YUV as YCbCr.
How do Cb and Cr express color? The figure below makes it clear:
The Y component, rendered as an image:
And here is the same picture viewed through its RGB and YUV components respectively:
3. YUV Subsampling
Just as images have their compression schemes, video naturally has its own. Because human vision is not very sensitive to the UV (chroma) signal, after sampling the Y component we can thin out the UV components to shrink the data.
The mainstream YUV subsampling schemes are YUV 4:4:4, YUV 4:2:2, and YUV 4:2:0:
YUV 4:4:4 sampling: every Y has its own pair of UV components.
YUV 4:2:2 sampling: every two Ys share one pair of UV components.
YUV 4:2:0 sampling: every four Ys share one pair of UV components.
The figure below shows YUYV sampling (one variant of YUV 4:2:2), which gives a good idea of how YUV subsampling works; a size comparison follows the figure.
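To get a feel for the savings, here is a quick back-of-the-envelope sketch (my own numbers, assuming the biplanar 4:2:0 layout that iOS's 420v/420f formats use: a full-resolution Y plane followed by a half-resolution interleaved UV plane):
#include <stdio.h>

int main(void) {
    const double w = 1920, h = 1080, MB = 1024 * 1024;
    // RGB888: 3 bytes per pixel
    printf("RGB888:    %.2f MB\n", w * h * 3 / MB);             // ~5.93 MB
    // YUV 4:2:0: 1 Y byte per pixel, plus 2 UV bytes per 2x2 pixel block (w*h/2 in total)
    printf("YUV 4:2:0: %.2f MB\n", (w * h + (w * h) / 2) / MB); // ~2.97 MB
    return 0;
}
So even before any codec gets involved, 4:2:0 subsampling alone halves the raw frame size to 1.5 bytes per pixel.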
II. Converting Between RGB and YUV
To convert YUV to RGB we need a conversion matrix and a corresponding offset. The industry has already worked these matrices out, so we can use them as-is, but we still need a basic understanding of how to pick the right matrix from the texture's metadata. Here are three conversion matrices:
typedef struct {
matrix_float3x3 matrix;
vector_float3 offset;
} HobenConvertMatrix;
// BT.601
static const HobenConvertMatrix HobenYUVColorConversion601 = {
.matrix = {
.columns[0] = { 1.164, 1.164, 1.164, },
.columns[1] = { 0.000, -0.392, 2.017, },
.columns[2] = { 1.596, -0.813, 0.000, },
},
.offset = { -(16.0/255.0), -0.5, -0.5 },
};
// BT.601 Full Range
static const HobenConvertMatrix HobenYUVColorConversion601FullRange = {
.matrix = {
.columns[0] = { 1.000, 1.000, 1.000, },
.columns[1] = { 0.000, -0.343, 1.765, },
.columns[2] = { 1.400, -0.711, 0.000, },
},
.offset = { 0.0, -0.5, -0.5 },
};
// BT.709
static const HobenConvertMatrix HobenYUVColorConversion709 = {
.matrix = {
.columns[0] = { 1.164, 1.164, 1.164, },
.columns[1] = { 0.000, -0.213, 2.112, },
.columns[2] = { 1.793, -0.533, 0.000, },
},
.offset = { -(16.0/255.0), -0.5, -0.5 },
};
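As a quick sanity check of these numbers (my own sketch, not part of the demo), feed video-range white, Y = 235 and U = V = 128, through the BT.601 matrix; it should come out as roughly RGB(1, 1, 1). The helper name is hypothetical; matrix_multiply comes from <simd/simd.h>:
#include <simd/simd.h>

static vector_float3 HobenConvertYUVToRGB(vector_float3 yuv, HobenConvertMatrix m) {
    // Same math the shader will do later: offset first, then the matrix
    return matrix_multiply(m.matrix, yuv + m.offset);
}

// HobenConvertYUVToRGB((vector_float3){235.0/255.0, 128.0/255.0, 128.0/255.0},
//                      HobenYUVColorConversion601)
// => roughly {1.0, 1.0, 1.0}: the 1.164 factor stretches the 16..235 video-range luma back to 0..1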
Looking at these three matrices, there are two things to decide: BT.601 vs. BT.709, and for BT.601, whether the data is full range. A brief overview:
BT.601 is the standard set for standard-definition digital television (SDTV), and BT.709 is the ITU standard for high-definition digital television (HDTV); BT.709 is probably the most widely used today.
Full range vs. video range
Apple's pixel format definitions include the YUV formats 420v and 420f:
kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange = '420v',
/* Bi-Planar Component Y'CbCr 8-bit 4:2:0, video-range (luma=[16,235] chroma=[16,240]). baseAddr points to a big-endian CVPlanarPixelBufferInfo_YCbCrBiPlanar struct */
kCVPixelFormatType_420YpCbCr8BiPlanarFullRange = '420f',
/* Bi-Planar Component Y'CbCr 8-bit 4:2:0, full-range (luma=[0,255] chroma=[1,255]). baseAddr points to a big-endian CVPlanarPixelBufferInfo_YCbCrBiPlanar struct */
Apple's sample code demonstrates which matrix to use; we can determine the appropriate one with the following code:
OSType pixelFormatType = CVPixelBufferGetPixelFormatType(pixelBuffer);
BOOL isFullYUVRange = (pixelFormatType == kCVPixelFormatType_420YpCbCr8BiPlanarFullRange ? YES : NO);
CFTypeRef colorAttachments = CVBufferGetAttachment(pixelBuffer, kCVImageBufferYCbCrMatrixKey, NULL);
HobenConvertMatrix preferredConversion = HobenYUVColorConversion601FullRange;
if (colorAttachments != NULL) {
if (CFStringCompare(colorAttachments, kCVImageBufferYCbCrMatrix_ITU_R_601_4, kCFCompareCaseInsensitive) == kCFCompareEqualTo) {
if (isFullYUVRange) {
preferredConversion = HobenYUVColorConversion601FullRange;
} else {
preferredConversion = HobenYUVColorConversion601;
}
} else {
preferredConversion = HobenYUVColorConversion709;
}
} else {
if (isFullYUVRange) {
preferredConversion = HobenYUVColorConversion601FullRange;
} else {
preferredConversion = HobenYUVColorConversion601;
}
}
That said, videos rendered with BT.709 are the majority nowadays, so in practice you could simply default to the 709 matrix.
III. Rendering the Video
Now that we know how YUV converts to RGB, we can proceed as in the previous chapter: first extract each video frame's YUV, convert it to the texture's RGB via the conversion matrix during rendering, and finally sample it:
As the diagram above shows, we first use the video reader to obtain a CMSampleBufferRef, get the CVPixelBufferRef from it, create a CVMetalTextureRef from that, and turn it into an MTLTexture to hand to the GPU for rendering. Going through a CVMetalTextureCacheRef cache, rather than copying bytes as the previous chapter did with [self.texture replaceRegion:region mipmapLevel:0 withBytes:imageBytes bytesPerRow:image.size.width * 4];, improves rendering performance considerably.
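The cache itself only needs to be created once, alongside the device; this is exactly what the appendix's setupMTKView does:
CVMetalTextureCacheCreate(NULL, NULL, _device, NULL, &_textureCache);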
1. Video Capture: Obtaining the CMSampleBufferRef
First we use an AVAssetReader to obtain an AVAssetReaderTrackOutput, and from that output we read the CMSampleBufferRef. Since this chapter focuses on the rendering process, the playback pipeline is skipped here; you can Google the specifics of the playback flow.
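For completeness, here is a condensed sketch of that setup (the full HobenAssetReader in the appendix adds async track loading, locking, and looping); the part that matters for this article is requesting kCVPixelFormatType_420YpCbCr8BiPlanarFullRange output:
AVURLAsset *asset = [AVURLAsset URLAssetWithURL:videoUrl options:nil];
AVAssetTrack *videoTrack = [asset tracksWithMediaType:AVMediaTypeVideo].firstObject;
AVAssetReader *reader = [AVAssetReader assetReaderWithAsset:asset error:nil];
NSDictionary *settings = @{ (id)kCVPixelBufferPixelFormatTypeKey: @(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange) };
AVAssetReaderTrackOutput *output = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:videoTrack outputSettings:settings];
[reader addOutput:output];
[reader startReading];
// Each call hands back one frame; per the Create Rule the caller owns (and must release) it
CMSampleBufferRef sampleBuffer = [output copyNextSampleBuffer];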
2. Extracting the Y and UV Textures
From the CMSampleBufferRef we first get the CVPixelBufferRef:
- (CVPixelBufferRef)currentPixelBuffer {
    CMSampleBufferRef movieSampleBuffer = [self.assetReader readBuffer]; // Create Rule: we own this
    if (!movieSampleBuffer) {
        return NULL;
    }
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(movieSampleBuffer);
    CVPixelBufferRetain(pixelBuffer); // keep it alive past the sample buffer's release; the caller releases it
    CFRelease(movieSampleBuffer);
    return pixelBuffer;
}
From the CVPixelBufferRef, we then obtain the Y-plane and UV-plane textures of type id <MTLTexture>.
Note that on iOS, the recorded YUV data lands in the CVPixelBufferRef split into two planes: the Y data has a plane to itself (planeIndex = 0), while the UV data shares one plane (planeIndex = 1). The Y texture maps to the R channel and the UV texture to the RG channels (this differs from OpenGL, where the UV texture maps to the RA channels).
That is, the Y texture uses pixelFormat MTLPixelFormatR8Unorm with planeIndex 0;
the UV texture uses pixelFormat MTLPixelFormatRG8Unorm with planeIndex 1.
First, let's write a general method that turns one plane of a CVPixelBufferRef into a texture:
- (id <MTLTexture>)textureWithPixelBuffer:(CVPixelBufferRef)pixelBuffer pixelFormat:(MTLPixelFormat)pixelFormat planeIndex:(NSInteger)planeIndex {
id <MTLTexture> texture = nil;
size_t width = CVPixelBufferGetWidthOfPlane(pixelBuffer, planeIndex);
size_t height = CVPixelBufferGetHeightOfPlane(pixelBuffer, planeIndex);
CVMetalTextureRef textureRef = NULL;
CVReturn status = CVMetalTextureCacheCreateTextureFromImage(NULL, _textureCache, pixelBuffer, NULL, pixelFormat, width, height, planeIndex, &textureRef);
if (status == kCVReturnSuccess) {
texture = CVMetalTextureGetTexture(textureRef);
CFRelease(textureRef);
} else {
texture = nil;
}
return texture;
}
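One caveat: calling CFRelease(textureRef) right after grabbing the MTLTexture can, in principle, let Core Video recycle the backing buffer before the GPU has finished sampling it. If you ever see flicker, a common remedy (my own addition, not something this demo does) is to defer the release until the command buffer completes:
// Instead of CFRelease(textureRef) right away, hold it until the GPU is done:
CVMetalTextureRef retainedRef = textureRef;
[commandBuffer addCompletedHandler:^(id <MTLCommandBuffer> buffer) {
    CFRelease(retainedRef); // safe now: this frame's GPU work has finished
}];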
Passing in the appropriate parameters yields the Y texture and the UV texture:
id <MTLTexture> textureY = [self textureWithPixelBuffer:pixelBuffer pixelFormat:MTLPixelFormatR8Unorm planeIndex:0];
id <MTLTexture> textureUV = [self textureWithPixelBuffer:pixelBuffer pixelFormat:MTLPixelFormatRG8Unorm planeIndex:1];
3. Generating the Conversion Matrix
As mentioned earlier, we can read the relevant attachments from the CVPixelBufferRef to learn which conversion matrix applies. Since the matrix is also passed into the .metal file as an argument, we wrap it in an id <MTLBuffer>:
- (void)setupMatrixWithPixelBuffer:(CVPixelBufferRef)pixelBuffer { // set up the conversion matrix (only once)
if (self.convertMatrix) {
return;
}
OSType pixelFormatType = CVPixelBufferGetPixelFormatType(pixelBuffer);
BOOL isFullYUVRange = (pixelFormatType == kCVPixelFormatType_420YpCbCr8BiPlanarFullRange ? YES : NO);
CFTypeRef colorAttachments = CVBufferGetAttachment(pixelBuffer, kCVImageBufferYCbCrMatrixKey, NULL);
HobenConvertMatrix preferredConversion = HobenYUVColorConversion601FullRange;
if (colorAttachments != NULL) {
if (CFStringCompare(colorAttachments, kCVImageBufferYCbCrMatrix_ITU_R_601_4, kCFCompareCaseInsensitive) == kCFCompareEqualTo) {
if (isFullYUVRange) {
preferredConversion = HobenYUVColorConversion601FullRange;
} else {
preferredConversion = HobenYUVColorConversion601;
}
} else {
preferredConversion = HobenYUVColorConversion709;
}
} else {
if (isFullYUVRange) {
preferredConversion = HobenYUVColorConversion601FullRange;
} else {
preferredConversion = HobenYUVColorConversion601;
}
}
self.convertMatrix = [self.mtkView.device newBufferWithBytes:&preferredConversion
length:sizeof(HobenConvertMatrix)
options:MTLResourceStorageModeShared];
}
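A small detail worth knowing: newBufferWithBytes:length:options: copies the bytes into the new MTLBuffer, so it's fine that preferredConversion is a stack variable that disappears when the method returns.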
4. Generating the Vertices
This step is much like the earlier image rendering: we take texture coordinates from the left and right halves and build the vertex mapping, so I won't repeat the details. Just note that this material differs from the previous chapter's: this time alpha comes from the left half and rgb from the right half:
- (void)setupVertexWithRenderDesc:(MTLRenderPassDescriptor *)renderDesc {
if (self.rgbVertices && self.alphaVertices) {
return;
}
float heightScaling = 1.0;
float widthScaling = 1.0;
CGSize drawableSize = CGSizeMake(renderDesc.colorAttachments[0].texture.width, renderDesc.colorAttachments[0].texture.height);
CGRect bounds = CGRectMake(0, 0, drawableSize.width, drawableSize.height);
CGRect insetRect = AVMakeRectWithAspectRatioInsideRect(self.videoSize, bounds);
HobenRenderingResizingMode mode = CGSizeEqualToSize(self.videoSize, CGSizeZero) ? HobenRenderingResizingModeScale : HobenRenderingResizingModeAspectFill;
switch (mode) {
case HobenRenderingResizingModeScale:
heightScaling = 1.0;
widthScaling = 1.0;
break;
case HobenRenderingResizingModeAspectFit:
widthScaling = insetRect.size.width / drawableSize.width;
heightScaling = insetRect.size.height / drawableSize.height;
break;
case HobenRenderingResizingModeAspectFill:
widthScaling = drawableSize.height / insetRect.size.height;
heightScaling = drawableSize.width / insetRect.size.width;
break;
}
HobenVertex alphaVertices[] = {
// position x, y, z, w --- texture coordinate x, y
{ {-widthScaling, heightScaling, 0.0, 1.0}, {0.0, 0.0} },
{ { widthScaling, heightScaling, 0.0, 1.0}, {0.5, 0.0} },
{ {-widthScaling, -heightScaling, 0.0, 1.0}, {0.0, 1.0} },
{ { widthScaling, -heightScaling, 0.0, 1.0}, {0.5, 1.0} },
};
CGFloat offset = .5f;
HobenVertex rgbVertices[] = {
// position x, y, z, w --- texture coordinate x, y
{ {-widthScaling, heightScaling, 0.0, 1.0}, {0.0 + offset, 0.0} },
{ { widthScaling, heightScaling, 0.0, 1.0}, {0.5 + offset, 0.0} },
{ {-widthScaling, -heightScaling, 0.0, 1.0}, {0.0 + offset, 1.0} },
{ { widthScaling, -heightScaling, 0.0, 1.0}, {0.5 + offset, 1.0} },
};
self.rgbVertices = [_device newBufferWithBytes:rgbVertices length:sizeof(rgbVertices) options:MTLResourceStorageModeShared];
self.numVertices = sizeof(rgbVertices) / sizeof(HobenVertex);
self.alphaVertices = [_device newBufferWithBytes:alphaVertices length:sizeof(alphaVertices) options:MTLResourceStorageModeShared];
}
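To make the mapping concrete, here is how the two vertex arrays carve up one decoded frame of this material (a sketch):
//        one decoded video frame (texture x: 0.0 -> 1.0)
//   +--------------------------+--------------------------+
//   |        alpha mask        |       color (rgb)        |
//   |      x in [0, 0.5]       |      x in [0.5, 1.0]     |
//   +--------------------------+--------------------------+
//   alphaVertices sample the left half; rgbVertices add offset = 0.5 to sample the right half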
5. Writing the MSL
In the .metal file the requirement is much like before: extract rgb from one half and r (for alpha) from the other, so the Metal structs and the vertex shader stay unchanged:
typedef struct {
float4 vertexPosition [[ position ]]; // vertex position
float2 textureCoorRgb; // texture coordinate for sampling RGB
float2 textureCoorAlpha; // texture coordinate for sampling alpha
} RasterizerData;
vertex RasterizerData vertexShader(uint vertexId [[ vertex_id ]],
constant HobenVertex *rgbVertexArray [[ buffer(0) ]],
constant HobenVertex *alphaVertexArray [[ buffer(1) ]]) {
RasterizerData out;
out.vertexPosition = rgbVertexArray[vertexId].position;
out.textureCoorRgb = rgbVertexArray[vertexId].textureCoordinate;
out.textureCoorAlpha = alphaVertexArray[vertexId].textureCoordinate;
return out;
}
Since we now feed in a Y texture and a UV texture, plus the conversion-matrix buffer, the fragment shader gains the corresponding input slots: it reads the R channel of the Y texture and the RG channels of the UV texture, then converts to RGB with the matrix:
float3 rgbFromYuv(float2 textureCoor,
texture2d <float> textureY,
texture2d <float> textureUV,
constant HobenConvertMatrix *convertMatrix) {
constexpr sampler textureSampler (mag_filter::linear,
min_filter::linear);
float3 yuv = float3(textureY.sample(textureSampler, textureCoor).r,
textureUV.sample(textureSampler, textureCoor).rg);
return convertMatrix->matrix * (yuv + convertMatrix->offset);
}
Since we need rgb and alpha separately, we call the YUV-to-RGB helper once per half, just as in the previous chapter:
fragment float4 fragmentShader(RasterizerData input [[ stage_in ]],
texture2d <float> textureY [[ texture(0) ]],
texture2d <float> textureUV [[ texture(1) ]],
constant HobenConvertMatrix *convertMatrix [[ buffer(0) ]]) {
float3 rgb = rgbFromYuv(input.textureCoorRgb, textureY, textureUV, convertMatrix);
float alpha = rgbFromYuv(input.textureCoorAlpha, textureY, textureUV, convertMatrix).r;
return float4(rgb, alpha);
}
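A side note on compositing: the demo returns straight (non-premultiplied) alpha and relies on the non-opaque MTKView being blended by Core Animation over whatever sits behind it. If you instead wanted to blend inside the render pass, over other Metal content, you would enable blending on the pipeline descriptor, roughly like this (my own addition, not part of the demo):
pipelineDesc.colorAttachments[0].blendingEnabled = YES;
pipelineDesc.colorAttachments[0].sourceRGBBlendFactor = MTLBlendFactorSourceAlpha;
pipelineDesc.colorAttachments[0].destinationRGBBlendFactor = MTLBlendFactorOneMinusSourceAlpha;
pipelineDesc.colorAttachments[0].sourceAlphaBlendFactor = MTLBlendFactorOne;
pipelineDesc.colorAttachments[0].destinationAlphaBlendFactor = MTLBlendFactorOneMinusSourceAlpha;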
On the Objective-C side we then pass in the corresponding values:
- (void)drawInMTKView:(MTKView *)view {
id <MTLCommandBuffer> commandBuffer = [_commandQueue commandBuffer];
MTLRenderPassDescriptor *renderDesc = view.currentRenderPassDescriptor;
CVPixelBufferRef pixelBuffer = [self.dataSource currentPixelBuffer];
id <MTLTexture> textureY = nil;
id <MTLTexture> textureUV = nil;
if (pixelBuffer) {
    // The frame holds alpha and rgb side by side, so the visible video is half the buffer width
    self.videoSize = CGSizeMake(CVPixelBufferGetWidth(pixelBuffer) / 2, CVPixelBufferGetHeight(pixelBuffer));
    textureY = [self textureWithPixelBuffer:pixelBuffer pixelFormat:MTLPixelFormatR8Unorm planeIndex:0];
    textureUV = [self textureWithPixelBuffer:pixelBuffer pixelFormat:MTLPixelFormatRG8Unorm planeIndex:1];
    [self setupMatrixWithPixelBuffer:pixelBuffer];
    CVPixelBufferRelease(pixelBuffer); // balances the retain in -currentPixelBuffer
}
if (!renderDesc || !textureY || !textureUV) {
[commandBuffer commit];
return;
}
renderDesc.colorAttachments[0].clearColor = MTLClearColorMake(0, 0, 0, 0);
[self setupVertexWithRenderDesc:renderDesc];
id <MTLRenderCommandEncoder> commandEncoder = [commandBuffer renderCommandEncoderWithDescriptor:renderDesc];
[commandEncoder setViewport:(MTLViewport){0, 0, self.viewportSize.x, self.viewportSize.y, -1, 1}];
[commandEncoder setRenderPipelineState:self.pipelineState];
[commandEncoder setVertexBuffer:self.rgbVertices offset:0 atIndex:0];
[commandEncoder setVertexBuffer:self.alphaVertices offset:0 atIndex:1];
[commandEncoder setFragmentBuffer:self.convertMatrix offset:0 atIndex:0];
[commandEncoder setFragmentTexture:textureY atIndex:0];
[commandEncoder setFragmentTexture:textureUV atIndex:1];
[commandEncoder drawPrimitives:MTLPrimitiveTypeTriangleStrip vertexStart:0 vertexCount:self.numVertices];
[commandEncoder endEncoding];
[commandBuffer presentDrawable:view.currentDrawable];
[commandBuffer commit];
}
IV. Summary
This article covered how video differs from images, how to convert a video's YUV into an image's RGB and how to choose the conversion matrix, and how to render each video frame to the screen as a texture.
The diagram below summarizes how, starting from the CMSampleBufferRef read out of the video, we obtain the RGB of the texture we need, step by step:
Once we have RGB, we can process it however we like. The next diagram shows how the YUV-to-RGB approach samples the input vertices and produces the fragment shader's output (sampling the left half as alpha and the right half as rgb).
At last, a demo tied to a real feature! Next I want to look into Metal debugging and performance tooling, compare its performance against OpenGL, and dig into the master-level MetalPetal source. Onward!
Source code attached. This time, following Apple's advice, the Metal Objective-C layer is wrapped in its own class, leaving the controller layer nice and light.
HobenMetalImageView:
//
// HobenMetalImageView.h
// HobenLearnMetal
//
// Created by Hoben on 2021/1/12.
//
#import <UIKit/UIKit.h>
@import MetalKit;
NS_ASSUME_NONNULL_BEGIN
typedef NS_ENUM(NSUInteger, HobenRenderingResizingMode) {
HobenRenderingResizingModeScale = 0,
HobenRenderingResizingModeAspectFit,
HobenRenderingResizingModeAspectFill,
};
@protocol HobenMetalImageViewDataSource <NSObject>
- (CVPixelBufferRef)currentPixelBuffer;
@end
@interface HobenMetalImageView : UIView
@property (nonatomic, assign) CVPixelBufferRef pixelBuffer;
@property (nonatomic, assign) CGSize videoSize;
@property (nonatomic, weak ) id <HobenMetalImageViewDataSource> dataSource;
@end
NS_ASSUME_NONNULL_END
//
// HobenMetalImageView.m
// HobenLearnMetal
//
// Created by Hoben on 2021/1/12.
//
#import "HobenMetalImageView.h"
#import <AVFoundation/AVFoundation.h>
#import "HobenShaderType.h"
// BT.601
static const HobenConvertMatrix HobenYUVColorConversion601 = {
.matrix = {
.columns[0] = { 1.164, 1.164, 1.164, },
.columns[1] = { 0.000, -0.392, 2.017, },
.columns[2] = { 1.596, -0.813, 0.000, },
},
.offset = { -(16.0/255.0), -0.5, -0.5 },
};
// BT.601 Full Range
static const HobenConvertMatrix HobenYUVColorConversion601FullRange = {
.matrix = {
.columns[0] = { 1.000, 1.000, 1.000, },
.columns[1] = { 0.000, -0.343, 1.765, },
.columns[2] = { 1.400, -0.711, 0.000, },
},
.offset = { 0.0, -0.5, -0.5 },
};
// BT.709
static const HobenConvertMatrix HobenYUVColorConversion709 = {
.matrix = {
.columns[0] = { 1.164, 1.164, 1.164, },
.columns[1] = { 0.000, -0.213, 2.112, },
.columns[2] = { 1.793, -0.533, 0.000, },
},
.offset = { -(16.0/255.0), -0.5, -0.5 },
};
@interface HobenMetalImageView () <MTKViewDelegate>
@property (nonatomic, strong) MTKView *mtkView;
@property (nonatomic, assign) vector_uint2 viewportSize;
@property (nonatomic, strong) id <MTLCommandQueue> commandQueue;
@property (nonatomic, strong) id <MTLDevice> device;
@property (nonatomic, strong) id <MTLRenderPipelineState> pipelineState;
@property (nonatomic, assign) CVMetalTextureCacheRef textureCache;
@property (nonatomic, strong) id <MTLBuffer> rgbVertices;
@property (nonatomic, strong) id <MTLBuffer> alphaVertices;
@property (nonatomic, strong) id<MTLBuffer> convertMatrix;
@property (nonatomic, assign) NSInteger numVertices;
@end
@implementation HobenMetalImageView
- (instancetype)initWithFrame:(CGRect)frame {
if (self = [super initWithFrame:frame]) {
[self setup];
}
return self;
}
- (void)setup {
[self setupMTKView];
[self setupCommandQueue];
[self setupPipeline];
}
- (void)setupMTKView {
self.mtkView = [[MTKView alloc] init];
_device = self.mtkView.device = MTLCreateSystemDefaultDevice();
self.mtkView.delegate = self;
self.mtkView.frame = self.bounds;
self.mtkView.opaque = NO;
[self addSubview:self.mtkView];
CVMetalTextureCacheCreate(NULL, NULL, _device, NULL, &_textureCache);
}
- (void)setupCommandQueue {
_commandQueue = [_device newCommandQueue];
}
- (void)setupPipeline {
id <MTLLibrary> defaultLibrary = [_device newDefaultLibrary];
id <MTLFunction> vertexFunc = [defaultLibrary newFunctionWithName:@"vertexShader"];
id <MTLFunction> fragmentFunc = [defaultLibrary newFunctionWithName:@"fragmentShader"];
MTLRenderPipelineDescriptor *pipelineDesc = [[MTLRenderPipelineDescriptor alloc] init];
pipelineDesc.colorAttachments[0].pixelFormat = self.mtkView.colorPixelFormat;
pipelineDesc.vertexFunction = vertexFunc;
pipelineDesc.fragmentFunction = fragmentFunc;
self.pipelineState = [_device newRenderPipelineStateWithDescriptor:pipelineDesc error:nil];
}
#pragma mark - MTKViewDelegate
- (void)mtkView:(MTKView *)view drawableSizeWillChange:(CGSize)size {
self.viewportSize = (vector_uint2){size.width, size.height};
}
- (void)drawInMTKView:(MTKView *)view {
id <MTLCommandBuffer> commandBuffer = [_commandQueue commandBuffer];
MTLRenderPassDescriptor *renderDesc = view.currentRenderPassDescriptor;
CVPixelBufferRef pixelBuffer = [self.dataSource currentPixelBuffer];
id <MTLTexture> textureY = nil;
id <MTLTexture> textureUV = nil;
if (pixelBuffer) {
    // The frame holds alpha and rgb side by side, so the visible video is half the buffer width
    self.videoSize = CGSizeMake(CVPixelBufferGetWidth(pixelBuffer) / 2, CVPixelBufferGetHeight(pixelBuffer));
    textureY = [self textureWithPixelBuffer:pixelBuffer pixelFormat:MTLPixelFormatR8Unorm planeIndex:0];
    textureUV = [self textureWithPixelBuffer:pixelBuffer pixelFormat:MTLPixelFormatRG8Unorm planeIndex:1];
    [self setupMatrixWithPixelBuffer:pixelBuffer];
    CVPixelBufferRelease(pixelBuffer); // balances the retain in -currentPixelBuffer
}
if (!renderDesc || !textureY || !textureUV) {
[commandBuffer commit];
return;
}
renderDesc.colorAttachments[0].clearColor = MTLClearColorMake(0, 0, 0, 0);
[self setupVertexWithRenderDesc:renderDesc];
id <MTLRenderCommandEncoder> commandEncoder = [commandBuffer renderCommandEncoderWithDescriptor:renderDesc];
[commandEncoder setViewport:(MTLViewport){0, 0, self.viewportSize.x, self.viewportSize.y, -1, 1}];
[commandEncoder setRenderPipelineState:self.pipelineState];
[commandEncoder setVertexBuffer:self.rgbVertices offset:0 atIndex:0];
[commandEncoder setVertexBuffer:self.alphaVertices offset:0 atIndex:1];
[commandEncoder setFragmentBuffer:self.convertMatrix offset:0 atIndex:0];
[commandEncoder setFragmentTexture:textureY atIndex:0];
[commandEncoder setFragmentTexture:textureUV atIndex:1];
[commandEncoder drawPrimitives:MTLPrimitiveTypeTriangleStrip vertexStart:0 vertexCount:self.numVertices];
[commandEncoder endEncoding];
[commandBuffer presentDrawable:view.currentDrawable];
[commandBuffer commit];
}
- (void)setupVertexWithRenderDesc:(MTLRenderPassDescriptor *)renderDesc {
if (self.rgbVertices && self.alphaVertices) {
return;
}
float heightScaling = 1.0;
float widthScaling = 1.0;
CGSize drawableSize = CGSizeMake(renderDesc.colorAttachments[0].texture.width, renderDesc.colorAttachments[0].texture.height);
CGRect bounds = CGRectMake(0, 0, drawableSize.width, drawableSize.height);
CGRect insetRect = AVMakeRectWithAspectRatioInsideRect(self.videoSize, bounds);
HobenRenderingResizingMode mode = CGSizeEqualToSize(self.videoSize, CGSizeZero) ? HobenRenderingResizingModeScale : HobenRenderingResizingModeAspectFill;
switch (mode) {
case HobenRenderingResizingModeScale:
heightScaling = 1.0;
widthScaling = 1.0;
break;
case HobenRenderingResizingModeAspectFit:
widthScaling = insetRect.size.width / drawableSize.width;
heightScaling = insetRect.size.height / drawableSize.height;
break;
case HobenRenderingResizingModeAspectFill:
widthScaling = drawableSize.height / insetRect.size.height;
heightScaling = drawableSize.width / insetRect.size.width;
break;
}
HobenVertex alphaVertices[] = {
// position x, y, z, w --- texture coordinate x, y
{ {-widthScaling, heightScaling, 0.0, 1.0}, {0.0, 0.0} },
{ { widthScaling, heightScaling, 0.0, 1.0}, {0.5, 0.0} },
{ {-widthScaling, -heightScaling, 0.0, 1.0}, {0.0, 1.0} },
{ { widthScaling, -heightScaling, 0.0, 1.0}, {0.5, 1.0} },
};
CGFloat offset = .5f;
HobenVertex rgbVertices[] = {
// position x, y, z, w --- texture coordinate x, y
{ {-widthScaling, heightScaling, 0.0, 1.0}, {0.0 + offset, 0.0} },
{ { widthScaling, heightScaling, 0.0, 1.0}, {0.5 + offset, 0.0} },
{ {-widthScaling, -heightScaling, 0.0, 1.0}, {0.0 + offset, 1.0} },
{ { widthScaling, -heightScaling, 0.0, 1.0}, {0.5 + offset, 1.0} },
};
self.rgbVertices = [_device newBufferWithBytes:rgbVertices length:sizeof(rgbVertices) options:MTLResourceStorageModeShared];
self.numVertices = sizeof(rgbVertices) / sizeof(HobenVertex);
self.alphaVertices = [_device newBufferWithBytes:alphaVertices length:sizeof(alphaVertices) options:MTLResourceStorageModeShared];
}
- (void)setupMatrixWithPixelBuffer:(CVPixelBufferRef)pixelBuffer { // set up the conversion matrix (only once)
if (self.convertMatrix) {
return;
}
OSType pixelFormatType = CVPixelBufferGetPixelFormatType(pixelBuffer);
BOOL isFullYUVRange = (pixelFormatType == kCVPixelFormatType_420YpCbCr8BiPlanarFullRange ? YES : NO);
CFTypeRef colorAttachments = CVBufferGetAttachment(pixelBuffer, kCVImageBufferYCbCrMatrixKey, NULL);
HobenConvertMatrix preferredConversion = HobenYUVColorConversion601FullRange;
if (colorAttachments != NULL) {
if (CFStringCompare(colorAttachments, kCVImageBufferYCbCrMatrix_ITU_R_601_4, kCFCompareCaseInsensitive) == kCFCompareEqualTo) {
if (isFullYUVRange) {
preferredConversion = HobenYUVColorConversion601FullRange;
} else {
preferredConversion = HobenYUVColorConversion601;
}
} else {
preferredConversion = HobenYUVColorConversion709;
}
} else {
if (isFullYUVRange) {
preferredConversion = HobenYUVColorConversion601FullRange;
} else {
preferredConversion = HobenYUVColorConversion601;
}
}
self.convertMatrix = [self.mtkView.device newBufferWithBytes:&preferredConversion
length:sizeof(HobenConvertMatrix)
options:MTLResourceStorageModeShared];
}
- (id <MTLTexture>)textureWithPixelBuffer:(CVPixelBufferRef)pixelBuffer pixelFormat:(MTLPixelFormat)pixelFormat planeIndex:(NSInteger)planeIndex {
id <MTLTexture> texture = nil;
size_t width = CVPixelBufferGetWidthOfPlane(pixelBuffer, planeIndex);
size_t height = CVPixelBufferGetHeightOfPlane(pixelBuffer, planeIndex);
CVMetalTextureRef textureRef = NULL;
CVReturn status = CVMetalTextureCacheCreateTextureFromImage(NULL, _textureCache, pixelBuffer, NULL, pixelFormat, width, height, planeIndex, &textureRef);
if (status == kCVReturnSuccess) {
texture = CVMetalTextureGetTexture(textureRef);
CFRelease(textureRef);
} else {
texture = nil;
}
return texture;
}
@end
HobenMetalImageController:
//
// ViewController.m
// HobenLearnMetal
//
// Created by Hoben on 2021/1/4.
//
#import "HobenMetalImageController.h"
#import "HobenShaderType.h"
#import "HobenMetalImageView.h"
#import "HobenAssetReader.h"
@interface HobenMetalImageController () <HobenMetalImageViewDataSource>
@property (nonatomic, strong) HobenMetalImageView *metalImageView;
@property (nonatomic, strong) HobenAssetReader *assetReader;
@end
@implementation HobenMetalImageController
- (void)viewDidLoad {
[super viewDidLoad];
self.view.backgroundColor = [UIColor grayColor];
self.metalImageView = [[HobenMetalImageView alloc] initWithFrame:self.view.bounds];
self.metalImageView.dataSource = self;
UIImageView *imageView = [[UIImageView alloc] initWithImage:[UIImage imageNamed:@"tangseng"]];
imageView.frame = CGRectMake(0, 350, self.view.frame.size.width, 300);
[self.view addSubview:imageView];
[self.view addSubview:self.metalImageView];
NSURL *url = [[NSBundle mainBundle] URLForResource:@"flower" withExtension:@"mp4"];
self.assetReader = [[HobenAssetReader alloc] initWithUrl:url];
}
- (CVPixelBufferRef)currentPixelBuffer {
    CMSampleBufferRef movieSampleBuffer = [self.assetReader readBuffer]; // Create Rule: we own this
    if (!movieSampleBuffer) {
        return NULL;
    }
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(movieSampleBuffer);
    CVPixelBufferRetain(pixelBuffer); // keep it alive past the sample buffer's release; the caller releases it
    CFRelease(movieSampleBuffer);
    return pixelBuffer;
}
@end
HobenAssetReader:
//
// HobenAssetReader.m
// HobenLearnMetal
//
// Created by Hoben on 2021/1/12.
//
#import "HobenAssetReader.h"
@implementation HobenAssetReader
{
AVAssetReaderTrackOutput *readerVideoTrackOutput;
AVAssetReader *assetReader;
NSURL *videoUrl;
NSLock *lock;
}
- (instancetype)initWithUrl:(NSURL *)url {
self = [super init];
videoUrl = url;
lock = [[NSLock alloc] init];
[self customInit];
return self;
}
- (void)customInit {
NSDictionary *inputOptions = [NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES] forKey:AVURLAssetPreferPreciseDurationAndTimingKey];
AVURLAsset *inputAsset = [[AVURLAsset alloc] initWithURL:videoUrl options:inputOptions];
__weak typeof(self) weakSelf = self;
[inputAsset loadValuesAsynchronouslyForKeys:[NSArray arrayWithObject:@"tracks"] completionHandler: ^{
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
NSError *error = nil;
AVKeyValueStatus tracksStatus = [inputAsset statusOfValueForKey:@"tracks" error:&error];
if (tracksStatus != AVKeyValueStatusLoaded)
{
NSLog(@"error %@", error);
return;
}
[weakSelf processWithAsset:inputAsset];
});
}];
}
- (void)processWithAsset:(AVAsset *)asset
{
[lock lock];
NSLog(@"processWithAsset");
NSError *error = nil;
assetReader = [AVAssetReader assetReaderWithAsset:asset error:&error];
NSMutableDictionary *outputSettings = [NSMutableDictionary dictionary];
[outputSettings setObject:@(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange) forKey:(id)kCVPixelBufferPixelFormatTypeKey];
readerVideoTrackOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:[[asset tracksWithMediaType:AVMediaTypeVideo] objectAtIndex:0] outputSettings:outputSettings];
readerVideoTrackOutput.alwaysCopiesSampleData = NO;
[assetReader addOutput:readerVideoTrackOutput];
if ([assetReader startReading] == NO)
{
NSLog(@"Error reading from file at URL: %@", asset);
}
[lock unlock];
}
- (CMSampleBufferRef)readBuffer {
[lock lock];
CMSampleBufferRef sampleBufferRef = nil;
if (readerVideoTrackOutput) {
sampleBufferRef = [readerVideoTrackOutput copyNextSampleBuffer];
}
if (assetReader && assetReader.status == AVAssetReaderStatusCompleted) {
NSLog(@"customInit");
readerVideoTrackOutput = nil;
assetReader = nil;
[self customInit];
}
[lock unlock];
return sampleBufferRef;
}
@end
Shaders.metal:
//
// Shaders.metal
// HobenLearnMetal
//
// Created by Hoben on 2021/1/4.
//
#include <metal_stdlib>
#import "HobenShaderType.h"
using namespace metal;
typedef struct {
float4 vertexPosition [[ position ]]; // vertex position
float2 textureCoorRgb; // texture coordinate for sampling RGB
float2 textureCoorAlpha; // texture coordinate for sampling alpha
} RasterizerData;
vertex RasterizerData vertexShader(uint vertexId [[ vertex_id ]],
constant HobenVertex *rgbVertexArray [[ buffer(0) ]],
constant HobenVertex *alphaVertexArray [[ buffer(1) ]]) {
RasterizerData out;
out.vertexPosition = rgbVertexArray[vertexId].position;
out.textureCoorRgb = rgbVertexArray[vertexId].textureCoordinate;
out.textureCoorAlpha = alphaVertexArray[vertexId].textureCoordinate;
return out;
}
float3 rgbFromYuv(float2 textureCoor,
texture2d <float> textureY,
texture2d <float> textureUV,
constant HobenConvertMatrix *convertMatrix) {
constexpr sampler textureSampler (mag_filter::linear,
min_filter::linear);
float3 yuv = float3(textureY.sample(textureSampler, textureCoor).r,
textureUV.sample(textureSampler, textureCoor).rg);
return convertMatrix->matrix * (yuv + convertMatrix->offset);
}
fragment float4 fragmentShader(RasterizerData input [[ stage_in ]],
texture2d <float> textureY [[ texture(0) ]],
texture2d <float> textureUV [[ texture(1) ]],
constant HobenConvertMatrix *convertMatrix [[ buffer(0) ]]) {
float3 rgb = rgbFromYuv(input.textureCoorRgb, textureY, textureUV, convertMatrix);
float alpha = rgbFromYuv(input.textureCoorAlpha, textureY, textureUV, convertMatrix).r;
return float4(rgb, alpha);
}
//
// HobenShaderType.h
// HobenLearnMetal
//
// Created by Hoben on 2021/1/4.
//
#ifndef HobenShaderType_h
#define HobenShaderType_h
#include <simd/simd.h>
typedef struct {
vector_float4 position;
vector_float2 textureCoordinate;
} HobenVertex;
typedef struct {
matrix_float3x3 matrix;
vector_float3 offset;
} HobenConvertMatrix;
#endif /* HobenShaderType_h */