Metal Framework in Detail (42) - Metal Programming Guide: Graphics

2018-11-10 刀客传奇

Version History

Version	Date
V1.0	2018.11.10 (Saturday)

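Preface

Anyone who works with video and images is probably familiar with this framework: it renders advanced 3D graphics and performs data-parallel computation on the GPU. This series of articles examines the framework in detail. If you are interested, see the earlier articles listed below.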
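1. Metal Framework in Detail (1) - Basic Overview
2. Metal Framework in Detail (2) - Devices and Commands (1)
3. Metal Framework in Detail (3) - Rendering a Simple 2D Triangle (1)
4. Metal Framework in Detail (4) - About GPU Family 4 (1)
5. Metal Framework in Detail (5) - About GPU Family 4: About Imageblocks (2)
6. Metal Framework in Detail (6) - About GPU Family 4: About Tile Shading (3)
7. Metal Framework in Detail (7) - About GPU Family 4: About Raster Order Groups (4)
8. Metal Framework in Detail (8) - About GPU Family 4: About Enhanced MSAA and Imageblock Sample Coverage Control (5)
9. Metal Framework in Detail (9) - About GPU Family 4: About Threadgroup Sharing (6)
10. Metal Framework in Detail (10) - Fundamental Components (1)
11. Metal Framework in Detail (11) - Fundamental Components: Device Selection - Device Selection for Graphics Rendering (2)
12. Metal Framework in Detail (12) - Fundamental Components: Device Selection - Device Selection for Compute Processing (3)
13. Metal Framework in Detail (13) - Compute Processing (1)
14. Metal Framework in Detail (14) - Compute Processing: Hello Compute (2)
15. Metal Framework in Detail (15) - Compute Processing: About Threads and Threadgroups (3)
16. Metal Framework in Detail (16) - Compute Processing: Calculating Threadgroup and Grid Sizes (4)
17. Metal Framework in Detail (17) - Tools, Profiling, and Debugging (1)
18. Metal Framework in Detail (18) - Tools, Profiling, and Debugging: Metal GPU Capture (2)
19. Metal Framework in Detail (19) - Tools, Profiling, and Debugging: GPU Activity Monitors (3)
20. Metal Framework in Detail (20) - Tools, Profiling, and Debugging: About Metal Shading Language Filename Extensions, Building a Library with Metal's Command-Line Tools, and Labeling Metal Objects and Commands (4)
21. Metal Framework in Detail (21) - Fundamental Lessons: Basic Buffers (1)
22. Metal Framework in Detail (22) - Fundamental Lessons: Basic Texturing (2)
23. Metal Framework in Detail (23) - Fundamental Lessons: CPU and GPU Synchronization (3)
24. Metal Framework in Detail (24) - Fundamental Lessons: Argument Buffers - Basic Argument Buffers (4)
25. Metal Framework in Detail (25) - Fundamental Lessons: Argument Buffers - Argument Buffers with Arrays and Resource Heaps (5)
26. Metal Framework in Detail (26) - Fundamental Lessons: Argument Buffers - Argument Buffers with GPU Encoding (6)
27. Metal Framework in Detail (27) - Advanced Techniques: Reflections with Layer Selection (1)
28. Metal Framework in Detail (28) - Advanced Techniques: LOD with Function Specialization (1)
29. Metal Framework in Detail (29) - Advanced Techniques: Dynamic Terrain with Argument Buffers (1)
30. Metal Framework in Detail (30) - Deferred Lighting (1)
31. Metal Framework in Detail (31) - Mixing Metal and OpenGL Rendering in a View (1)
32. Metal Framework in Detail (32) - Metal Render Pipeline Tutorial (1)
33. Metal Framework in Detail (33) - Metal Render Pipeline Tutorial (2)
34. Metal Framework in Detail (34) - Hello Metal! Implementing a Simple Triangle (1)
35. Metal Framework in Detail (35) - Hello Metal! Implementing a Simple Triangle (2)
36. Metal Framework in Detail (36) - Metal Programming Guide: Overview (1)
37. Metal Framework in Detail (37) - Metal Programming Guide: Fundamental Metal Concepts (2)
38. Metal Framework in Detail (38) - Metal Programming Guide: Command Organization and Execution Model (3)
39. Metal Framework in Detail (39) - Metal Programming Guide: Resource Objects: Buffers and Textures (4)
40. Metal Framework in Detail (40) - Metal Programming Guide: Functions and Libraries (5)
41. Metal Framework in Detail (41) - Metal Programming Guide: Graphics Rendering: Render Command Encoders, Part 1 (6)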

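Specifying Resources for a Render Command Encoder

The MTLRenderCommandEncoder methods discussed in this section specify the resources that are used as arguments to the vertex and fragment shader functions. Those functions are set through the vertexFunction and fragmentFunction properties of the MTLRenderPipelineDescriptor from which the MTLRenderPipelineState object is created. The methods assign shader resources (buffers, textures, and samplers) to the corresponding argument table index (atIndex) in the render command encoder, as shown in Figure 5-3.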

Figure 5-3 Argument Tables for the Render Command Encoder

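The setVertex* methods (for example, setVertexBuffer:offset:atIndex:) assign one or more resources to the corresponding arguments of a vertex shader function.

The setFragment* methods (for example, setFragmentBuffer:offset:atIndex:, setFragmentTexture:atIndex:, and setFragmentSamplerState:atIndex:) similarly assign one or more resources to the corresponding arguments of a fragment shader function.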

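There are a maximum of 31 entries in the buffer argument table, 31 entries in the texture argument table, and 16 entries in the sampler state argument table.

The attribute qualifiers that specify resource locations in the Metal shading language source code must match the argument table indices used in the Metal framework methods. In Listing 5-7, two buffers (posBuf and texCoordBuf) are assigned to indices 0 and 1, respectively, for the vertex shader.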

Listing 5-7  Metal Framework: Specifying Resources for a Vertex Function

[renderEnc setVertexBuffer:posBuf offset:0 atIndex:0];
[renderEnc setVertexBuffer:texCoordBuf offset:0 atIndex:1];

In Listing 5-8, the function signature has corresponding arguments with the attribute qualifiers buffer(0) and buffer(1).

Listing 5-8  Metal Shading Language: Vertex Function Arguments Match the Framework Argument Table Indices

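// Pointer arguments need an address space qualifier; const device is assumed here.
vertex VertexOutput metal_vert(const device float4 *posData [[ buffer(0) ]],
                               const device float2 *texCoordData [[ buffer(1) ]])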

Similarly, in Listing 5-9, a buffer, a texture, and a sampler (fragmentColorBuf, shadeTex, and sampler, respectively), all with index 0, are assigned for the fragment shader.

Listing 5-9  Metal Framework: Specifying Resources for a Fragment Function

[renderEnc setFragmentBuffer:fragmentColorBuf offset:0 atIndex:0];
[renderEnc setFragmentTexture:shadeTex atIndex:0];
[renderEnc setFragmentSamplerState:sampler atIndex:0];

In Listing 5-10, the function signature has corresponding arguments with the attribute qualifiers buffer(0), texture(0), and sampler(0), respectively.

Listing 5-10  Metal Shading Language: Fragment Function Arguments Match the Framework Argument Table Indices

fragment float4 metal_frag(VertexOutput in [[stage_in]],
                           const device float4 *fragColorData [[ buffer(0) ]],
                           texture2d<float> shadeTexValues [[ texture(0) ]],
                           sampler samplerValues [[ sampler(0) ]])

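1. Vertex Descriptor for Data Organization

In Metal framework code, each pipeline state can have one MTLVertexDescriptor that describes the organization of the data input to the vertex shader function and shares resource location information between the shading language and framework code.

In Metal shading language code, per-vertex inputs (such as scalars or vectors of integer or floating-point values) can be organized in one struct, which can be passed in a single argument declared with the [[ stage_in ]] attribute qualifier, as seen in the VertexInput struct for the example vertex function vertexMath in Listing 5-11. Each field of the per-vertex input struct has the [[ attribute(index) ]] qualifier, which specifies the index in the vertex attribute argument table.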

Listing 5-11  Metal Shading Language: Vertex Function Inputs with Attribute Indices

struct VertexInput {
    float2    position [[ attribute(0) ]];
    float4    color    [[ attribute(1) ]];
    float2    uv1      [[ attribute(2) ]];
    float2    uv2      [[ attribute(3) ]];
};

struct VertexOutput {
    float4 pos [[ position ]];
    float4 color;
};

vertex VertexOutput vertexMath(VertexInput in [[ stage_in ]])
{
  VertexOutput out;
  out.pos = float4(in.position.x, in.position.y, 0.0, 1.0);

  float sum1 = in.uv1.x + in.uv2.x;
  float sum2 = in.uv1.y + in.uv2.y;
  out.color = in.color + float4(sum1, sum2, 0.0f, 0.0f);
  return out;
}

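To refer to the shader function input with the [[ stage_in ]] qualifier, describe a MTLVertexDescriptor object and then set it as the vertexDescriptor property of the MTLRenderPipelineDescriptor used to create the MTLRenderPipelineState. MTLVertexDescriptor has two properties: attributes and layouts.

The attributes property of MTLVertexDescriptor is a MTLVertexAttributeDescriptorArray object that defines how each vertex attribute is organized in a buffer that is mapped to a vertex function argument. The attributes property can support access to multiple attributes (such as vertex coordinates, surface normals, and texture coordinates) that are interleaved within the same buffer. The order of the members in the shading language code does not have to be preserved in the buffer in the framework code. Each vertex attribute descriptor in the array has the following properties, which give the vertex shader function the information it needs to locate and load the argument data: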

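bufferIndex, which is an index into the buffer argument table that specifies which MTLBuffer is accessed. The buffer argument table is discussed in Specifying Resources for a Render Command Encoder.
format, which specifies how the data should be interpreted in the framework code. If the data type is not an exact type match, it may be converted or expanded. For example, if the shading language type is half4 and the framework format is MTLVertexFormatFloat2, then when the data is used as an argument to the vertex function, it may be converted from float to half and expanded from two to four elements (with 0.0 and 1.0 in the last two elements).
offset, which specifies where the data can be found, measured from the start of a vertex.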
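The expansion rule described for format can be sketched on the CPU. The following is a minimal illustration only, not Metal API: the helper name expand_to_four is hypothetical, it assumes missing components default to (0, 0, 0, 1), and it leaves out the float-to-half conversion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Metal): widen `count` source components
   (1 to 4) to four, filling the missing ones from the defaults (0, 0, 0, 1).
   This mirrors the float2 -> 4-element expansion described in the text. */
static void expand_to_four(const float *src, size_t count, float out[4])
{
    static const float defaults[4] = {0.0f, 0.0f, 0.0f, 1.0f};
    assert(count >= 1 && count <= 4);
    for (size_t i = 0; i < 4; ++i)
        out[i] = (i < count) ? src[i] : defaults[i];
}
```

For a two-component input such as (0.25, 0.5), the result under these assumptions is (0.25, 0.5, 0.0, 1.0).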

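Figure 5-4 illustrates a MTLVertexAttributeDescriptorArray in Metal framework code that implements an interleaved buffer corresponding to the input of the vertex function vertexMath in the shading language code in Listing 5-11.

Figure 5-4 Buffer Organization with Vertex Attribute Descriptors

Listing 5-12 shows the Metal framework code that corresponds to the interleaved buffer shown in Figure 5-4.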

Listing 5-12  Metal Framework: Using a Vertex Descriptor to Access Interleaved Data

id <MTLFunction> vertexFunc = [library newFunctionWithName:@"vertexMath"];            
MTLRenderPipelineDescriptor* pipelineDesc =      
                             [[MTLRenderPipelineDescriptor alloc] init];
MTLVertexDescriptor* vertexDesc = [[MTLVertexDescriptor alloc] init];

vertexDesc.attributes[0].format = MTLVertexFormatFloat2;
vertexDesc.attributes[0].bufferIndex = 0;
vertexDesc.attributes[0].offset = 0;
vertexDesc.attributes[1].format = MTLVertexFormatFloat4;
vertexDesc.attributes[1].bufferIndex = 0;
vertexDesc.attributes[1].offset = 2 * sizeof(float);  // 8 bytes
vertexDesc.attributes[2].format = MTLVertexFormatFloat2;
vertexDesc.attributes[2].bufferIndex = 0;
vertexDesc.attributes[2].offset = 8 * sizeof(float);  // 32 bytes
vertexDesc.attributes[3].format = MTLVertexFormatFloat2;
vertexDesc.attributes[3].bufferIndex = 0;
vertexDesc.attributes[3].offset = 6 * sizeof(float);  // 24 bytes
vertexDesc.layouts[0].stride = 10 * sizeof(float);    // 40 bytes
vertexDesc.layouts[0].stepFunction = MTLVertexStepFunctionPerVertex;

pipelineDesc.vertexDescriptor = vertexDesc;
pipelineDesc.vertexFunction = vertexFunc;

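Each MTLVertexAttributeDescriptor object in the attributes array of the MTLVertexDescriptor object corresponds to the struct member with the same index in VertexInput in the shader function. attributes[1].bufferIndex = 0 specifies the use of the buffer at index 0 in the argument table. (In this example, every MTLVertexAttributeDescriptor has the same bufferIndex, so each refers to the same vertex buffer at index 0 in the argument table.) The offset values specify the location of data within a vertex, so attributes[1].offset = 2 * sizeof(float) places the start of the corresponding data 8 bytes from the start of each vertex. The format values are chosen to match the data types in the shader function, so attributes[1].format = MTLVertexFormatFloat4 specifies the use of four floating-point values.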
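One way to check the offsets and stride in Listing 5-12 is to mirror the interleaved layout with a plain C struct. This is only a sketch: the struct InterleavedVertex and the helper attribute_address are hypothetical names, and it assumes a 4-byte float with no padding between members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU-side vertex matching the layout in Listing 5-12.
   uv2 is stored before uv1, so buffer order differs from shader order. */
typedef struct {
    float position[2];  /* attribute 0, offset 0  */
    float color[4];     /* attribute 1, offset 8  */
    float uv2[2];       /* attribute 3, offset 24 */
    float uv1[2];       /* attribute 2, offset 32 */
} InterleavedVertex;    /* 40 bytes, the stride of layouts[0] */

/* Byte position inside the buffer of one attribute of vertex `index`. */
static size_t attribute_address(size_t index, size_t stride, size_t offset)
{
    assert(stride > 0 && offset < stride);
    return index * stride + offset;
}
```

With these assumptions, offsetof(InterleavedVertex, uv1) is 32 and sizeof(InterleavedVertex) is 40, so the uv1 data of the third vertex (index 2) starts at byte 2 * 40 + 32 = 112.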

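The layouts property of MTLVertexDescriptor is a MTLVertexBufferLayoutDescriptorArray. For each MTLVertexBufferLayoutDescriptor in layouts, the properties specify how vertex and attribute data are fetched from the corresponding MTLBuffer in the argument table when Metal draws primitives. (For more on drawing primitives, see Drawing Geometric Primitives.) The stepFunction property of MTLVertexBufferLayoutDescriptor determines whether attribute data is fetched for every vertex, for some number of instances, or just once. If stepFunction is set to fetch attribute data for some number of instances, the stepRate property of MTLVertexBufferLayoutDescriptor determines how many instances. The stride property specifies the distance, in bytes, between the data of two vertices.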

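Figure 5-5 depicts the MTLVertexBufferLayoutDescriptor that corresponds to the code in Listing 5-12. layouts[0] specifies how vertex data is fetched from the corresponding index 0 in the buffer argument table. layouts[0].stride specifies a distance of 40 bytes between the data of two vertices. The value of layouts[0].stepFunction, MTLVertexStepFunctionPerVertex, specifies that attribute data is fetched for every vertex when drawing. If the value of stepFunction is MTLVertexStepFunctionPerInstance, the stepRate property determines how often attribute data is fetched. For example, if stepRate is 1, data is fetched for every instance; if stepRate is 2, data is fetched for every two instances, and so on.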

Figure 5-5 Buffer Organization with Vertex Buffer Layout Descriptors
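The fetch rule for the three step functions can be written out as simple arithmetic. The sketch below is an interpretation of the description above, not Metal code: the enum and fetch_offset are hypothetical names, and the base vertex and base instance are assumed to be 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three step functions described above. */
typedef enum { STEP_CONSTANT, STEP_PER_VERTEX, STEP_PER_INSTANCE } StepFunction;

/* Byte offset (relative to the buffer binding) at which attribute data is
   fetched for a given vertex and instance. */
static size_t fetch_offset(StepFunction fn, size_t step_rate, size_t stride,
                           size_t vertex_id, size_t instance_id)
{
    switch (fn) {
    case STEP_PER_VERTEX:
        return vertex_id * stride;                  /* new data per vertex */
    case STEP_PER_INSTANCE:
        assert(step_rate > 0);
        return (instance_id / step_rate) * stride;  /* new data every step_rate instances */
    case STEP_CONSTANT:
        break;                                      /* data fetched once */
    }
    return 0;
}
```

With a 40-byte stride, vertex 3 reads from byte 120 in per-vertex mode, while in per-instance mode with stepRate 2, instances 4 and 5 both read from byte 80.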

Performing Fixed-Function Render Command Encoder Operations

Use these MTLRenderCommandEncoder methods to set fixed-function graphics state values:

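setViewport: specifies the region, in screen coordinates, that is the destination for the projection of the virtual 3D world. The viewport is 3D, so it includes depth values; for details, see Working with Viewport and Pixel Coordinate Systems.
setTriangleFillMode: determines whether triangle and triangle strip primitives are rasterized as lines (MTLTriangleFillModeLines) or as filled triangles (MTLTriangleFillModeFill). The default value is MTLTriangleFillModeFill.
setCullMode: and setFrontFacingWinding: are used together to determine if and how culling is applied. You can use culling for hidden surface removal on some geometric models, such as an orientable sphere rendered with filled triangles. (A surface is orientable if its primitives are consistently drawn in either clockwise or counterclockwise order.)
- The value of setFrontFacingWinding: indicates whether a front-facing primitive has its vertices drawn in clockwise (MTLWindingClockwise) or counterclockwise (MTLWindingCounterClockwise) order. The default value is MTLWindingClockwise.
- The value of setCullMode: determines whether culling is performed (MTLCullModeNone disables culling) and which type of primitive is culled (MTLCullModeFront or MTLCullModeBack).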
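How the winding and cull mode combine can be sketched with a signed-area test. This is a simplified CPU illustration, not what the GPU necessarily does: the type and function names are hypothetical, the coordinates are assumed to be 2D with the y axis pointing up (in a y-down system the sign flips), and a degenerate triangle is arbitrarily reported as clockwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the cull mode and winding values. */
typedef enum { CULL_NONE, CULL_FRONT, CULL_BACK } CullMode;
typedef enum { WINDING_CLOCKWISE, WINDING_COUNTER_CLOCKWISE } Winding;

/* Winding of triangle (a, b, c) from twice its signed area (y axis up). */
static Winding triangle_winding(const float a[2], const float b[2],
                                const float c[2])
{
    float twice_area = (b[0] - a[0]) * (c[1] - a[1])
                     - (c[0] - a[0]) * (b[1] - a[1]);
    return twice_area > 0.0f ? WINDING_COUNTER_CLOCKWISE : WINDING_CLOCKWISE;
}

/* A triangle is front-facing when its winding matches the configured
   front-facing winding; the cull mode then selects which side is dropped. */
static bool is_culled(Winding triangle, Winding front_facing, CullMode mode)
{
    bool is_front = (triangle == front_facing);
    switch (mode) {
    case CULL_FRONT: return is_front;
    case CULL_BACK:  return !is_front;
    case CULL_NONE:  break;
    }
    return false;
}
```

With the default front-facing winding (clockwise) and back-face culling, a counterclockwise triangle is treated as back-facing and dropped.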

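Use the following MTLRenderCommandEncoder methods to encode fixed-function state change commands:

setScissorRect: specifies a 2D scissor rectangle. Fragments that lie outside the specified scissor rectangle are discarded.
setDepthStencilState: sets the depth and stencil test state, as described in Depth and Stencil States.
setStencilReferenceValue: specifies the stencil reference value.
setDepthBias:slopeScale:clamp: specifies an adjustment for comparing shadow maps to the depth values output from fragment shaders.
setVisibilityResultMode:offset: determines whether to monitor if any samples pass the depth and stencil tests. If it is set to MTLVisibilityResultModeBoolean and any samples pass the depth and stencil tests, a non-zero value is written to the buffer specified by the visibilityResultBuffer property of MTLRenderPassDescriptor, as described in Creating a Render Pass Descriptor.

You can use this mode to perform occlusion testing. If you draw a bounding box and no samples pass, you can conclude that any objects within that bounding box are occluded and therefore do not need to be rendered.

setBlendColorRed:green:blue:alpha: specifies the constant blend color and alpha values, as detailed in Configuring Blending in a Render Pipeline Attachment Descriptor.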

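1. Working with Viewport and Pixel Coordinate Systems

Metal defines its Normalized Device Coordinate (NDC) system as a 2x2x1 cube with its center at (0, 0, 0.5). The left and bottom of the NDC system, for x and y respectively, are specified as -1. The right and top of the NDC system, for x and y respectively, are specified as +1.

The viewport specifies the transformation from NDC to window coordinates. The Metal viewport is a 3D transformation specified by the setViewport: method of MTLRenderCommandEncoder. The origin of the window coordinates is in the upper-left corner.

In Metal, pixel centers are offset by (0.5, 0.5). For example, the pixel at the origin has its center at (0.5, 0.5), and the center of the adjacent pixel to its right is (1.5, 0.5). The same is true for textures.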
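The viewport transformation can be written out as arithmetic. The sketch below uses the conventional mapping implied by the text (x and y in [-1, 1], z in [0, 1], window origin at the upper left); the struct fields follow the MTLViewport field names, but the types and the function ndc_to_window are hypothetical and the exact formula is an assumption, not taken from the guide.

```c
#include <assert.h>

/* Plain C stand-in with the same field names as MTLViewport. */
typedef struct { double originX, originY, width, height, znear, zfar; } Viewport;
typedef struct { double x, y, z; } Point3;

/* Map a normalized device coordinate to window coordinates. */
static Point3 ndc_to_window(Viewport vp, Point3 ndc)
{
    Point3 w;
    assert(vp.width > 0.0 && vp.height > 0.0);
    w.x = vp.originX + (ndc.x + 1.0) * 0.5 * vp.width;
    /* y flips: NDC +1 is the top, but window y grows downward. */
    w.y = vp.originY + (1.0 - ndc.y) * 0.5 * vp.height;
    w.z = vp.znear + ndc.z * (vp.zfar - vp.znear);
    return w;
}
```

For an 800x600 viewport at the origin with a depth range of 0 to 1, the NDC point (-1, 1, 0) lands on the upper-left corner (0, 0, 0) and the NDC center (0, 0, 0.5) lands on (400, 300, 0.5).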

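2. Performing Depth and Stencil Operations

The depth and stencil operations are fragment operations that you specify as follows:

Specify a custom MTLDepthStencilDescriptor object that contains the settings for the depth/stencil state. Creating a custom MTLDepthStencilDescriptor object may require creating one or two MTLStencilDescriptor objects that apply to front-facing and back-facing primitives.
Create a MTLDepthStencilState object by calling the newDepthStencilStateWithDescriptor: method of MTLDevice with the depth/stencil state descriptor.
To set the depth/stencil state, call the setDepthStencilState: method of MTLRenderCommandEncoder with the MTLDepthStencilState object.
If the stencil test is in use, call setStencilReferenceValue: to specify the stencil reference value.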

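If the depth test is enabled, the render pipeline state must include a depth attachment to support writing the depth value. To perform the stencil test, the render pipeline state must include a stencil attachment. To configure attachments, see Creating and Configuring a Render Pipeline Descriptor.

If you change the depth/stencil state regularly, you may want to reuse the state descriptor object, modifying its property values as needed to create more state objects.

Note: To sample from a depth-format texture within a shader function, implement the sampling operation within the shader without using MTLSamplerState.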

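Use the properties of a MTLDepthStencilDescriptor object as follows to set the depth and stencil state:

To enable writing the depth value to the depth attachment, set depthWriteEnabled to YES.
depthCompareFunction specifies how the depth test is performed. If a fragment's depth value fails the depth test, the fragment is discarded. For example, the commonly used MTLCompareFunctionLess function causes fragment values that are farther from the viewer than the (previously written) pixel depth value to fail the depth test; that is, the fragment is considered occluded by the earlier depth value.
The frontFaceStencil and backFaceStencil properties each specify a separate MTLStencilDescriptor object for front-facing and back-facing primitives. To use the same stencil state for both front-facing and back-facing primitives, assign the same MTLStencilDescriptor to both the frontFaceStencil and backFaceStencil properties. To explicitly disable the stencil test for one or both faces, set the corresponding property to nil (the default value).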
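The interaction between depthCompareFunction and depthWriteEnabled can be illustrated on the CPU. This is a minimal sketch under the assumption that depth values lie in [0, 1] with smaller values nearer the viewer; DepthResult and depth_test_less are hypothetical names, not Metal API.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool passed;         /* false means the fragment is discarded */
    float stored_depth;  /* depth attachment value after the test */
} DepthResult;

/* Depth test in the style of MTLCompareFunctionLess: a fragment passes
   only if it is nearer than the depth already stored for the pixel. */
static DepthResult depth_test_less(float fragment_depth, float stored_depth,
                                   bool depth_write_enabled)
{
    DepthResult r;
    assert(fragment_depth >= 0.0f && fragment_depth <= 1.0f);
    r.passed = fragment_depth < stored_depth;
    /* The stored value changes only when the test passes and writes are on. */
    r.stored_depth = (r.passed && depth_write_enabled) ? fragment_depth
                                                       : stored_depth;
    return r;
}
```

A fragment at depth 0.25 against a stored 0.5 passes and, with writes enabled, replaces the stored value; a fragment at 0.75 fails and leaves 0.5 in place.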

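It is not necessary to explicitly disable a stencil state. Metal determines whether to enable the stencil test based on whether the stencil descriptor is configured for a valid stencil operation.

Listing 5-13 shows an example of creating and using a MTLDepthStencilDescriptor object to create a MTLDepthStencilState object, which is then used with a render command encoder. In this example, the stencil state for front-facing primitives is accessed through the frontFaceStencil property of the depth/stencil state descriptor. The stencil test is explicitly disabled for back-facing primitives.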

Listing 5-13  Creating and Using a Depth/Stencil Descriptor

MTLDepthStencilDescriptor *dsDesc = [[MTLDepthStencilDescriptor alloc] init];
if (dsDesc == nil)
     exit(1);   //  if the descriptor could not be allocated
dsDesc.depthCompareFunction = MTLCompareFunctionLess;
dsDesc.depthWriteEnabled = YES;
 
dsDesc.frontFaceStencil.stencilCompareFunction = MTLCompareFunctionEqual;
dsDesc.frontFaceStencil.stencilFailureOperation = MTLStencilOperationKeep;
dsDesc.frontFaceStencil.depthFailureOperation = MTLStencilOperationIncrementClamp;
dsDesc.frontFaceStencil.depthStencilPassOperation =
                          MTLStencilOperationIncrementClamp;
dsDesc.frontFaceStencil.readMask = 0x1;
dsDesc.frontFaceStencil.writeMask = 0x1;
dsDesc.backFaceStencil = nil;
id <MTLDepthStencilState> dsState = [device
                          newDepthStencilStateWithDescriptor:dsDesc];
 
[renderEnc setDepthStencilState:dsState];
[renderEnc setStencilReferenceValue:0xFF];

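The following properties define a stencil test in the MTLStencilDescriptor:

readMask is a bitmask; the GPU computes the bitwise AND of this mask with both the stencil reference value and the stored stencil value. The stencil test is a comparison between the resulting masked reference value and the masked stored value.
writeMask is a bitmask that restricts which bits of the stencil value the stencil operations write to the stencil attachment.
stencilCompareFunction specifies how the stencil test is performed for fragments. In Listing 5-13, the stencil comparison function is MTLCompareFunctionEqual, so the stencil test passes if the masked reference value is equal to the masked stencil value already stored at the location of the fragment.
stencilFailureOperation, depthFailureOperation, and depthStencilPassOperation specify what to do to the stencil value stored in the stencil attachment for three different test outcomes: if the stencil test fails, if the stencil test passes and the depth test fails, or if both the stencil and depth tests succeed, respectively. In the preceding example, the stencil value is unchanged (MTLStencilOperationKeep) if the stencil test fails, but it is incremented if the stencil test passes, unless the stencil value is already the maximum possible value (MTLStencilOperationIncrementClamp).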
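The masked comparison and the three outcomes can be sketched as follows. This is an illustration only: the type and function names are hypothetical, an 8-bit stencil value is assumed, only the Keep and IncrementClamp operations are modeled, and the way writeMask is applied (only masked bits of the new value replace the stored bits) is an assumption rather than something the guide spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { OP_KEEP, OP_INCREMENT_CLAMP } StencilOp;

typedef struct {
    StencilOp stencil_failure;     /* stencil test fails */
    StencilOp depth_failure;       /* stencil passes, depth fails */
    StencilOp depth_stencil_pass;  /* both tests pass */
    uint8_t read_mask, write_mask;
} StencilState;

/* Equal-style stencil test: compare masked reference and masked stored value. */
static bool stencil_test_equal(const StencilState *s, uint8_t ref, uint8_t stored)
{
    return (ref & s->read_mask) == (stored & s->read_mask);
}

/* Pick the operation for the test outcome, apply it, then apply write_mask. */
static uint8_t update_stencil(const StencilState *s, uint8_t ref,
                              uint8_t stored, bool depth_passed)
{
    StencilOp op;
    if (!stencil_test_equal(s, ref, stored)) op = s->stencil_failure;
    else if (!depth_passed)                  op = s->depth_failure;
    else                                     op = s->depth_stencil_pass;

    uint8_t updated = stored;
    if (op == OP_INCREMENT_CLAMP && stored < UINT8_MAX)
        updated = (uint8_t)(stored + 1);
    return (uint8_t)((stored & ~s->write_mask) | (updated & s->write_mask));
}
```

With the masks from Listing 5-13 (both 0x1) and the reference value 0xFF, only the lowest bit takes part: a stored value of 0x00 fails the test and is kept, while a stored value of 0x01 passes.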

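Drawing Geometric Primitives

After you have established the pipeline state and fixed-function state, you can call the following MTLRenderCommandEncoder methods to draw geometric primitives. These draw methods reference resources (such as buffers that contain vertex coordinates, texture coordinates, surface normals, and other data) to execute the pipeline with the shader functions and other state you have previously established with MTLRenderCommandEncoder.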

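drawPrimitives:vertexStart:vertexCount:instanceCount: renders a number of instances (instanceCount) of primitives using vertex data in contiguous array elements, starting with the first vertex at the array element at index vertexStart and ending at the array element at index vertexStart + vertexCount - 1.
drawPrimitives:vertexStart:vertexCount: is the same as the previous method with an instanceCount of 1.
drawIndexedPrimitives:indexCount:indexType:indexBuffer:indexBufferOffset:instanceCount: renders a number of instances (instanceCount) of primitives using the index list specified in the MTLBuffer object indexBuffer. indexCount determines the number of indices. The index list starts indexBufferOffset bytes into the data in indexBuffer. indexBufferOffset must be a multiple of the size of an index, which is determined by indexType.
drawIndexedPrimitives:indexCount:indexType:indexBuffer:indexBufferOffset: is similar to the previous method with an instanceCount of 1.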
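The two numeric constraints in this list (the last vertex index and the index buffer offset alignment) are easy to check before encoding a draw call. The helpers below are a hypothetical sketch; the enum stands in for the 16-bit and 32-bit index types and is not the Metal definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for 16-bit and 32-bit index types. */
typedef enum { INDEX_UINT16, INDEX_UINT32 } IndexType;

static size_t index_size(IndexType type)
{
    return type == INDEX_UINT16 ? 2 : 4;
}

/* indexBufferOffset must be a multiple of the size of one index. */
static bool index_offset_is_valid(size_t index_buffer_offset, IndexType type)
{
    return index_buffer_offset % index_size(type) == 0;
}

/* Last array element read by a non-indexed draw call. */
static size_t last_vertex(size_t vertex_start, size_t vertex_count)
{
    assert(vertex_count > 0);
    return vertex_start + vertex_count - 1;
}
```

For example, an offset of 6 bytes is acceptable for 16-bit indices but not for 32-bit indices, and a draw with vertexStart 0 and vertexCount 3 reads elements 0 through 2.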

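For each primitive rendering method listed above, the first input value determines the primitive type with one of the MTLPrimitiveType values. The other input values determine which vertices are used to assemble the primitives. For the instanced variants, the instanceCount input value determines how many instances to draw; the variants without it draw a single instance.

As previously discussed, setTriangleFillMode: determines whether triangles are rendered as filled or wireframe, and the setCullMode: and setFrontFacingWinding: settings determine whether the GPU culls triangles during rendering. For more information, see Fixed-Function State Operations.

When rendering a point primitive, the shader language code for the vertex function must provide the [[ point_size ]] attribute; otherwise the point size is undefined.

When rendering a triangle primitive with flat shading, the attributes of the first vertex (also known as the provoking vertex) are used for the whole triangle. The shader language code for the vertex function must provide the [[ flat ]] interpolation qualifier.

For details on all Metal shading language attributes and qualifiers, see the Metal Shading Language Guide.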

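Ending a Rendering Pass

To terminate a rendering pass, call endEncoding on the render command encoder. After ending the previous command encoder, you can create a new command encoder of any type to encode additional commands into the command buffer.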

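Code Example: Drawing a Triangle

The following steps, illustrated in Listing 5-14, describe the basic procedure for rendering a triangle.

1) Create a MTLCommandQueue and use it to create a MTLCommandBuffer.
2) Create a MTLRenderPassDescriptor that specifies a collection of attachments that serve as the destination for the rendering commands encoded in the command buffer.

In this example, only the first color attachment is set up and used. (The variable currentTexture is assumed to contain a MTLTexture that is used for the color attachment.) The MTLRenderPassDescriptor is then used to create a new MTLRenderCommandEncoder.

3) Create two MTLBuffer objects, posBuf and colBuf, and call newBufferWithBytes:length:options: to copy the vertex coordinate and vertex color data, posData and colData respectively, into the buffer storage.
4) Call the setVertexBuffer:offset:atIndex: method of MTLRenderCommandEncoder twice to specify the coordinates and colors.

The atIndex input value of the setVertexBuffer:offset:atIndex: method corresponds to the attribute buffer(atIndex) in the source code of the vertex function.

5) Create a MTLRenderPipelineDescriptor and establish the vertex and fragment functions in the pipeline descriptor:
- Create a MTLLibrary with the source code from progSrc, which is assumed to be a string that contains Metal shader source code.
- Then call the newFunctionWithName: method of MTLLibrary to create the MTLFunction vertFunc, which represents the function called hello_vertex, and the MTLFunction fragFunc, which represents the function called hello_fragment.
- Finally, set the vertexFunction and fragmentFunction properties of the MTLRenderPipelineDescriptor with these MTLFunction objects.
6) Create a MTLRenderPipelineState from the MTLRenderPipelineDescriptor by calling newRenderPipelineStateWithDescriptor:error: or a similar method of MTLDevice. The setRenderPipelineState: method of MTLRenderCommandEncoder then uses the created pipeline state for rendering.
7) Call the drawPrimitives:vertexStart:vertexCount: method of MTLRenderCommandEncoder to append commands that render a filled triangle (type MTLPrimitiveTypeTriangle).
8) Call the endEncoding method to end encoding for this rendering pass, and call the commit method of MTLCommandBuffer to execute the commands on the device.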

Listing 5-14  Metal Code for Drawing a Triangle

id <MTLDevice> device = MTLCreateSystemDefaultDevice();
 
id <MTLCommandQueue> commandQueue = [device newCommandQueue];
id <MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
 
MTLRenderPassDescriptor *renderPassDesc
                               = [MTLRenderPassDescriptor renderPassDescriptor];
renderPassDesc.colorAttachments[0].texture = currentTexture;
renderPassDesc.colorAttachments[0].loadAction = MTLLoadActionClear;
renderPassDesc.colorAttachments[0].clearColor = MTLClearColorMake(0.0,1.0,1.0,1.0);
id <MTLRenderCommandEncoder> renderEncoder =
           [commandBuffer renderCommandEncoderWithDescriptor:renderPassDesc];
 
static const float posData[] = {
        0.0f, 0.33f, 0.0f, 1.f,
        -0.33f, -0.33f, 0.0f, 1.f,
        0.33f, -0.33f, 0.0f, 1.f,
};
static const float colData[] = {
        1.f, 0.f, 0.f, 1.f,
        0.f, 1.f, 0.f, 1.f,
        0.f, 0.f, 1.f, 1.f,
};
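// Copy the vertex positions and colors into buffers with shared storage.
id <MTLBuffer> posBuf = [device newBufferWithBytes:posData
        length:sizeof(posData) options:MTLResourceStorageModeShared];
id <MTLBuffer> colBuf = [device newBufferWithBytes:colData
        length:sizeof(colData) options:MTLResourceStorageModeShared];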
[renderEncoder setVertexBuffer:posBuf offset:0 atIndex:0];
[renderEncoder setVertexBuffer:colBuf offset:0 atIndex:1];
 
NSError *errors;
id <MTLLibrary> library = [device newLibraryWithSource:progSrc options:nil
                           error:&errors];
id <MTLFunction> vertFunc = [library newFunctionWithName:@"hello_vertex"];
id <MTLFunction> fragFunc = [library newFunctionWithName:@"hello_fragment"];
MTLRenderPipelineDescriptor *renderPipelineDesc
                                   = [[MTLRenderPipelineDescriptor alloc] init];
renderPipelineDesc.vertexFunction = vertFunc;
renderPipelineDesc.fragmentFunction = fragFunc;
renderPipelineDesc.colorAttachments[0].pixelFormat = currentTexture.pixelFormat;
id <MTLRenderPipelineState> pipeline = [device
             newRenderPipelineStateWithDescriptor:renderPipelineDesc error:&errors];
[renderEncoder setRenderPipelineState:pipeline];
[renderEncoder drawPrimitives:MTLPrimitiveTypeTriangle
               vertexStart:0 vertexCount:3];
[renderEncoder endEncoding];
[commandBuffer commit];

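In Listing 5-14, a MTLFunction object represents the shader function called hello_vertex. The setVertexBuffer:offset:atIndex: method of MTLRenderCommandEncoder is used to specify the vertex resources (in this case, two buffer objects) that are passed as arguments to hello_vertex. The atIndex input value of the setVertexBuffer:offset:atIndex: method corresponds to the attribute buffer(atIndex) in the source code of the vertex function, as shown in Listing 5-15.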

Listing 5-15  Corresponding Shader Function Declaration

vertex VertexOutput hello_vertex(
                    const device float4 *pos_data [[ buffer(0) ]],
                    const device float4 *color_data [[ buffer(1) ]])
{
    ...
}

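Encoding a Single Rendering Pass Using Multiple Threads

In some cases, your app's performance can be limited by the single-CPU workload of encoding commands for a single rendering pass. However, attempting to circumvent this bottleneck by splitting the workload into multiple rendering passes encoded on multiple CPU threads can also hurt performance, because each rendering pass requires its own intermediate attachment store and load actions to preserve the render target contents.

Instead, use a MTLParallelRenderCommandEncoder object, which manages multiple subordinate MTLRenderCommandEncoder objects that share the same command buffer and render pass descriptor. The parallel render command encoder ensures that the attachment load and store actions occur only at the start and end of the entire rendering pass, not at the start and end of each subordinate render command encoder's set of commands. With this architecture, you can assign each MTLRenderCommandEncoder object to its own thread, in parallel, in a safe and highly performant manner.

To create a parallel render command encoder, use the parallelRenderCommandEncoderWithDescriptor: method of a MTLCommandBuffer object. To create subordinate render command encoders, call the renderCommandEncoder method of the MTLParallelRenderCommandEncoder object once for each CPU thread on which you want to encode commands. All subordinate command encoders created from the same parallel render command encoder encode commands to the same command buffer. Commands are encoded to the command buffer in the order in which the render command encoders were created. To end encoding for a specific render command encoder, call the endEncoding method of MTLRenderCommandEncoder. After you have ended encoding on all render command encoders created by the parallel render command encoder, call the endEncoding method of MTLParallelRenderCommandEncoder to end the rendering pass.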

Listing 5-16 shows a MTLParallelRenderCommandEncoder creating three MTLRenderCommandEncoder objects: rCE1, rCE2, and rCE3.

Listing 5-16  A Parallel Rendering Encoder with Three Render Command Encoders

MTLRenderPassDescriptor *renderPassDesc 
                     = [MTLRenderPassDescriptor renderPassDescriptor];
renderPassDesc.colorAttachments[0].texture = currentTexture;
renderPassDesc.colorAttachments[0].loadAction = MTLLoadActionClear;
renderPassDesc.colorAttachments[0].clearColor = MTLClearColorMake(0.0,0.0,0.0,1.0);

id <MTLParallelRenderCommandEncoder> parallelRCE = [commandBuffer 
                     parallelRenderCommandEncoderWithDescriptor:renderPassDesc];
id <MTLRenderCommandEncoder> rCE1 = [parallelRCE renderCommandEncoder];
id <MTLRenderCommandEncoder> rCE2 = [parallelRCE renderCommandEncoder];
id <MTLRenderCommandEncoder> rCE3 = [parallelRCE renderCommandEncoder];

//  not shown: rCE1, rCE2, and rCE3 call methods to encode graphics commands
//
//  rCE1 commands are processed first, because it was created first
//  even though rCE2 and rCE3 end earlier than rCE1
[rCE2 endEncoding];
[rCE3 endEncoding];
[rCE1 endEncoding];

//  all MTLRenderCommandEncoders must end before MTLParallelRenderCommandEncoder
[parallelRCE endEncoding];

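The order in which the command encoders call endEncoding is not relevant to the order in which commands are encoded and appended to the MTLCommandBuffer. For a MTLParallelRenderCommandEncoder, the MTLCommandBuffer always contains commands in the order in which the subordinate render command encoders were created, as seen in Figure 5-6.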

Figure 5-6 Ordering of Render Command Encoders in a Parallel Rendering Pass
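The ordering rule can be modeled with a toy data structure in which each subordinate encoder owns a slot fixed at creation time. This is a conceptual sketch only and says nothing about how Metal is implemented; ParallelPass, create_encoder, encode, and end_pass are hypothetical names, and commands are represented as short strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENCODERS 8
#define MAX_TEXT 64

typedef struct {
    /* One command list per subordinate encoder, stored in creation order. */
    char commands[MAX_ENCODERS][MAX_TEXT];
    size_t count;
} ParallelPass;

/* Creating an encoder reserves the next slot; the slot index is its order. */
static size_t create_encoder(ParallelPass *pass)
{
    assert(pass->count < MAX_ENCODERS);
    pass->commands[pass->count][0] = '\0';
    return pass->count++;
}

/* Encoding appends to the encoder's own slot, whenever it happens. */
static void encode(ParallelPass *pass, size_t encoder, const char *command)
{
    assert(encoder < pass->count);
    assert(strlen(pass->commands[encoder]) + strlen(command) < MAX_TEXT);
    strcat(pass->commands[encoder], command);
}

/* Ending the pass concatenates the slots in creation order. */
static void end_pass(const ParallelPass *pass, char *out, size_t out_size)
{
    assert(out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < pass->count; ++i) {
        assert(strlen(out) + strlen(pass->commands[i]) < out_size);
        strcat(out, pass->commands[i]);
    }
}
```

If three encoders are created in order and then record "C", "B", and "A" in reverse order, the combined result is still "ABC", because the slot chosen at creation time decides the position.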

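Postscript

This article covered graphics rendering with the render command encoder. If you found it useful, please give it a like or a follow.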