Compute Shader Feature Test

2018-12-20  上午八点

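A Compute Shader runs outside the normal rendering pipeline and performs large amounts of general-purpose computation (GPGPU algorithms). This makes it a natural fit for moving large numbers of mutually independent calculations onto the GPU to reduce the CPU's workload.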

A Compute Shader Example

#pragma kernel FillWithRed

RWTexture2D<float4> res;

[numthreads(1,1,1)]
void FillWithRed (uint3 dtid : SV_DispatchThreadID)
{
    res[dtid.xy] = float4(1,0,0,1);
}

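The above is a simple Compute Shader. In brief: #pragma kernel FillWithRed declares FillWithRed as a kernel, an entry point that can be dispatched from C#; RWTexture2D<float4> res is a texture the shader can both read and write; [numthreads(1,1,1)] sets the number of threads in each thread group; and SV_DispatchThreadID gives each thread its global index across the whole dispatch, which is used here as the pixel coordinate.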

Using Compute Shaders

Storing the Result in a Texture

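By now it is fairly clear what a Compute Shader can do, so let's try it in practice, starting with something simple. The shader in the example above just stores the same float4 value into every pixel without computing anything, which hardly lives up to the name Compute Shader. So here we add a little computation: the Compute Shader sets the colors of a texture, and C# then assigns that texture to a Cube.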

Result (figure): after clicking Get Result, the image computed by the Compute Shader is displayed as the Cube's main texture.

C# part:

m_rt = new RenderTexture(Width, Height, 0, RenderTextureFormat.ARGB32);
m_rt.enableRandomWrite = true;
m_rt.Create();
computeShader.SetTexture(kernelIndex, "ResultTex", m_rt);
// The shader uses the X and Y dimensions as coordinates to read and write the Texture2D's pixels, so thread group counts are needed for X and Y; one thread group in Z is enough //
computeShader.Dispatch(kernelIndex, 32, 32, 1);

Shader part:

#pragma kernel CSMain_Texture
RWTexture2D<float4> ResultTex;
[numthreads(32,32,1)]
void CSMain_Texture (uint3 id : SV_DispatchThreadID)
{
    float r = (id.x > 256 && id.x < 768 && id.y > 256 && id.y < 768) ? 1 : 0;
    float b = 1 - r;
    ResultTex[id.xy] = float4(r, 0, b, 1);
}

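Points to note in the code above: the RenderTexture must have enableRandomWrite set to true before Create() is called so that the shader can write to it; id.xy is a pixel coordinate, not a normalized [0,1] texture coordinate; and with 32 thread groups of 32 threads along each of X and Y, the dispatch covers exactly 1024 x 1024 pixels, which matches the Width and Height used in the complete code.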
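To make the role of SV_DispatchThreadID concrete, here is a minimal CPU-side sketch in plain C# (no Unity required) that walks the same 32 x 32 groups of 32 x 32 threads and applies the kernel's red/blue rule. The class and method names are made up for illustration, and the sequential loop order is an assumption of the sketch: the GPU runs these threads in parallel with no guaranteed order.

```csharp
using System;

public static class DispatchSimulation
{
    // Hypothetical helper: emulates Dispatch(groupsX, groupsY, 1) with
    // [numthreads(threadsX, threadsY, 1)] and returns how many pixels
    // the CSMain_Texture rule would colour red.
    public static int CountRedPixels(int groupsX, int groupsY, int threadsX, int threadsY)
    {
        int red = 0;
        for (int gy = 0; gy < groupsY; gy++)
        for (int gx = 0; gx < groupsX; gx++)
        for (int ty = 0; ty < threadsY; ty++)
        for (int tx = 0; tx < threadsX; tx++)
        {
            // SV_DispatchThreadID = group index * group size + index within the group.
            int idX = gx * threadsX + tx;
            int idY = gy * threadsY + ty;

            // Same condition as the shader: red inside the central square, blue elsewhere.
            bool isRed = idX > 256 && idX < 768 && idY > 256 && idY < 768;
            if (isRed)
            {
                red++;
            }
        }
        return red;
    }

    public static void Main()
    {
        // 32 groups * 32 threads = 1024 ids per axis, one per pixel of a 1024 x 1024 texture.
        Console.WriteLine(CountRedPixels(32, 32, 32, 32)); // 511 * 511 = 261121
    }
}
```

The red square is 511 pixels wide rather than 512 because both comparisons are strict, so ids 257 through 767 qualify.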

Storing the Result in a Buffer

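Compared with storing the result in a Texture, storing it in a Buffer can be more flexible, because a Buffer can hold your own custom structs. The procedure is: define the struct in C#, create a ComputeBuffer whose stride equals the struct's size in bytes, upload the data with SetData, bind it to the kernel with SetBuffer, call Dispatch, and read the results back with GetData.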

We also need to define this data type inside our shader, but HLSL doesn’t have a Matrix4x4 or Vector3 type. However, it does have data types which map to the same memory layout.
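The "same memory layout" point can be checked on the C# side by measuring the struct size. Below is a small sketch in plain C# that runs without Unity; Float3, Float4 and Float4x4 are hypothetical stand-ins for Unity's Vector3, Vector4 and Matrix4x4 (assumed here to be 3, 4 and 16 tightly packed floats), and StrideCheck is an illustrative name, not part of any API.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-ins for UnityEngine.Vector3 / Vector4 / Matrix4x4.
[StructLayout(LayoutKind.Sequential)]
struct Float3 { public float x, y, z; }

[StructLayout(LayoutKind.Sequential)]
struct Float4 { public float x, y, z, w; }

[StructLayout(LayoutKind.Sequential)]
struct Float4x4 { public Float4 c0, c1, c2, c3; }

// Same field order as the DataStruct used in this article.
[StructLayout(LayoutKind.Sequential)]
struct DataStruct
{
    public Float4 pos;
    public Float3 scale;
    public Float4x4 matrix;
}

public static class StrideCheck
{
    // Size in bytes of one element, which is what a buffer stride has to equal.
    public static int StrideOf<T>() where T : struct
    {
        return Marshal.SizeOf<T>();
    }

    public static void Main()
    {
        // (4 + 3 + 16) floats * 4 bytes = 92 bytes per element.
        Console.WriteLine(StrideOf<DataStruct>());
    }
}
```

The measured value agrees with the hand-written sizeof(float) * (4 + 3 + 16), so either form can serve as the stride as long as it is expressed in bytes and only once.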

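With the general flow covered, let's implement it. This test does the following: define 100 objects, build 100 Matrix4x4 matrices (containing position and scale) on the C# side, pass them to the Compute Shader, do the matrix-vector computation in the Compute Shader, then read the results back on the C# side and apply the position and scale to the 100 objects.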

Result (figure): after clicking Get Result, the position and scale of the 100 objects are set.

Sounds cool, huh? Let's do this.

Main C# code:

// Initialize m_dataArr //
InitDataArr();

// Stride is already the struct size in bytes (see the complete code), so it is passed as-is //
m_comBuffer = new ComputeBuffer(m_dataArr.Length, Stride);
m_comBuffer.SetData(m_dataArr);
computeShader.SetBuffer(kernelIndex, "ResultBuffer", m_comBuffer);

// The shader only uses the X dimension as an array index, so only X needs a thread group count; one thread group each in Y and Z is enough //
computeShader.Dispatch(kernelIndex, 32, 1, 1);

// Initialize the data passed to the GPU //
void InitDataArr()
{
    if (m_dataArr == null)
    {
        m_dataArr = new DataStruct[MaxObjectNum];
    }

    const int PosRange = 10;
    for (int i = 0; i < MaxObjectNum; i++)
    {
        m_dataArr[i].pos = new Vector4(0, 0, 0, 1);
        m_dataArr[i].scale = Vector3.one;

        Matrix4x4 matrix = Matrix4x4.identity;

        // Translation //
        matrix.m03 = (Random.value * 2 - 1) * PosRange;
        matrix.m13 = (Random.value * 2 - 1) * PosRange;
        matrix.m23 = (Random.value * 2 - 1) * PosRange;

        // Scale (diagonal entries m00, m11, m22; m33 stays 1) //
        matrix.m00 = Random.value * 2 + 1;              // map [0,1] to [1,3] //
        matrix.m11 = Random.value * 2 + 1;
        matrix.m22 = Random.value * 2 + 1;

        m_dataArr[i].matrix = matrix;
    }
}

Shader code:

#pragma kernel CSMain_Buffer
struct Data
{
    float4 pos;
    float3 scale;
    float4x4 matrix_M;
};
// Bound from C# with computeShader.SetBuffer(kernelIndex, "ResultBuffer", ...)
RWStructuredBuffer<Data> ResultBuffer;
[numthreads(16,1,1)]
void CSMain_Buffer (uint3 id : SV_DispatchThreadID)
{
    // pos starts as (0,0,0,1), so M * pos yields the translation column
    ResultBuffer[id.x].pos = mul(ResultBuffer[id.x].matrix_M, ResultBuffer[id.x].pos);
    // scale starts as (1,1,1), so the upper-left 3x3 of M yields the diagonal scale
    ResultBuffer[id.x].scale = mul((float3x3)ResultBuffer[id.x].matrix_M, ResultBuffer[id.x].scale);
}
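To see why these two multiplications recover a position and a scale, the same arithmetic can be reproduced on the CPU. The following is a sketch in plain C# with hand-rolled helpers (MatrixCheck, MulPoint and MulUpper3x3 are illustrative names, not Unity or HLSL APIs); it assumes a matrix with no rotation, with scale on the diagonal m00, m11, m22 and translation in m03, m13, m23.

```csharp
using System;

public static class MatrixCheck
{
    // m[row, col] indexing mirrors Unity's mRC field names (m03 is row 0, column 3).
    // Computes M * v for a 4-component column vector, like mul(M, v) in the shader.
    public static float[] MulPoint(float[,] m, float[] v)
    {
        var result = new float[4];
        for (int row = 0; row < 4; row++)
        {
            for (int col = 0; col < 4; col++)
            {
                result[row] += m[row, col] * v[col];
            }
        }
        return result;
    }

    // Computes (float3x3)M * v, i.e. uses only the upper-left 3x3 block.
    public static float[] MulUpper3x3(float[,] m, float[] v)
    {
        var result = new float[3];
        for (int row = 0; row < 3; row++)
        {
            for (int col = 0; col < 3; col++)
            {
                result[row] += m[row, col] * v[col];
            }
        }
        return result;
    }

    public static void Main()
    {
        // Scale (2, 3, 1.5) on the diagonal, translation (5, -1, 4) in the last column.
        float[,] m =
        {
            { 2f, 0f, 0f,   5f },
            { 0f, 3f, 0f,  -1f },
            { 0f, 0f, 1.5f, 4f },
            { 0f, 0f, 0f,   1f },
        };

        float[] pos = MulPoint(m, new float[] { 0f, 0f, 0f, 1f });   // translation, with w = 1
        float[] scale = MulUpper3x3(m, new float[] { 1f, 1f, 1f });  // diagonal scale

        Console.WriteLine(string.Join(" ", pos));
        Console.WriteLine(string.Join(" ", scale));
    }
}
```

Because the starting vectors are (0,0,0,1) and (1,1,1), each product simply picks out one column or the diagonal; with a rotation in the matrix the second product would no longer be a pure scale.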

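In C# the number of thread groups is 32 (computeShader.Dispatch(kernelIndex, 32, 1, 1)), and in the shader each group has 16 threads along X ([numthreads(16,1,1)]), so 32 * 16 = 512 threads are launched, while we only have 100 objects. Four threads per group would therefore be enough: 4 * 32 = 128, which is greater than 100, so [numthreads(4,1,1)] would also get the job done.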
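The arithmetic above can also be turned around: keep the group size fixed and compute how many groups are needed. The following is a minimal sketch in plain C# using ceiling division; DispatchMath and GroupCount are hypothetical names chosen for this example, not Unity APIs.

```csharp
using System;

public static class DispatchMath
{
    // Smallest number of thread groups whose combined threads cover itemCount items.
    public static int GroupCount(int itemCount, int threadsPerGroup)
    {
        if (itemCount < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(itemCount));
        }
        if (threadsPerGroup <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(threadsPerGroup));
        }

        // Integer ceiling division: rounds up whenever there is a remainder.
        return (itemCount + threadsPerGroup - 1) / threadsPerGroup;
    }

    public static void Main()
    {
        Console.WriteLine(GroupCount(100, 16));  // 7 groups -> 112 threads for 100 objects
        Console.WriteLine(GroupCount(1024, 32)); // 32 groups -> exactly 1024 threads
    }
}
```

With 7 groups of 16 threads, 12 threads still fall past the end of the 100-element buffer. A common precaution, not used in this article's shader, is to pass the element count to the shader and return early when id.x is not below it.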

Complete Code

Finally, here are the two parts combined.
C# part:

using System;
using UnityEngine;
using Random = UnityEngine.Random;

public class ComputeShaderTest : MonoBehaviour
{
    public ComputeShader computeShader;
    public EMethod method;
    public Transform prefab;

    // KernelName //
    private const string KernelName_Texture = "CSMain_Texture";
    private const string KernelName_Buffer = "CSMain_Buffer";

    // Variables used by method 1 //
    private RenderTexture m_rt;
    private const int Width = 1024;
    private const int Height = 1024;
    private Material m_material;
    private Transform m_object;

    // Variables used by method 2 //
    private const int MaxObjectNum = 100;
    private ComputeBuffer m_comBuffer;
    private DataStruct[] m_dataArr;
    private Transform[] m_objArr;
    private Material[] m_materialArr;

    public enum EMethod : int
    {
        RenderTexture = 0,                              // Method 1: store the computed result in a RenderTexture //
        ComputerBuffer = 1,                             // Method 2: store the computed result in a ComputeBuffer //
    }

    struct DataStruct
    {
        public Vector4 pos;
        public Vector3 scale;
        public Matrix4x4 matrix;
    }

    private const int Stride = sizeof(float) * (4 + 3 + 16);

    void Start()
    {
        switch (method)
        {
            case EMethod.RenderTexture:
                m_object = Instantiate(prefab);
                m_object.position = Vector3.zero;
                m_object.localScale = Vector3.one*5;
                MeshRenderer render = m_object.GetComponent<MeshRenderer>();
                if (render != null)
                {
                    m_material = render.material;
                }
                break;

            case EMethod.ComputerBuffer:
                GameObject parent = new GameObject("Parent");
                parent.transform.position = Vector3.zero;
                // Initialize the object array //
                m_objArr = new Transform[MaxObjectNum];
                for (int i = 0; i < MaxObjectNum; i++)
                {
                    Transform obj = Instantiate(prefab);
                    obj.transform.SetParent(parent.transform);
                    obj.transform.localPosition = Vector3.zero;
                    obj.transform.localScale = Vector3.one;
                    m_objArr[i] = obj;
                }
                break;
        }

        //uint x = 0;
        //uint y = 0;
        //uint z = 0;
        //// Returns the values declared in the shader file, i.e. those in [numthreads(X, X, X)] //
        //computeShader.GetKernelThreadGroupSizes(kernelIndex, out x, out y, out z);
        //Debug.LogFormat("x = {0}, y = {1}, z = {2}", x, y, z);
    }

    void OnGUI()
    {
        if (GUI.Button(new Rect(0, 0, 200, 100), "Dispatch"))
        {
            Dispatch();
        }

        if (GUI.Button(new Rect(200, 0, 200, 100), "Get Result"))
        {
            GetResult();
        }
    }

    void Dispatch()
    {
        if (computeShader == null)
        {
            return;
        }

        int kernelIndex = -1;
        try
        {
            kernelIndex = computeShader.FindKernel(GetKernelName(method));
        }
        catch (Exception error)
        {
            Debug.LogFormat("Error: {0}", error.Message);
            return;
        }

        switch (method)
        {
            case EMethod.RenderTexture:
                if (m_rt != null)
                {
                    Destroy(m_rt);
                    m_rt = null;
                }

                m_rt = new RenderTexture(Width, Height, 0, RenderTextureFormat.ARGB32);
                m_rt.enableRandomWrite = true;
                m_rt.Create();
                computeShader.SetTexture(kernelIndex, "ResultTex", m_rt);
                // The shader uses the X and Y dimensions as coordinates to read and write the Texture2D's pixels, so thread group counts are needed for X and Y; one thread group in Z is enough //
                computeShader.Dispatch(kernelIndex, 32, 32, 1);
                break;

            case EMethod.ComputerBuffer:
                if (m_comBuffer != null)
                {
                    m_comBuffer.Release();
                }

                // Initialize m_dataArr //
                InitDataArr();

                // Stride is already the struct size in bytes, so it is passed as-is //
                m_comBuffer = new ComputeBuffer(m_dataArr.Length, Stride);
                m_comBuffer.SetData(m_dataArr);
                computeShader.SetBuffer(kernelIndex, "ResultBuffer", m_comBuffer);

                // The shader only uses the X dimension as an array index, so only X needs a thread group count; one thread group each in Y and Z is enough //
                computeShader.Dispatch(kernelIndex, 32, 1, 1);
                break;
        }
    }

    void GetResult()
    {
        switch (method)
        {
            case EMethod.RenderTexture:
                //GameUtils.Instance().SaveToPng(m_rt, "test.png");
                m_material.SetTexture("_MainTex", m_rt);
                break;

            case EMethod.ComputerBuffer:
                if (m_comBuffer == null || 
                    m_objArr == null || m_objArr.Length != MaxObjectNum ||
                    m_dataArr == null || m_dataArr.Length != MaxObjectNum)
                {
                    break;
                }

                m_comBuffer.GetData(m_dataArr);

                // Set each object's position and scale from the computed results //
                for (int i = 0; i < MaxObjectNum; i++)
                {
                    m_objArr[i].localPosition = m_dataArr[i].pos;
                    m_objArr[i].localScale = m_dataArr[i].scale;
                }
                break;
        }
    }

    // Initialize the data passed to the GPU //
    void InitDataArr()
    {
        if (m_dataArr == null)
        {
            m_dataArr = new DataStruct[MaxObjectNum];
        }

        const int PosRange = 10;
        for (int i = 0; i < MaxObjectNum; i++)
        {
            m_dataArr[i].pos = new Vector4(0, 0, 0, 1);
            m_dataArr[i].scale = Vector3.one;

            Matrix4x4 matrix = Matrix4x4.identity;

            // Translation //
            matrix.m03 = (Random.value * 2 - 1) * PosRange;
            matrix.m13 = (Random.value * 2 - 1) * PosRange;
            matrix.m23 = (Random.value * 2 - 1) * PosRange;

            // Scale (diagonal entries m00, m11, m22; m33 stays 1) //
            matrix.m00 = Random.value * 2 + 1;              // map [0,1] to [1,3] //
            matrix.m11 = Random.value * 2 + 1;
            matrix.m22 = Random.value * 2 + 1;

            m_dataArr[i].matrix = matrix;
        }
    }

    string GetKernelName(EMethod method)
    {
        string kernelName = "";
        switch (method)
        {
            case EMethod.RenderTexture:
                kernelName = KernelName_Texture;
                break;
            case EMethod.ComputerBuffer:
                kernelName = KernelName_Buffer;
                break;
        }
        return kernelName;
    }

    void OnDisable()
    {
        if (m_comBuffer != null)
        {
            m_comBuffer.Release();
        }
    }
}

Shader part:

// Each #kernel tells which function to compile; you can have many kernels
#pragma kernel CSMain_Texture
#pragma kernel CSMain_Buffer

// Create a RenderTexture with enableRandomWrite flag and set it
// with cs.SetTexture
RWTexture2D<float4> ResultTex;

struct Data
{
    float4 pos;
    float3 scale;
    float4x4 matrix_M;
};

RWStructuredBuffer<Data> ResultBuffer;

[numthreads(32,32,1)]
void CSMain_Texture (uint3 id : SV_DispatchThreadID)
{
    // id.xy is a pixel coordinate in [0, width) x [0, height), not a normalized [0,1] texture coordinate //
    float r = (id.x > 256 && id.x < 768 && id.y > 256 && id.y < 768) ? 1 : 0;
    float b = 1 - r;
    ResultTex[id.xy] = float4(r, 0, b, 1);
    
    // ResultTex[id.xy] = float4(id.x & id.y, (id.x & 15)/15.0, (id.y & 15)/15.0, 0.0);
}

[numthreads(16,1,1)]
void CSMain_Buffer (uint3 id : SV_DispatchThreadID)
{
    ResultBuffer[id.x].pos = mul(ResultBuffer[id.x].matrix_M, ResultBuffer[id.x].pos);
    ResultBuffer[id.x].scale = mul((float3x3)ResultBuffer[id.x].matrix_M, ResultBuffer[id.x].scale);
}

References:
https://docs.unity3d.com/Manual/class-ComputeShader.html
https://docs.unity3d.com/ScriptReference/ComputeBuffer.html
http://kylehalladay.com/blog/tutorial/2014/06/27/Compute-Shaders-Are-Nifty.html
http://blog.sina.com.cn/s/blog_471132920102w97k.html
