RTMP (5): Processing Camera Data

2020-11-03 zcwfeng

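RTMP (1): Screen-Recording Live Streaming, Theory Primer
RTMP (2): Setting Up a Push-Stream Server
RTMP (3): Audio/Video Capture and Data Packetization
RTMP (4): Cross-Compilation and CameraX
RTMP (5): Processing Camera Data
RTMP (6): Audio/Video Encoding and Pushing the Stream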

As covered in the earlier articles, the workflow for live streaming from the camera on Android looks like this:

[Figure: RTMP直播实现流程.png, the RTMP live-streaming workflow]

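On the video side, we first obtain the data captured by the camera as a byte[], encode it, and then packetize and send it. The data CameraX delivers through its image-analysis interface is an ImageProxy (a proxy for Image). So how do we extract the bytes we need from ImageProxy/Image, and what format is that data in?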

ImageProxy/Image

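Image is a complete image buffer provided by the Android SDK; its pixel data may be in YUV, RGB, or other formats. Encoders generally expect I420 as their input format. ImageProxy is an interface defined by CameraX, and every method of Image can also be called through ImageProxy.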

    @Override
    public void analyze(ImageProxy image, int rotationDegrees) {
        int width = image.getWidth();
        int height = image.getHeight();
        // Pixel format: YUV, RGB, ...
        int format = image.getFormat();
        // Image data, one entry per plane
        ImageProxy.PlaneProxy[] planes = image.getPlanes();

        // Extract the frame as a byte[] sized for the configured stream resolution
        byte[] bytes = ImageUtils.getBytes(image, rotationDegrees, rtmpClient.getWidth(), rtmpClient.getHeight());
        try {
            // Write the frame to fos if a stream is open
            if (fos != null)
                fos.write(bytes);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

Calling getPlanes() returns an array of PlaneProxy. PlaneProxy is a proxy for Image.Plane, in the same way that ImageProxy is a proxy for Image.

[Figure: planes0_bytebuffer.png]

The format of the data CameraX hands us is stated in the official documentation: CameraX produces images in YUV_420_888 format.

YUV420

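YUV describes a color space with three components, Y, U, and V: Y carries luma (brightness), while U and V carry chroma (color). (If the U and V data are discarded, or every U and V sample is set to the neutral value 128, the result is a grayscale image. Setting them all to 0 would instead produce a strong green tint.)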
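
To make the role of the chroma components concrete, here is a minimal sketch that converts a single YUV sample to RGB. It assumes full-range BT.601 coefficients (the JFIF convention); a real camera pipeline may use a different matrix or value range, and the class name YuvToRgbSketch is made up for illustration. With neutral chroma (U = V = 128) the pixel comes out gray, while U = V = 0 comes out strongly green.

```java
import java.util.Arrays;

/** Sketch: full-range BT.601 (JFIF-style) YUV to RGB conversion for one pixel. */
final class YuvToRgbSketch {
    private static int clamp(long v) {
        return (int) Math.max(0, Math.min(255, v));
    }

    /** Returns {r, g, b}; y, u and v are unsigned sample values in 0..255. */
    static int[] toRgb(int y, int u, int v) {
        double cb = u - 128; // chroma samples are stored with an offset of 128
        double cr = v - 128;
        int r = clamp(Math.round(y + 1.402 * cr));
        int g = clamp(Math.round(y - 0.344136 * cb - 0.714136 * cr));
        int b = clamp(Math.round(y + 1.772 * cb));
        return new int[] {r, g, b};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toRgb(100, 128, 128))); // [100, 100, 100]
        System.out.println(Arrays.toString(toRgb(100, 0, 0)));     // [0, 235, 0]
    }
}
```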

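In RGB, every pixel has its own R, G, and B values. YUV is divided into variants such as YUV444, YUV422, and YUV420 according to how densely U and V are sampled. In YUV420 every pixel has its own luma value (Y), while each 2x2 block of 4 pixels shares one U value and one V value. For a 4x4 image, for example, YUV420 stores 16 Y values, 4 U values, and 4 V values.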

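YUV420 is further split into several formats that differ only in the order in which the samples are stored; the information they carry is identical. For a 4x4 image, every YUV420 format holds 16 Y values, 4 U values, and 4 V values, and only the arrangement of Y, U, and V changes. I420 is YYYYYYYYYYYYYYYYUUUUVVVV, while NV21 is YYYYYYYYYYYYYYYYVUVUVUVU (V comes first; the U-first interleaving UVUVUVUV is NV12). In other words, YUV420 is a family of formats, and the name alone does not determine the storage order.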

To make this easier to picture, lay the bytes out in rows.

I420, 4x4: total size = 4*4 + (4/2)*(4/2) + (4/2)*(4/2) = 4*4*3/2 = 24 bytes

Y  Y  Y  Y
Y  Y  Y  Y
Y  Y  Y  Y
Y  Y  Y  Y
U  U
U  U
V  V
V  V
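
The same arithmetic can be written as a small helper that computes where each plane starts inside a tightly packed I420 buffer. This is only a sketch: the class I420Layout is hypothetical and assumes even width and height.

```java
/** Sketch: sizes and offsets of the three planes in a tightly packed I420 buffer. */
final class I420Layout {
    final int ySize;
    final int uvSize;
    final int uOffset;
    final int vOffset;
    final int totalSize;

    I420Layout(int width, int height) {
        // Assumes even width and height, so each chroma plane is exactly (w/2) x (h/2).
        ySize = width * height;
        uvSize = (width / 2) * (height / 2);
        uOffset = ySize;            // U follows directly after Y
        vOffset = ySize + uvSize;   // V follows directly after U
        totalSize = ySize + 2 * uvSize;
    }

    public static void main(String[] args) {
        I420Layout l = new I420Layout(4, 4);
        // prints "16 4 16 20 24"
        System.out.println(l.ySize + " " + l.uvSize + " " + l.uOffset + " " + l.vOffset + " " + l.totalSize);
    }
}
```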

NV21

Y  Y  Y  Y
Y  Y  Y  Y
Y  Y  Y  Y
Y  Y  Y  Y
V  U  V  U
V  U  V  U
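
Because the two layouts carry the same samples, converting between them is just a reordering. The sketch below turns a tightly packed NV21 buffer into I420, relying on the fact that NV21 stores V before U in each chroma pair. Nv21ToI420 is a hypothetical helper name, and even width and height are assumed.

```java
/** Sketch: reorder a tightly packed NV21 buffer (Y, then VUVU...) into I420 (Y, U, V). */
final class Nv21ToI420 {
    static byte[] convert(byte[] nv21, int width, int height) {
        int ySize = width * height;
        int uvSize = (width / 2) * (height / 2);
        byte[] i420 = new byte[ySize + 2 * uvSize];

        // The Y plane is identical in both layouts.
        System.arraycopy(nv21, 0, i420, 0, ySize);

        // De-interleave the chroma pairs: each pair is stored as V, U.
        for (int i = 0; i < uvSize; i++) {
            byte v = nv21[ySize + 2 * i];
            byte u = nv21[ySize + 2 * i + 1];
            i420[ySize + i] = u;          // U plane comes first in I420
            i420[ySize + uvSize + i] = v; // V plane comes last
        }
        return i420;
    }
}
```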

PlaneProxy/Plane

The Y, U, and V data are held in three separate Plane objects, which make up the array returned by getPlanes(). A Plane is essentially a wrapper around a ByteBuffer.

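Image guarantees that planes[0] is always Y, planes[1] is always U, and planes[2] is always V. For planes[0], the Y samples are also guaranteed never to be interleaved with U or V data (its pixelStride is always 1), so the Y values can be read out without having to skip any chroma bytes in between.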

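The U and V data, however, can arrive in either of two arrangements:

1. planes[1] = {UUUU...}, planes[2] = {VVVV...};
2. planes[1] = {UVUV...}, planes[2] = {VUVU...}.

So when reading the data we need one more piece of information from the Plane to decide how to pick out the U or V values.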

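    // Distance in bytes between adjacent samples within a row
    // 1: samples are contiguous, i.e. the first case above
    // 2: every other byte must be skipped, i.e. the second case above
    int pixelStride = plane.getPixelStride();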

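This property tells us how the data is stored. To extract a byte[] in I420 format, recall that in YUV420 the Y data is width * height bytes long, while U and V are each (width / 2) * (height / 2) bytes.

A first attempt comes to mind easily: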

    // The Y plane's pixelStride is always 1
    ByteBuffer yBuffer = planes[0].getBuffer(); // Y data
    byte[] u = new byte[image.getWidth() / 2 * image.getHeight() / 2];
    ByteBuffer uBuffer = planes[1].getBuffer();
    int uPixelStride = planes[1].getPixelStride();
    if (uPixelStride == 1) {
        // planes[1] holds only U data
        uBuffer.get(u);
    } else if (uPixelStride == 2) {
        // planes[1] interleaves U with V: keep every other byte
        for (int i = 0; i < u.length; i++) {
            u[i] = uBuffer.get();
            // Drop the next byte. It is actually V data, but V is still read from planes[2].
            if (uBuffer.hasRemaining()) {
                uBuffer.get();
            }
        }
    }

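If you use the code above to obtain I420 data, however, you may be surprised to find that it does not work correctly for every Width and Height (resolution) you configure. What did we overlook, and why does it fail?

So far we have used two methods of Plane, getBuffer and getPixelStride, but there is a third one we have not touched: getRowStride.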

RowStride

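RowStride is the row stride: the number of bytes from the start of one row to the start of the next. For the Y data it may be:

equal to Width;
greater than Width.

Take a 4x4 I420 image as an example. Its data can be viewed as: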

[Figure: y4202.png, a 4x4 I420 image whose RowStride equals Width]

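If RowStride equals Width, reading the Y data directly from planes[0].getBuffer() works without any problem.

But if RowStride is greater than Width, for example a 4x4 I420 image whose rows must be aligned to 8 bytes, the RowStride we get may be 8 rather than 4 (Width). In that case padding bytes that carry no image data are appended to the end of each row: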

[Figure: y4201.png, the same image with a RowStride of 8 and padding at the end of each row]

In this case the Y data is read as follows:

    // Holds the resulting I420 data. Size: y + u + v = width*height + width/2*height/2 + width/2*height/2
    ByteBuffer i420 = ByteBuffer.allocate(image.getWidth() * image.getHeight() * 3 / 2);
    // 3 elements: 0 = Y, 1 = U, 2 = V
    ImageProxy.PlaneProxy[] planes = image.getPlanes();

    /**
     * Y data
     */
    // For the Y plane this value is always 1
    int pixelStride = planes[0].getPixelStride();
    ByteBuffer yBuffer = planes[0].getBuffer();
    int rowStride = planes[0].getRowStride();

    // Padding to discard at the end of each row. The Y RowStride is always >= Width,
    // so the array length can never be negative:
    // 1. rowStride == Width: an empty array, nothing is discarded
    // 2. rowStride > Width: the number of extra bytes in each row
    byte[] skipRow = new byte[rowStride - image.getWidth()];
    byte[] row = new byte[image.getWidth()];
    for (int i = 0; i < image.getHeight(); i++) {
        yBuffer.get(row);
        i420.put(row);
        // Only rows before the last one carry padding. In testing, the last row
        // (which is followed by the U data) has none, so there is nothing to discard.
        if (i < image.getHeight() - 1) {
            yBuffer.get(skipRow);
        }
    }
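
The effect of row padding can be tried out without a camera. The sketch below simulates a 4x4 Y plane with a RowStride of 8 in a plain ByteBuffer and strips the padding. It is not the code above: RowPaddingSketch and stripPadding are hypothetical names, and instead of reading into a skip array it jumps to the start of each row with absolute positioning.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Sketch: remove per-row padding from a plane whose rowStride exceeds its width. */
final class RowPaddingSketch {
    /** Copies width bytes from each row, skipping the padding between rows. */
    static byte[] stripPadding(ByteBuffer plane, int rowStride, int width, int height) {
        byte[] out = new byte[width * height];
        ByteBuffer src = plane.duplicate(); // leave the caller's position untouched
        int start = src.position();
        for (int row = 0; row < height; row++) {
            src.position(start + row * rowStride);
            src.get(out, row * width, width);
        }
        return out;
    }

    public static void main(String[] args) {
        int width = 4, height = 4, rowStride = 8;
        // The last row carries no trailing padding, as described in the text.
        ByteBuffer plane = ByteBuffer.allocate((height - 1) * rowStride + width);
        for (int row = 0; row < height; row++) {
            for (int col = 0; col < width; col++) {
                plane.put(row * rowStride + col, (byte) (row * width + col + 1));
            }
        }
        // prints the 16 valid samples, 1 through 16, with the padding removed
        System.out.println(Arrays.toString(stripPadding(plane, rowStride, width, height)));
    }
}
```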

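For the U and V data, the row stride may be:

equal to Width;
greater than Width;
equal to Width/2;
greater than Width/2.

Equal to Width

This means planes[1] contains not only U data but V data as well, and pixelStride == 2.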

U	V	U	V
U	V	U	V

The V data, planes[2], is then:

V	U	V	U
V	U	V	U

Greater than Width

As with the Y data, byte alignment can make RowStride greater than Width. Just as in the equal-to-Width case, planes[1] contains V data as well as U data, and pixelStride == 2.

U	V	U	V	0	0	0	0

U	V	U	V	(no padding on the last row)

planes[2] is then:

V	U	V	U	0	0	0	0

V	U	V	U	(no padding on the last row)

Equal to Width/2

When the RowStride of the U data equals Width/2, planes[1] contains only U data and pixelStride == 1. planes[1] + planes[2] then look like this:

U	U
U	U
V	V
V	V

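In this case all the U data is contiguous, so planes[1].getBuffer() yields the complete U data directly.

Greater than Width/2

Again planes[1] contains only U data, but as with the Y data there may be padding. Here pixelStride == 1, and planes[1] + planes[2] look like this: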

U	U	0	0	0	0	0	0

U	U	(no padding on the last row)

V	V	0	0	0	0	0	0

V	V	(no padding on the last row)
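
All four U/V cases reduce to one indexing rule: the sample at (row, col) of a plane sits at byte offset row * rowStride + col * pixelStride. Below is a minimal sketch under that assumption, using a plain ByteBuffer in place of a real PlaneProxy. PlaneExtractSketch and copyPlane are hypothetical names, and reading one byte at a time is chosen for clarity rather than speed.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Sketch: extract a tightly packed plane using rowStride and pixelStride. */
final class PlaneExtractSketch {
    static void copyPlane(ByteBuffer buffer, int rowStride, int pixelStride,
                          int planeWidth, int planeHeight, byte[] dst, int dstOffset) {
        int base = buffer.position();
        for (int row = 0; row < planeHeight; row++) {
            for (int col = 0; col < planeWidth; col++) {
                // Absolute get: does not move the buffer position and never touches
                // padding bytes or the other channel's interleaved samples.
                dst[dstOffset + row * planeWidth + col] =
                        buffer.get(base + row * rowStride + col * pixelStride);
            }
        }
    }

    public static void main(String[] args) {
        // Simulated U plane of a 4x4 image: interleaved with V (9), rowStride 8,
        // and the last row ends right after the final U sample.
        ByteBuffer u = ByteBuffer.wrap(new byte[] {1, 9, 2, 9, 0, 0, 0, 0, 3, 9, 4});
        byte[] out = new byte[4];
        copyPlane(u, 8, 2, 2, 2, out, 0);
        System.out.println(Arrays.toString(out)); // [1, 2, 3, 4]
    }
}
```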

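Putting it all together

After obtaining the data captured by the camera, we extract the YUV data by using pixelStride to determine the layout and rowStride to determine whether padding is present. The complete implementation for obtaining the YUV data is: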

    public static byte[] getBytes(ImageProxy image, int rotationDegrees, int width, int height) {
        // Image format
        int format = image.getFormat();
        if (format != ImageFormat.YUV_420_888) {
            throw new IllegalArgumentException("Unsupported image format: " + format);
        }
        // Holds the resulting I420 data. Size: y + u + v = width*height + width/2*height/2 + width/2*height/2
        ByteBuffer i420 = ByteBuffer.allocate(image.getWidth() * image.getHeight() * 3 / 2);
        // 3 elements: 0 = Y, 1 = U, 2 = V
        ImageProxy.PlaneProxy[] planes = image.getPlanes();

        /**
         * Y data
         */
        // For the Y plane this value is always 1
        int pixelStride = planes[0].getPixelStride();
        ByteBuffer yBuffer = planes[0].getBuffer();
        int rowStride = planes[0].getRowStride();

        // 1. rowStride == Width: an empty array
        // 2. rowStride > Width: the number of extra bytes in each row
        byte[] skipRow = new byte[rowStride - image.getWidth()];
        byte[] row = new byte[image.getWidth()];
        for (int i = 0; i < image.getHeight(); i++) {
            yBuffer.get(row);
            i420.put(row);
            // Only rows before the last one carry padding. The last row is followed
            // by the U data and has none, so there is nothing to discard.
            if (i < image.getHeight() - 1) {
                yBuffer.get(skipRow);
            }
        }

        /**
         * U, V
         */
        for (int i = 1; i < 3; i++) {
            ImageProxy.PlaneProxy plane = planes[i];
            pixelStride = plane.getPixelStride();
            rowStride = plane.getRowStride();
            ByteBuffer buffer = plane.getBuffer();

            int uvWidth = image.getWidth() / 2;
            int uvHeight = image.getHeight() / 2;

            // Walk each row one byte at a time
            for (int j = 0; j < uvHeight; j++) {
                for (int k = 0; k < rowStride; k++) {
                    // Last row: it has no trailing padding, so stop before reading past the data
                    if (j == uvHeight - 1) {
                        if (pixelStride == 1) {
                            // U and V are not interleaved; rowStride >= Width/2
                            if (k >= uvWidth) {
                                break;
                            }
                        } else if (pixelStride == 2) {
                            // U and V are interleaved; rowStride >= Width.
                            // The last sample we need sits at index Width - 2, and the
                            // buffer may end right after it, so stop there.
                            if (k >= image.getWidth() - 1) {
                                break;
                            }
                        }
                    }

                    byte b = buffer.get();
                    if (pixelStride == 1) {
                        // Not interleaved: keep the first uvWidth bytes, drop the padding
                        if (k < uvWidth) {
                            i420.put(b);
                        }
                    } else if (pixelStride == 2) {
                        // Interleaved:
                        // 1. bytes at even indices are the U/V samples wanted from this plane
                        // 2. padding bytes are discarded, not stored
                        if (k < image.getWidth() && k % 2 == 0) {
                            i420.put(b);
                        }
                    }
                }
            }
        }

        // ... (elided; the rotation and scaling steps follow in the next section)
        return result;
    }

Rotation and Scaling

Rotating YUV data

[Figure: yuv旋转.png, rotating YUV data]

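Rotation simply means rotating the Y, U, and V planes separately. Both rotation and scaling can be done with existing open-source libraries such as OpenCV or Libyuv. Here we use Libyuv: it is more lightweight, and it is Google's open-source C++ library dedicated to converting, scaling, and rotating image data in various formats.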
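
What "rotating each plane separately" means can be shown in plain Java. This is only an illustrative sketch of a clockwise 90 degree rotation for a tightly packed I420 buffer with even dimensions, not a description of how Libyuv works internally; I420RotateSketch is a hypothetical name.

```java
/** Sketch: rotate a tightly packed I420 frame 90 degrees clockwise, plane by plane. */
final class I420RotateSketch {
    /** Rotates one w x h plane into an h x w plane. */
    private static void rotatePlane90(byte[] src, int srcOff, byte[] dst, int dstOff, int w, int h) {
        for (int row = 0; row < h; row++) {
            for (int col = 0; col < w; col++) {
                // Source column becomes destination row; source rows are read bottom-up.
                dst[dstOff + col * h + (h - 1 - row)] = src[srcOff + row * w + col];
            }
        }
    }

    /** Returns a new buffer holding the rotated frame; its size is height x width. */
    static byte[] rotate90(byte[] i420, int width, int height) {
        int ySize = width * height;
        int cw = width / 2;
        int ch = height / 2;
        int uvSize = cw * ch;
        byte[] out = new byte[ySize + 2 * uvSize];
        rotatePlane90(i420, 0, out, 0, width, height);            // Y
        rotatePlane90(i420, ySize, out, ySize, cw, ch);           // U
        rotatePlane90(i420, ySize + uvSize, out, ySize + uvSize, cw, ch); // V
        return out;
    }
}
```

For a 4x2 frame the result is a 2x4 frame: the first output row of Y is the first source column read from bottom to top.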

For image data from CameraX, once we have extracted I420 from the ImageProxy we still need to rotate it. The required rotation angle is passed to us in the callback: public void analyze(ImageProxy image, int rotationDegrees).

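Also, since we are building a live-streaming pusher, users can configure any resolution, and it will not necessarily match the resolution CameraX delivers. So after rotating the image we scale it to the push resolution the user configured.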
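
For intuition about the scaling step, here is a nearest-neighbour scaler for tightly packed I420 in plain Java. It is an illustrative sketch only (I420ScaleSketch is a hypothetical name, and even dimensions are assumed): it picks the top-left source pixel of each destination cell, so its output need not match any Libyuv filter mode byte for byte.

```java
/** Sketch: nearest-neighbour scaling of a tightly packed I420 frame. */
final class I420ScaleSketch {
    private static void scalePlane(byte[] src, int srcOff, int sw, int sh,
                                   byte[] dst, int dstOff, int dw, int dh) {
        for (int y = 0; y < dh; y++) {
            int sy = y * sh / dh; // source row for this destination row
            for (int x = 0; x < dw; x++) {
                int sx = x * sw / dw; // source column for this destination column
                dst[dstOff + y * dw + x] = src[srcOff + sy * sw + sx];
            }
        }
    }

    static byte[] scale(byte[] i420, int sw, int sh, int dw, int dh) {
        int srcY = sw * sh;
        int srcUv = (sw / 2) * (sh / 2);
        int dstY = dw * dh;
        int dstUv = (dw / 2) * (dh / 2);
        byte[] out = new byte[dstY + 2 * dstUv];
        // Each plane is scaled independently; chroma planes are half size in both axes.
        scalePlane(i420, 0, sw, sh, out, 0, dw, dh);
        scalePlane(i420, srcY, sw / 2, sh / 2, out, dstY, dw / 2, dh / 2);
        scalePlane(i420, srcY + srcUv, sw / 2, sh / 2, out, dstY + dstUv, dw / 2, dh / 2);
        return out;
    }
}
```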

    // ...
    int srcWidth = image.getWidth();
    int srcHeight = image.getHeight();
    // I420
    byte[] result = i420.array();

    if (rotationDegrees == 90 || rotationDegrees == 270) {
        // rotation() writes the rotated frame back into result.
        // TODO: 2020/11/1 result is modified in place to avoid memory churn
        rotation(result, image.getWidth(), image.getHeight(), rotationDegrees);
        // After rotating, width and height are swapped
        srcWidth = image.getHeight();
        srcHeight = image.getWidth();
    }

    if (srcWidth != width || srcHeight != height) {
        // TODO: 2020/11/1 JNI writes into scaleBytes to avoid memory churn
        int scaleSize = width * height * 3 / 2;
        if (scaleBytes == null || scaleBytes.length < scaleSize) {
            scaleBytes = new byte[scaleSize];
        }
        scale(result, scaleBytes, srcWidth, srcHeight, width, height);
        return scaleBytes;
    }

    // ...
    private static native void rotation(byte[] data, int width, int height, int degrees);

    private static native void scale(byte[] src, byte[] dst, int srcWidth, int srcHeight, int dstWidth, int dstHeight);

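First, download the Libyuv source code from the official repository: https://chromium.googlesource.com/libyuv/libyuv . Following what we covered earlier, you can place the entire source tree in Android Studio (AS) and compile it together with the project.
The corresponding rotation and scaling implementations are: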

#include <jni.h>
#include <libyuv.h>
#include <vector>

extern "C"
JNIEXPORT void JNICALL
Java_top_zcwfeng_pusher_ImageUtils_rotation(JNIEnv *env, jclass clazz, jbyteArray data_,
                                            jint width, jint height, jint degrees) {
    jbyte *data = env->GetByteArrayElements(data_, 0);
    uint8_t *src = reinterpret_cast<uint8_t *>(data);
    int ySize = width * height;
    int uSize = (width >> 1) * (height >> 1);
    int size = (ySize * 3) >> 1;
    uint8_t *src_y = src;
    uint8_t *src_u = src + ySize;
    uint8_t *src_v = src + ySize + uSize;

    // Heap-allocated: a full frame is too large to place on the stack safely
    std::vector<uint8_t> dst(size);
    uint8_t *dst_y = dst.data();
    uint8_t *dst_u = dst_y + ySize;
    uint8_t *dst_v = dst_y + ySize + uSize;
    // For 90/270 degrees the destination is height x width, so its strides are based on height
    libyuv::I420Rotate(src_y, width, src_u, width >> 1, src_v, width >> 1,
                       dst_y, height, dst_u, height >> 1, dst_v, height >> 1,
                       width, height, static_cast<libyuv::RotationMode>(degrees));

    // The source was only read, so release it without copying anything back
    env->ReleaseByteArrayElements(data_, data, JNI_ABORT);
    // Overwrite the Java array with the rotated frame
    env->SetByteArrayRegion(data_, 0, size, reinterpret_cast<const jbyte *>(dst.data()));
}

extern "C"
JNIEXPORT void JNICALL
Java_top_zcwfeng_pusher_ImageUtils_scale(JNIEnv *env, jclass clazz, jbyteArray src_,
                                         jbyteArray dst_,
                                         jint src_width, jint src_height, jint dst_width,
                                         jint dst_height) {
    jbyte *data = env->GetByteArrayElements(src_, 0);
    uint8_t *src = reinterpret_cast<uint8_t *>(data);
    int size = (dst_width * dst_height * 3) >> 1;
    // Heap-allocated for the same reason as in rotation()
    std::vector<uint8_t> dst(size);

    // Tightly packed I420: the Y stride is the width, the U/V strides are half of it
    int src_stride_y = src_width;
    int src_stride_u = src_width >> 1;
    int src_stride_v = src_stride_u;

    int dst_stride_y = dst_width;
    int dst_stride_u = dst_width >> 1;
    int dst_stride_v = dst_stride_u;

    // Plane start offsets inside the source buffer
    int src_y_size = src_width * src_height;
    int src_u_size = src_stride_u * (src_height >> 1);
    uint8_t *src_y = src;
    uint8_t *src_u = src + src_y_size;
    uint8_t *src_v = src + src_y_size + src_u_size;

    // Plane start offsets inside the destination buffer
    int dst_y_size = dst_width * dst_height;
    int dst_u_size = dst_stride_u * (dst_height >> 1);
    uint8_t *dst_y = dst.data();
    uint8_t *dst_u = dst_y + dst_y_size;
    uint8_t *dst_v = dst_y + dst_y_size + dst_u_size;

    libyuv::I420Scale(src_y, src_stride_y,
                      src_u, src_stride_u,
                      src_v, src_stride_v,
                      src_width, src_height,
                      dst_y, dst_stride_y,
                      dst_u, dst_stride_u,
                      dst_v, dst_stride_v,
                      dst_width, dst_height,
                      libyuv::FilterMode::kFilterNone
    );

    // The source was only read, so release it without copying anything back
    env->ReleaseByteArrayElements(src_, data, JNI_ABORT);
    env->SetByteArrayRegion(dst_, 0, size, reinterpret_cast<const jbyte *>(dst.data()));
}