AVFoundation-05 Audio Capture and Encoding
Overview
AVFoundation is a framework for working with and creating time-based audiovisual media. It was designed with modern hardware and applications in mind: its design leans heavily on multithreading, takes full advantage of multicore hardware, and makes extensive use of blocks and GCD to move expensive work onto background threads. It automatically provides hardware-accelerated operations so that applications perform as well as possible on most devices, and it is built for 64-bit processors, so it can exploit everything 64-bit hardware offers.
![](https://img.haomeiwen.com/i1860319/547168407c6825bd.png)
Capture Sessions
The central class of the AVFoundation capture stack is AVCaptureSession. A capture session works like a virtual patch bay, connecting input and output resources and managing the flow of data coming from physical capture devices.
self.captureSession = [[AVCaptureSession alloc] init];
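Starting the session with -startRunning is a blocking call that can take some time, so it is best kept off the main thread. A minimal sketch, assuming a dedicated serial queue (the queue name is illustrative):
dispatch_queue_t sessionQueue = dispatch_queue_create("com.example.session.queue", DISPATCH_QUEUE_SERIAL);
dispatch_async(sessionQueue, ^{
if (!self.captureSession.isRunning) {
[self.captureSession startRunning]; //blocking; begins the flow of data from inputs to outputs
}
});
//Later, stop it symmetrically on the same queue
dispatch_async(sessionQueue, ^{
if (self.captureSession.isRunning) {
[self.captureSession stopRunning];
}
});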
Capture Devices
AVCaptureDevice provides an interface to physical capture devices such as cameras and microphones. The most commonly used method is + (nullable AVCaptureDevice *)defaultDeviceWithMediaType:(AVMediaType)mediaType;, which returns the system default device for the given media type.
// Input
AVCaptureDevice *device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeAudio];
NSError *error;
AVCaptureInput *input = [AVCaptureDeviceInput deviceInputWithDevice:device error:&error];
if (!error && [self.captureSession canAddInput:input]) {
[self.captureSession addInput:input];
}
// Output
self.audioDataOutput = [[AVCaptureAudioDataOutput alloc] init];
[self.audioDataOutput setSampleBufferDelegate:self queue:dispatch_get_global_queue(0, 0)];
if ([self.captureSession canAddOutput:self.audioDataOutput]) {
[self.captureSession addOutput:self.audioDataOutput];
}
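Capturing from the microphone also requires user consent: on iOS 10 and later the app's Info.plist must contain an NSMicrophoneUsageDescription entry, and access can be requested up front. A minimal sketch:
[AVCaptureDevice requestAccessForMediaType:AVMediaTypeAudio completionHandler:^(BOOL granted) {
if (!granted) {
NSLog(@"Microphone access was denied");
}
}];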
Capture Inputs and Outputs
Before a capture device can be used, it must be wired in as an input and its data must leave through an output. An AVCaptureDevice generally cannot be added to an AVCaptureSession directly; it has to be wrapped in an AVCaptureDeviceInput. AVCaptureOutput is an abstract base class for delivering the captured data; the framework defines several concrete subclasses, listed below:
Class | Description |
---|---|
AVCaptureStillImageOutput | Still image output; deprecated in iOS 10.0 in favor of AVCapturePhotoOutput |
AVCapturePhotoOutput | Still image output; introduced in iOS 10.0 |
AVCaptureVideoDataOutput | Video sample buffer output |
AVCaptureAudioDataOutput | Audio sample buffer output |
AVCaptureFileOutput | Audio/video file output; its subclasses are AVCaptureMovieFileOutput and AVCaptureAudioFileOutput |
AVCaptureMetadataOutput | Metadata output |
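For example, recording straight to disk goes through a file output rather than a data output. A minimal sketch with AVCaptureMovieFileOutput (outputPath is a placeholder, and the delegate must adopt AVCaptureFileOutputRecordingDelegate):
AVCaptureMovieFileOutput *movieOutput = [[AVCaptureMovieFileOutput alloc] init];
if ([self.captureSession canAddOutput:movieOutput]) {
[self.captureSession addOutput:movieOutput];
}
//Write the captured media to a QuickTime movie file
[movieOutput startRecordingToOutputFileURL:[NSURL fileURLWithPath:outputPath] recordingDelegate:self];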
ADTS
ADTS (Audio Data Transport Stream) is one of the most common transport formats for AAC. The ADTS header is normally 7 bytes long (9 when a CRC is present) and carries useful information such as the sample rate, channel count, and frame length. Once you know the layout, packing AAC into ADTS is straightforward: take the sample rate, channel count, payload length, and AAC profile from the encoder's output format, then prepend an ADTS header to each raw AAC frame. A bit-by-bit description of the header is available at: http://wiki.multimedia.cx/index.php?title=ADTS
ADTS example:
AAAAAAAA AAAABCCD EEFFFFGH HHIJKLMM MMMMMMMM MMMOOOOO OOOOOOPP (QQQQQQQQ QQQQQQQQ)
Letter | Length (bits) | Description |
---|---|---|
A | 12 | syncword 0xFFF, all bits must be 1 |
B | 1 | MPEG Version: 0 for MPEG-4, 1 for MPEG-2 |
C | 2 | Layer: always 0 |
D | 1 | protection absent, Warning, set to 1 if there is no CRC and 0 if there is CRC |
E | 2 | profile, the MPEG-4 Audio Object Type minus 1 |
F | 4 | MPEG-4 Sampling Frequency Index (15 is forbidden) |
G | 1 | private bit, guaranteed never to be used by MPEG, set to 0 when encoding, ignore when decoding |
H | 3 | MPEG-4 Channel Configuration (in the case of 0, the channel configuration is sent via an inband PCE) |
I | 1 | originality, set to 0 when encoding, ignore when decoding |
J | 1 | home, set to 0 when encoding, ignore when decoding |
K | 1 | copyrighted id bit, the next bit of a centrally registered copyright identifier, set to 0 when encoding, ignore when decoding |
L | 1 | copyright id start, signals that this frame's copyright id bit is the first bit of the copyright id, set to 0 when encoding, ignore when decoding |
M | 13 | frame length, this value must include 7 or 9 bytes of header length: FrameLength = (ProtectionAbsent == 1 ? 7 : 9) + size(AACFrame) |
O | 11 | Buffer fullness |
P | 2 | Number of AAC frames (RDBs) in ADTS frame minus 1, for maximum compatibility always use 1 AAC frame per ADTS frame |
Q | 16 | CRC if protection absent is 0 |
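Reading the fields back out of a header is a good way to internalize the table; for instance, frequency index 4 corresponds to 44100 Hz. A minimal parsing sketch (the function name and out-parameters are illustrative, not from any library):
//Parse the fixed fields of a 7-byte ADTS header, following the table above
static BOOL ParseADTSHeader(const uint8_t *h, int *profile, int *freqIdx, int *chanCfg, int *frameLength)
{
if (h[0] != 0xFF || (h[1] & 0xF0) != 0xF0) return NO; //A: syncword 0xFFF
*profile = (h[2] >> 6) & 0x03; //E: MPEG-4 Audio Object Type minus 1
*freqIdx = (h[2] >> 2) & 0x0F; //F: sampling frequency index
*chanCfg = ((h[2] & 0x01) << 2) | ((h[3] >> 6) & 0x03); //H: channel configuration
*frameLength = ((h[3] & 0x03) << 11) | (h[4] << 3) | (h[5] >> 5); //M: header + payload, in bytes
return YES;
}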
The following method prepends an ADTS header to one encoded AAC packet; each field maps to the bit positions in the table above:
- (NSData *)adtsDataForPacketLength:(NSUInteger)packetLength
{
int adtsLength = 7;
char *packet = malloc(sizeof(char) * adtsLength);
// Header field values (see the table above)
int profile = 2; //AAC LC (MPEG-4 Audio Object Type 2; stored below as profile - 1)
int freqIdx = 4; //44.1 kHz (MPEG-4 sampling frequency index)
int chanCfg = 1; //MPEG-4 channel configuration: 1 = mono, front-center
NSUInteger fullLength = adtsLength + packetLength;
// fill in ADTS data
packet[0] = (char)0xFF; // 11111111: first 8 bits of the 0xFFF syncword
packet[1] = (char)0xF9; // 1111 1 00 1: syncword tail, MPEG-2, layer 0, no CRC (protection absent = 1)
packet[2] = (char)(((profile-1)<<6) + (freqIdx<<2) +(chanCfg>>2));
packet[3] = (char)(((chanCfg&3)<<6) + (fullLength>>11));
packet[4] = (char)((fullLength&0x7FF) >> 3);
packet[5] = (char)(((fullLength&7)<<5) + 0x1F);
packet[6] = (char)0xFC;
NSData *data = [NSData dataWithBytesNoCopy:packet length:adtsLength freeWhenDone:YES];
return data;
}
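As a sanity check: with profile = 2 (AAC LC), freqIdx = 4 (44.1 kHz), chanCfg = 1 and a 200-byte AAC payload, fullLength is 207 and the method produces the header bytes FF F9 50 40 19 FF FC. Feeding those seven bytes back through the parsing sketch above recovers profile value 1 (LC), frequency index 4, channel configuration 1, and frame length 207.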
Hardware Encoding
Hardware encoding is provided by the system: dedicated, built-in hardware performs the audio encoding, so the bulk of the computation runs on that hardware. It is fast and light on the CPU, but inflexible; only the specific capabilities the hardware exposes are available.
- Open the encoder.
- (void)setupWithSampleRate:(float)sampleRate
bitsPerChannel:(int)bitsPerChannel
channelCount:(int)channelCount
bitrate:(int)bitrate
{
_audioQueue = dispatch_queue_create("com.qm.audio.queue", NULL);
self.sampleRate = sampleRate;
self.sampleSize = bitsPerChannel;
self.channelCount = channelCount;
self.bitrate = bitrate;
//Create the audio converter (the AAC encoder) by describing the input (LPCM) format
AudioStreamBasicDescription inputAudioDes = {
.mFormatID = kAudioFormatLinearPCM,
.mSampleRate = self.sampleRate,
.mBitsPerChannel = self.sampleSize,
.mFramesPerPacket = 1,//1 frame per packet for LPCM
.mBytesPerFrame = 2,//2 bytes per frame (16-bit mono)
.mBytesPerPacket = 2,//1 frame per packet, so 2 bytes per packet as well
.mChannelsPerFrame = self.channelCount,//channel count; streaming usually uses mono
//For the meaning of these flags, see: http://www.mamicode.com/info-detail-986202.html
.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsNonInterleaved,
.mReserved = 0
};
//Describe the output (AAC) format; only the format ID and channel count are
//set here, and AudioFormatGetProperty fills in the remaining fields
AudioStreamBasicDescription outputAudioDes = {
.mChannelsPerFrame = self.channelCount,
.mFormatID = kAudioFormatMPEG4AAC
};
UInt32 outDesSize = sizeof(outputAudioDes);
AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
0,
NULL,
&outDesSize,
&outputAudioDes);
OSStatus status = AudioConverterNew(&inputAudioDes, &outputAudioDes, &_outAudioConverter);
if (status != noErr) {
NSLog(@"%@", @"Failed to create the hardware AAC converter");
}
//Set the output bitrate
UInt32 aBitrate = self.bitrate;
UInt32 aBitrateSize = sizeof(aBitrate);
status = AudioConverterSetProperty(_outAudioConverter,
kAudioConverterEncodeBitRate,
aBitrateSize,
&aBitrate);
//Query the maximum output packet size
UInt32 aMaxOutput = 0;
UInt32 aMaxOutputSize = sizeof(aMaxOutput);
AudioConverterGetProperty(_outAudioConverter,
kAudioConverterPropertyMaximumOutputPacketSize,
&aMaxOutputSize,
&aMaxOutput);
self.aMaxOutputFrameSize = aMaxOutput;
if (aMaxOutput == 0) {
NSLog(@"%@", @"AAC 获取最大frame size失败");
}
}
- Encode LPCM audio data.
- (void)encodePCMData:(NSData *)pcmData
{
dispatch_async(_audioQueue, ^{
self.curFramePcmData = pcmData;
//Build the output buffer list the converter writes the AAC packet into
AudioBufferList outAudioBufferList = {0};
outAudioBufferList.mNumberBuffers = 1;
outAudioBufferList.mBuffers[0].mNumberChannels = (uint32_t)self.channelCount;
outAudioBufferList.mBuffers[0].mDataByteSize = self.aMaxOutputFrameSize;
outAudioBufferList.mBuffers[0].mData = malloc(self.aMaxOutputFrameSize);
UInt32 outputDataPacketSize = 1;
//Run the encoder; it synchronously calls back aacEncodeInputDataProc to pull the PCM input
OSStatus status = AudioConverterFillComplexBuffer(_outAudioConverter,
aacEncodeInputDataProc,
(__bridge void * _Nullable)(self),
&outputDataPacketSize,
&outAudioBufferList,
NULL);
if (status == noErr) {
//Encoding succeeded; fetch the raw AAC packet
NSData *rawAAC = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData length:outAudioBufferList.mBuffers[0].mDataByteSize];
NSData *adtsData = [self adtsDataForPacketLength:rawAAC.length];
//Complete ADTS frame: header + payload
NSMutableData *resultData = [NSMutableData dataWithBytes:adtsData.bytes length:adtsData.length];
[resultData appendBytes:rawAAC.bytes length:rawAAC.length];
// CallBack
[_delegate didGetEncodedData:resultData error:nil];
//Timestamp increment (ms) = 1000 * samples per AAC frame (1024) / sample rate
// self.timestamp += 1024 * 1000 / self.sampleRate;
//From here the AAC frame could be wrapped in an FLV audio tag and sent to a server
} else {
//Encoding failed
NSLog(@"%@", @"AAC encode error");
}
free(outAudioBufferList.mBuffers[0].mData); //NSData copied the bytes above, so release the scratch buffer
});
}
//Input callback, with the signature AudioConverterFillComplexBuffer requires
static OSStatus aacEncodeInputDataProc(AudioConverterRef inAudioConverter,
UInt32 *ioNumberDataPackets,
AudioBufferList *ioData,
AudioStreamPacketDescription **outDataPacketDescription,
void *inUserData)
{
HwAACEncoder *hwAacEncoder = (__bridge HwAACEncoder *)inUserData;
//Hand the pending PCM buffer to the encoder
if (hwAacEncoder.curFramePcmData) {
ioData->mNumberBuffers = 1;
ioData->mBuffers[0].mData = (void *)hwAacEncoder.curFramePcmData.bytes;
ioData->mBuffers[0].mDataByteSize = (uint32_t)hwAacEncoder.curFramePcmData.length;
ioData->mBuffers[0].mNumberChannels = (uint32_t)hwAacEncoder.channelCount;
//The input format uses 2 bytes per packet, so report how many packets we supplied
*ioNumberDataPackets = (UInt32)(hwAacEncoder.curFramePcmData.length / 2);
//Consume the buffer so it is not fed to the converter twice
hwAacEncoder.curFramePcmData = nil;
return noErr;
}
//No more data available for this call
*ioNumberDataPackets = 0;
return -1;
}
- Release resources.
- (void)destroy
{
AudioConverterDispose(_outAudioConverter);
//AudioConverterRef is a plain C handle, not an Objective-C object, so reset it to NULL
_outAudioConverter = NULL;
self.curFramePcmData = nil;
self.aMaxOutputFrameSize = 0;
}
Software Encoding
Software encoding performs the encoding in an ordinary software library, so the bulk of the computation runs on the CPU. It is flexible, versatile, feature-rich, and extensible, but uses considerably more CPU. Here the LPCM data is encoded with the FAAC library: http://www.audiocoding.com/index.html
- Open the encoder.
- (void)setupWithSampleRate:(int)sampleRate
numChannels:(int)numChannels
pcmBitSize:(int)pcmBitSize
{
_maxOutputBytes = 0;
//faacEncOpen reports how many input samples each faacEncEncode call consumes
//(_inputSamples) and the maximum number of output bytes it may produce
_aacHandle = faacEncOpen(sampleRate, numChannels, &_inputSamples, &_maxOutputBytes);
if (_aacHandle) {
faacEncConfigurationPtr config = faacEncGetCurrentConfiguration(_aacHandle);
config->bitRate = 100000;
_pcmBitSize = pcmBitSize;
switch (_pcmBitSize) {
case 16:
config->inputFormat = FAAC_INPUT_16BIT;
break;
case 24:
config->inputFormat = FAAC_INPUT_24BIT;
break;
case 32:
config->inputFormat = FAAC_INPUT_32BIT;
break;
default:
config->inputFormat = FAAC_INPUT_FLOAT;
break;
}
config->aacObjectType = MAIN;
config->mpegVersion = MPEG2;
config->outputFormat = 0;
config->useTns = 1;
config->allowMidside = 0;
faacEncSetConfiguration(_aacHandle, config);
_maxInputBytes = _inputSamples * _pcmBitSize / 8;
_outputBuffer = malloc(sizeof(char) * _maxOutputBytes);
}
}
- Encode LPCM audio data.
- (void)encodeBuffer:(char *)buffer size:(uint)byteCount
{
memset(_outputBuffer, 0x00, _maxOutputBytes);
//faacEncEncode takes a sample count, so convert the byte count we were given;
//this is normally _maxInputBytes worth of samples except for the final, short buffer
unsigned int samplesInput = byteCount / (_pcmBitSize / 8);
int len = faacEncEncode(_aacHandle,
(int *)buffer,
samplesInput,
_outputBuffer,
(unsigned int)_maxOutputBytes);
if (len > 0) {
NSData *rawAAC = [NSData dataWithBytes:_outputBuffer length:len];
NSData *adtsData = [self adtsDataForPacketLength:rawAAC.length];
// Complete ADTS frame: header + payload
NSMutableData *resultData = [NSMutableData dataWithBytes:adtsData.bytes length:adtsData.length];
[resultData appendBytes:rawAAC.bytes length:rawAAC.length];
[_delegate didGetEncodedData:resultData error:nil];
}
}
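One caveat: faacEncEncode expects at most _inputSamples samples per call, while AVCaptureAudioDataOutput delivers buffers of whatever size the system chooses. A minimal buffering sketch (feedPCMData: and the _pcmCache NSMutableData ivar are illustrative additions, not part of the original encoder):
- (void)feedPCMData:(NSData *)pcmData
{
//_pcmCache: an NSMutableData created during setup (assumed)
[_pcmCache appendData:pcmData];
//Feed the encoder in fixed chunks of exactly _inputSamples samples (_maxInputBytes bytes)
while (_pcmCache.length >= _maxInputBytes) {
[self encodeBuffer:(char *)_pcmCache.bytes size:(uint)_maxInputBytes];
[_pcmCache replaceBytesInRange:NSMakeRange(0, _maxInputBytes) withBytes:NULL length:0];
}
}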
- Release resources.
- (void)destroy
{
faacEncClose(_aacHandle);
free(_outputBuffer);
_aacHandle = NULL;
_outputBuffer = NULL;
}
Using the Encoders
The view controller below encodes the captured LPCM audio with either the software or the hardware encoder and saves the encoded AAC into the sandbox's Documents directory; the resulting file plays directly on a Mac (for example, download the app container through Xcode's Devices window and open out.aac). Remember that when saving to an .aac file, the ADTS header has to be added to each frame by hand.
//
// ViewController.m
// AVFoundation
//
// Created by mac on 17/6/20.
// Copyright © 2017 Qinmin. All rights reserved.
//
#import "ViewController.h"
#import <AVFoundation/AVFoundation.h>
#import "HwAACEncoder.h"
#import "SwAACEncoder.h"
#define kDocumentPath(path) [[NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) firstObject] stringByAppendingPathComponent:path]
@interface ViewController () <AVCaptureAudioDataOutputSampleBufferDelegate, HwAACEncoderDelegate, SwAACEncoderDelegate>
@property (nonatomic, strong) AVCaptureSession *captureSession;
@property (nonatomic, strong) HwAACEncoder *hwAACEncoder;
@property (nonatomic, assign) FILE *fileHandle;
@property (nonatomic, strong) AVCaptureAudioDataOutput *audioDataOutput;
@property (nonatomic, strong) SwAACEncoder *swAACEncoder;
@end
@implementation ViewController
- (void)viewDidLoad
{
[super viewDidLoad];
[self setupSwEncoder]; //swap in setupHwEncoder to exercise the hardware path
[self setupFilehandle];
[self setupSession];
}
- (void)setupSession
{
self.captureSession = [[AVCaptureSession alloc] init];
// Input
AVCaptureDevice *device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeAudio];
NSError *error;
AVCaptureInput *input = [AVCaptureDeviceInput deviceInputWithDevice:device error:&error];
if (!error && [self.captureSession canAddInput:input]) {
[self.captureSession addInput:input];
}
// Output
self.audioDataOutput = [[AVCaptureAudioDataOutput alloc] init];
[self.audioDataOutput setSampleBufferDelegate:self queue:dispatch_get_global_queue(0, 0)];
if ([self.captureSession canAddOutput:self.audioDataOutput]) {
[self.captureSession addOutput:self.audioDataOutput];
}
[self.captureSession startRunning];
}
- (void)setupHwEncoder
{
self.hwAACEncoder = [[HwAACEncoder alloc] init];
[self.hwAACEncoder setupWithSampleRate:44100
bitsPerChannel:16
channelCount:1
bitrate:100000];
self.hwAACEncoder.delegate = self;
}
- (void)setupSwEncoder
{
self.swAACEncoder = [[SwAACEncoder alloc] init];
[self.swAACEncoder setupWithSampleRate:44100 numChannels:1 pcmBitSize:16];
self.swAACEncoder.delegate = self;
}
- (void)setupFilehandle
{
[[NSFileManager defaultManager] removeItemAtPath:kDocumentPath(@"out.aac") error:nil];
_fileHandle = fopen(kDocumentPath(@"out.aac").UTF8String, "ab+");
}
#pragma mark - AVCaptureAudioDataOutputSampleBufferDelegate
- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
fromConnection:(AVCaptureConnection *)connection
{
if (self.captureSession.isRunning) {
if (output == _audioDataOutput) {
//Get the size of the PCM data
NSInteger audioDataSize = CMSampleBufferGetTotalSampleSize(sampleBuffer);
//Allocate a buffer for it
int8_t *audioData = malloc(audioDataSize);
//The CMBlockBufferRef inside the sample buffer holds the PCM bytes
CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
//Copy the bytes into our own buffer
CMBlockBufferCopyDataBytes(dataBuffer, 0, audioDataSize, audioData);
//Wrap as NSData (dataWithBytesNoCopy takes ownership and frees the buffer)
NSData *data = [NSData dataWithBytesNoCopy:audioData length:audioDataSize];
//[self.hwAACEncoder encodePCMData:data];
[self.swAACEncoder encodeBuffer:(char *)data.bytes size:(uint)data.length];
}
}
}
#pragma mark - HwAACEncoderDelegate / SwAACEncoderDelegate
- (void)didGetEncodedData:(NSData *)data error:(NSError *)error
{
if (!error) {
fwrite(data.bytes, 1, data.length, _fileHandle);
}
}
@end
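The example never stops the session or closes the output file; a teardown along these lines (a hypothetical method, called when capture should end) completes the picture:
- (void)stopCapture
{
[self.captureSession stopRunning];
if (_fileHandle) {
fclose(_fileHandle); //flush and close out.aac so the file is complete and playable
_fileHandle = NULL;
}
}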
References
Learning AV Foundation (Chinese edition: AVFoundation开发秘籍:实践掌握iOS & OSX应用的视听处理技术)
Source code: AVFoundation https://github.com/QinminiOS/AVFoundation