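The ADTS header carries information about the AAC stream such as the sample rate, channel count, and frame length. It consists of two parts: a fixed header, which is identical in every frame, and a variable header, whose values change from frame to frame. The header is normally 7 bytes long ((28 + 28) / 8, since each part is 28 bits). If the data is CRC-protected, a 2-byte checksum follows, so the actual header length is either 7 or 9 bytes.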
https://wiki.multimedia.cx/index.php?title=ADTS
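As a small illustration of the 7-or-9-byte rule, the sketch below inspects the first two bytes of a frame. It relies on the ADTS layout in which the 12-bit syncword is followed by the ID bit, two layer bits, and the protection_absent bit. The function name adts_header_length is made up for this example and is not part of any Apple API.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 7 or 9 for a buffer that starts with an ADTS syncword,
 * or 0 if the buffer is too short or the syncword is missing.
 * (Hypothetical helper, for illustration only.) */
size_t adts_header_length(const uint8_t *buf, size_t len)
{
    if (buf == NULL || len < 2) {
        return 0;
    }
    /* Syncword: 12 bits, all ones (0xFFF). */
    if (buf[0] != 0xFF || (buf[1] & 0xF0) != 0xF0) {
        return 0;
    }
    /* protection_absent is the lowest bit of the second byte:
     * 1 = no CRC (7-byte header), 0 = CRC present (9-byte header). */
    return (buf[1] & 0x01) ? 7 : 9;
}
```

A demuxer would typically call this before reading the frame length, so that it knows where the raw AAC payload begins.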
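AAC (Advanced Audio Coding) appeared in 1997 as an audio coding technology based on MPEG-2. It was developed jointly by Fraunhofer IIS, Dolby Laboratories, AT&T, Sony, and other companies, with the goal of replacing the MP3 format. After the MPEG-4 standard appeared in 2000, AAC was updated to integrate its features and gained SBR and PS. To distinguish it from the original MPEG-2 AAC, this version is also called MPEG-4 AAC.
iOS supports AAC encoding, mainly through the AudioConverter API in AudioToolbox. I needed an AAC encoder because I was building an HLS feature: the TS files that HLS requires must carry H.264 video and AAC audio. H.264 can be encoded in hardware or in software, as covered earlier. AAC can likewise be encoded in hardware or in software, and iOS supports both.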
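First, create a converter, which in this case is the AAC encoder, using the following function:
extern OSStatus
AudioConverterNew(const AudioStreamBasicDescription* inSourceFormat,
                  const AudioStreamBasicDescription* inDestinationFormat,
                  AudioConverterRef* outAudioConverter) __OSX_AVAILABLE_STARTING(__MAC_10_1,__IPHONE_2_0);
The two input parameters are the source and destination data formats.
For AAC encoding, the source format is the captured PCM data and the destination format is AAC.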
// Source format: 16-bit signed mono PCM at 44.1 kHz
AudioStreamBasicDescription inAudioStreamBasicDescription = {0};
// See also FillOutASBDForLPCM()
inAudioStreamBasicDescription.mFormatID = kAudioFormatLinearPCM;
inAudioStreamBasicDescription.mSampleRate = 44100;
inAudioStreamBasicDescription.mBitsPerChannel = 16;
inAudioStreamBasicDescription.mFramesPerPacket = 1;
inAudioStreamBasicDescription.mBytesPerFrame = 2;
inAudioStreamBasicDescription.mBytesPerPacket = inAudioStreamBasicDescription.mBytesPerFrame * inAudioStreamBasicDescription.mFramesPerPacket;
inAudioStreamBasicDescription.mChannelsPerFrame = 1;
inAudioStreamBasicDescription.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsNonInterleaved;
inAudioStreamBasicDescription.mReserved = 0;

// Destination format: AAC.
// Always initialize the fields of a new audio stream basic description structure to zero, as shown here: ...
AudioStreamBasicDescription outAudioStreamBasicDescription = {0};
outAudioStreamBasicDescription.mChannelsPerFrame = 1;
outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC;
// Let AudioFormat fill in the remaining fields for this format
UInt32 size = sizeof(outAudioStreamBasicDescription);
AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &outAudioStreamBasicDescription);

OSStatus status = AudioConverterNew(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, &_audioConverter);
if (status != 0) { NSLog(@"setup converter failed: %d", (int)status); }
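This creates the AAC encoder. By default Apple creates a hardware encoder, and falls back to a software encoder if the hardware is unavailable.
In my tests the hardware AAC encoder had very high latency: it needed about 2 seconds of buffered data before it started encoding. The software encoder's latency was normal. It starts encoding as soon as it has been fed 1024 samples.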
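To put the two latency figures side by side, here is a rough back-of-the-envelope sketch. It assumes the 44.1 kHz sample rate used in the setup code and 1024 samples per AAC frame. The helper names are invented for this example.

```c
#include <stdint.h>

/* Duration in milliseconds of `samples` PCM samples (per channel)
 * at `sample_rate` Hz. Hypothetical helper for illustration. */
double samples_to_ms(uint32_t samples, uint32_t sample_rate)
{
    return 1000.0 * (double)samples / (double)sample_rate;
}

/* Number of whole 1024-sample AAC frames contained in `seconds`
 * of audio at `sample_rate` Hz. */
uint32_t aac_frames_in(double seconds, uint32_t sample_rate)
{
    return (uint32_t)(seconds * (double)sample_rate) / 1024u;
}
```

With these assumptions, samples_to_ms(1024, 44100) comes out at roughly 23 ms, while aac_frames_in(2.0, 44100) gives 86, so the buffering observed with the hardware encoder corresponds to dozens of frames.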
So how do you request the software encoder when creating the converter? That requires the following helper:
- (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                           fromManufacturer:(UInt32)manufacturer
{
    static AudioClassDescription desc;
    UInt32 encoderSpecifier = type;
    OSStatus st;
    UInt32 size;
    // Ask how many bytes the list of available encoders occupies
    st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                    sizeof(encoderSpecifier),
                                    &encoderSpecifier,
                                    &size);
    if (st) {
        NSLog(@"error getting audio format property info: %d", (int)(st));
        return NULL;
    }
    unsigned int count = size / sizeof(AudioClassDescription);
    AudioClassDescription descriptions[count];
    // Fetch the list of encoders for this format
    st = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                sizeof(encoderSpecifier),
                                &encoderSpecifier,
                                &size,
                                descriptions);
    if (st) {
        NSLog(@"error getting audio format property: %d", (int)(st));
        return NULL;
    }
    // Pick the encoder that matches both the format and the manufacturer
    for (unsigned int i = 0; i < count; i++) {
        if ((type == descriptions[i].mSubType) &&
            (manufacturer == descriptions[i].mManufacturer)) {
            memcpy(&desc, &(descriptions[i]), sizeof(desc));
            return &desc;
        }
    }
    return NULL;
}

AudioClassDescription *desc = [self getAudioClassDescriptionWithType:kAudioFormatMPEG4AAC
                                                    fromManufacturer:kAppleSoftwareAudioCodecManufacturer];
OSStatus status = AudioConverterNewSpecific(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, 1, desc, &_audioConverter);
To encode correctly, the encoding bit rate must be set. Otherwise encoding fails with error code 560226676 ('!dat').
UInt32 ulBitRate = 64000;
UInt32 ulSize = sizeof(ulBitRate);
status = AudioConverterSetProperty(_audioConverter, kAudioConverterEncodeBitRate, ulSize, &ulBitRate);
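Note that AAC does not support arbitrary bit rates. For example, with a PCM sample rate of 44.1 kHz the bit rate can be set to 64000 bps, and with 16 kHz it can be set to 32000 bps.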
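The error code 560226676 mentioned earlier is easier to recognize as text. Many Core Audio status codes are four-character codes packed into a 32-bit integer, and the sketch below unpacks one. The function name fourcc_to_string is made up for this example. An OSStatus would need to be cast to uint32_t before being passed in.

```c
#include <stdint.h>

/* Writes the four bytes of `code`, most significant first, into `out`
 * followed by a NUL terminator. Bytes that are not printable ASCII
 * are replaced with '?'. Hypothetical helper for illustration. */
void fourcc_to_string(uint32_t code, char out[5])
{
    for (int i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)((code >> (24 - 8 * i)) & 0xFF);
        out[i] = (c >= 0x20 && c <= 0x7E) ? (char)c : '?';
    }
    out[4] = '\0';
}
```

Passing 560226676 (0x21646174) yields the string "!dat", which matches the code quoted in the text. Logging status codes this way makes them easier to look up in the AudioToolbox headers.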
Once the converter has been created and the bit rate set, query the maximum encoded output size, which is needed later.
UInt32 value = 0;
size = sizeof(value);
AudioConverterGetProperty(_audioConverter, kAudioConverterPropertyMaximumOutputPacketSize, &size, &value);
The returned value is the size of the largest packet the encoder can output.
Then call AudioConverterFillComplexBuffer to encode:
AudioBufferList outAudioBufferList = {0};
outAudioBufferList.mNumberBuffers = 1;
outAudioBufferList.mBuffers[0].mNumberChannels = 1;
outAudioBufferList.mBuffers[0].mDataByteSize = value; // value is the size queried above
outAudioBufferList.mBuffers[0].mData = malloc(value); // free() this once the encoded data has been consumed
UInt32 ioOutputDataPacketSize = 1;
status = AudioConverterFillComplexBuffer(_audioConverter, inInputDataProc, (__bridge void *)(self), &ioOutputDataPacketSize, &outAudioBufferList, NULL);
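In this call, inInputDataProc is the input callback that feeds PCM data to the converter. An ioOutputDataPacketSize of 1 means the call returns as soon as one encoded frame has been produced. outAudioBufferList receives the encoded data.
inInputDataProc is implemented as follows: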
static OSStatus inInputDataProc(AudioConverterRef inAudioConverter, UInt32 *ioNumberDataPackets, AudioBufferList *ioData, AudioStreamPacketDescription **outDataPacketDescription, void *inUserData)
{
    AACEncoder *encoder = (__bridge AACEncoder *)(inUserData);
    UInt32 requestedPackets = *ioNumberDataPackets;
    uint8_t *buffer;
    // One PCM packet is one 16-bit mono frame, i.e. 2 bytes
    uint32_t bufferLength = requestedPackets * 2;
    uint32_t bufferRead;
    bufferRead = [encoder.pcmPool readBuffer:&buffer withLength:bufferLength];
    if (bufferRead == 0) {
        // No PCM data available: report zero packets and stop
        *ioNumberDataPackets = 0;
        return -1;
    }
    ioData->mBuffers[0].mData = buffer;
    ioData->mBuffers[0].mDataByteSize = bufferRead;
    ioData->mNumberBuffers = 1;
    ioData->mBuffers[0].mNumberChannels = 1;
    // Convert the byte count back to a packet count (2 bytes per packet)
    *ioNumberDataPackets = bufferRead >> 1;
    return noErr;
}
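pcmPool is a ring buffer that holds PCM data.
Because a capture callback does not necessarily deliver exactly 1024 samples, the data can be buffered and the encoder called once 1024 samples are available.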
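The original pcmPool class is not shown, so the following is only a minimal sketch of the buffering idea in plain C. It assumes 16-bit mono PCM, so one AAC frame needs 1024 * 2 = 2048 bytes. The PcmPool type, its functions, and the capacity of 8 frames are all invented for this example, and its interface differs from the readBuffer:withLength: method used above.

```c
#include <stddef.h>
#include <stdint.h>

#define AAC_FRAME_SAMPLES    1024u  /* samples per channel in one AAC frame */
#define PCM_BYTES_PER_SAMPLE 2u     /* 16-bit mono (assumption) */
#define AAC_FRAME_BYTES      (AAC_FRAME_SAMPLES * PCM_BYTES_PER_SAMPLE)
#define POOL_CAPACITY        (AAC_FRAME_BYTES * 8u)  /* arbitrary size */

/* Hypothetical fixed-size byte ring buffer. */
typedef struct {
    uint8_t data[POOL_CAPACITY];
    size_t head;   /* index of the next byte to read */
    size_t count;  /* number of bytes currently stored */
} PcmPool;

void pcm_pool_init(PcmPool *p)
{
    p->head = 0;
    p->count = 0;
}

/* Appends up to `len` bytes and returns how many were actually stored.
 * Bytes that do not fit are dropped. */
size_t pcm_pool_write(PcmPool *p, const uint8_t *src, size_t len)
{
    size_t space = POOL_CAPACITY - p->count;
    if (len > space) {
        len = space;
    }
    for (size_t i = 0; i < len; i++) {
        p->data[(p->head + p->count + i) % POOL_CAPACITY] = src[i];
    }
    p->count += len;
    return len;
}

/* Copies exactly one frame's worth of PCM into `dst` if enough data is
 * buffered. Returns 1 on success, 0 if fewer than 1024 samples are stored. */
int pcm_pool_read_frame(PcmPool *p, uint8_t dst[AAC_FRAME_BYTES])
{
    if (p->count < AAC_FRAME_BYTES) {
        return 0;
    }
    for (size_t i = 0; i < AAC_FRAME_BYTES; i++) {
        dst[i] = p->data[(p->head + i) % POOL_CAPACITY];
    }
    p->head = (p->head + AAC_FRAME_BYTES) % POOL_CAPACITY;
    p->count -= AAC_FRAME_BYTES;
    return 1;
}
```

The capture callback would call pcm_pool_write with whatever it receives, and the encoding side would encode only when pcm_pool_read_frame returns 1. This sketch is not thread-safe. If capture and encoding run on different threads, a lock or a lock-free design would be needed.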
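In addition, for TS files each AAC frame needs an ADTS header in front of it. The ADTS header is 7 bytes long and describes the encoding parameters of the AAC data, which lets the decoder decode it.
The ADTS header is computed as follows: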
- (NSData*) adtsDataForPacketLength:(NSUInteger)packetLength {
    int adtsLength = 7;
    char *packet = (char *)malloc(sizeof(char) * adtsLength);
    // Variables Recycled by addADTStoPacket
    int profile = 2; // AAC LC
    // 39=MediaCodecInfo.CodecProfileLevel.AACObjectELD
    int freqIdx = 8; // 16 kHz; must match the PCM sample rate (4 would be 44.1 kHz)
    int chanCfg = 1; // MPEG-4 Audio Channel Configuration. 1 Channel front-center
    // The frame length field counts the header plus the AAC payload
    NSUInteger fullLength = adtsLength + packetLength;
    // fill in ADTS data
    packet[0] = (char)0xFF; // 11111111 = syncword (high 8 bits)
    packet[1] = (char)0xF9; // 1111 1 00 1 = syncword (low 4 bits), ID = 1 (MPEG-2), layer = 00, protection_absent = 1 (no CRC)
    packet[2] = (char)(((profile-1)<<6) + (freqIdx<<2) + (chanCfg>>2));
    packet[3] = (char)(((chanCfg&3)<<6) + (fullLength>>11));
    packet[4] = (char)((fullLength&0x7FF) >> 3);
    packet[5] = (char)(((fullLength&7)<<5) + 0x1F);
    packet[6] = (char)0xFC;
    // NSData takes ownership of the malloc'ed buffer and frees it
    NSData *data = [NSData dataWithBytesNoCopy:packet length:adtsLength freeWhenDone:YES];
    return data;
}
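One way to sanity-check a header produced by the method above is to decode it again. The sketch below does that in plain C and also shows how the frequency index relates to the sample rate, using the sampling frequency table from the MPEG-4 Audio specification. The names AdtsInfo, adts_freq_index, and adts_parse are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Sample rates indexed by the ADTS sampling_frequency_index. */
static const uint32_t kAdtsSampleRates[13] = {
    96000, 88200, 64000, 48000, 44100, 32000,
    24000, 22050, 16000, 12000, 11025, 8000, 7350
};

/* Returns the frequency index for `sample_rate`, or -1 if the rate
 * is not in the table. Hypothetical helper for illustration. */
int adts_freq_index(uint32_t sample_rate)
{
    for (int i = 0; i < 13; i++) {
        if (kAdtsSampleRates[i] == sample_rate) {
            return i;
        }
    }
    return -1;
}

/* Fields recovered from a 7-byte ADTS header (hypothetical type). */
typedef struct {
    int object_type;      /* profile field + 1, e.g. 2 = AAC LC */
    int freq_index;       /* index into kAdtsSampleRates */
    int channel_config;   /* 1 = mono, front-center */
    size_t frame_length;  /* header plus payload, in bytes */
} AdtsInfo;

/* Decodes the fields written by the header builder.
 * Returns 0 on success, -1 if the syncword is missing. */
int adts_parse(const uint8_t h[7], AdtsInfo *out)
{
    if (h[0] != 0xFF || (h[1] & 0xF0) != 0xF0) {
        return -1;
    }
    out->object_type = ((h[2] >> 6) & 0x03) + 1;
    out->freq_index = (h[2] >> 2) & 0x0F;
    out->channel_config = ((h[2] & 0x01) << 2) | ((h[3] >> 6) & 0x03);
    out->frame_length = ((size_t)(h[3] & 0x03) << 11)
                      | ((size_t)h[4] << 3)
                      | (size_t)((h[5] >> 5) & 0x07);
    return 0;
}
```

For a 200-byte AAC packet with the values used above (profile 2, freqIdx 8, chanCfg 1), the builder writes the bytes FF F9 60 40 19 FF FC, and adts_parse recovers a frame length of 207. Note that adts_freq_index(44100) is 4, so a stream encoded at 44.1 kHz would need freqIdx = 4 rather than 8.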