
Extracting audio file features with python_speech_features

2019-05-17  Author: 早上起来闹钟又丢了

1. Reading a wav file

Use scipy.io.wavfile:

import scipy.io.wavfile as wav
fs, signal = wav.read(filename)

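fs is the sample rate of the wav file in Hz, signal is the file's content as an array of samples, and filename is the path of the audio file to read. Plotting signal gives the waveform shown in the figure below.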


[Figure: image.png, waveform plot of signal]
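The plot described above can be reproduced roughly as follows. This is only a sketch: the helper names time_axis and plot_waveform, the output file name, and the synthetic 440 Hz tone (used here in place of a real file read with wav.read) are all assumptions made for illustration.

```python
import numpy as np


def time_axis(fs, signal):
    """Return the time in seconds of every sample in signal."""
    return np.arange(len(signal)) / fs


def plot_waveform(fs, signal, out_path="waveform.png"):
    """Save a simple amplitude-over-time plot (hypothetical helper)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(10, 3))
    ax.plot(time_axis(fs, signal), signal)
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Amplitude")
    fig.savefig(out_path)
    plt.close(fig)


# Synthetic stand-in for the (fs, signal) pair returned by wav.read:
# one second of a 440 Hz tone stored as 16-bit integers.
fs = 16000
signal = (0.5 * 32767 * np.sin(2 * np.pi * 440 * np.arange(fs) / fs)).astype(np.int16)
t = time_axis(fs, signal)
```

Calling plot_waveform(fs, signal) then writes the figure to waveform.png. With a real recording, fs and signal would come from wav.read(filename) instead of the synthetic tone.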

2. Extracting features with python_speech_features

①MFCC:

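By default, each frame yields a 13-dimensional feature vector. A common practice is to compute the first-order and second-order differences (deltas) of this feature and concatenate them with it, which gives 39 dimensions per frame.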

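from python_speech_features import mfcc, delta, logfbank
import numpy as np

def get_mfcc(data, fs):
    # 13 MFCCs per frame by default
    wav_feature = mfcc(data, fs)
    # The second argument of delta is the number of neighboring frames N, not the order
    d_mfcc_feat = delta(wav_feature, 2)
    # Second-order difference: take the delta of the first-order delta
    d_mfcc_feat2 = delta(d_mfcc_feat, 2)
    # Concatenate along the feature axis: 13 + 13 + 13 = 39 dimensions per frame
    feature = np.hstack((wav_feature, d_mfcc_feat, d_mfcc_feat2))
    return feature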
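To make the difference step more concrete, here is a NumPy-only sketch of the regression-style delta that, as far as I understand, delta(feat, N) computes: each output frame is a weighted slope over the N frames before and after it. The function name delta_sketch and the random stand-in matrix are assumptions for illustration; this is not the library's own code. Under this reading, a second-order difference comes from applying the operation twice rather than from passing N=2.

```python
import numpy as np


def delta_sketch(feat, N):
    """Weighted slope over N frames on each side; edges repeat the first/last frame."""
    if N < 1:
        raise ValueError("N must be an integer >= 1")
    denominator = 2 * sum(n * n for n in range(1, N + 1))
    padded = np.pad(feat, ((N, N), (0, 0)), mode="edge")
    weights = np.arange(-N, N + 1)
    out = np.empty_like(feat, dtype=float)
    for t in range(len(feat)):
        # padded[t : t + 2N + 1] covers original frames t-N .. t+N
        out[t] = weights @ padded[t:t + 2 * N + 1] / denominator
    return out


# Stand-in for a (num_frames, 13) MFCC matrix; real values would come from mfcc(data, fs).
rng = np.random.default_rng(0)
fake_mfcc = rng.standard_normal((99, 13))

d1 = delta_sketch(fake_mfcc, 2)   # first-order difference
d2 = delta_sketch(d1, 2)          # second-order: delta of the delta
stacked = np.hstack((fake_mfcc, d1, d2))

# Sanity check: a straight line with slope 3 has delta 3 away from the edges.
ramp = 3.0 * np.arange(10, dtype=float).reshape(-1, 1)
ramp_delta = delta_sketch(ramp, 2)
```

The stacked matrix has one row per frame and 39 columns, matching the layout that get_mfcc returns.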

Parameters:
Source: 金泽夕
https://www.cnblogs.com/zhuimengzhe/p/10223510.html

mfcc:

python_speech_features.base.mfcc(signal, samplerate=16000, winlen=0.025, winstep=0.01, numcep=13, nfilt=26, nfft=512, lowfreq=0, highfreq=None, preemph=0.97, ceplifter=22, appendEnergy=True, winfunc=<function <lambda>>)
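The winlen and winstep parameters (in seconds) determine how many frames, and therefore how many feature rows, a signal produces. The sketch below estimates that count under the assumption that the last partial frame is zero-padded rather than dropped; frame_count is a hypothetical helper, and the library's exact rounding may differ slightly.

```python
import math


def frame_count(num_samples, samplerate=16000, winlen=0.025, winstep=0.01):
    """Estimate the number of analysis frames for a signal of num_samples samples."""
    frame_len = int(round(winlen * samplerate))    # samples per window
    frame_step = int(round(winstep * samplerate))  # samples between window starts
    if num_samples <= frame_len:
        return 1
    # Assumption: the final partial frame is padded, so round up.
    return 1 + math.ceil((num_samples - frame_len) / frame_step)


# One second of audio at 16 kHz with the default 25 ms window and 10 ms step.
frames_per_second = frame_count(16000)
print(frames_per_second)  # 99
```

With the defaults, one second of 16 kHz audio gives a window of 400 samples and a step of 160 samples, so the feature matrix has about 99 rows.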

delta:

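python_speech_features.base.delta(feat, N)

feat is the feature matrix with one row per frame. N is the number of preceding and following frames used to compute each delta; it is not the order of the difference.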

②logfbank

def get_fbank(data, fs):
    wav_feature = logfbank(data, fs)
    return wav_feature

Parameters:

python_speech_features.base.logfbank(signal, samplerate=16000, winlen=0.025, winstep=0.01, nfilt=26, nfft=512, lowfreq=0, highfreq=None, preemph=0.97)
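To show where nfilt, lowfreq and highfreq enter, the sketch below computes the band edges of a mel filterbank: nfilt triangular filters need nfilt + 2 edge frequencies, spaced evenly on the mel scale between lowfreq and highfreq. It assumes the common formula mel = 2595 * log10(1 + hz / 700) and that highfreq=None means half the sample rate; the function names are hypothetical and this is not the library's implementation.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (common 2595 * log10 formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_edges(nfilt=26, samplerate=16000, lowfreq=0, highfreq=None):
    """Return nfilt + 2 filter edge frequencies in Hz, evenly spaced in mels."""
    highfreq = highfreq or samplerate / 2  # assumption: default is the Nyquist frequency
    low_mel, high_mel = hz_to_mel(lowfreq), hz_to_mel(highfreq)
    step = (high_mel - low_mel) / (nfilt + 1)
    return [mel_to_hz(low_mel + i * step) for i in range(nfilt + 2)]


edges = mel_edges()
```

Each filter i spans edges[i] to edges[i + 2] with its peak at edges[i + 1], so the filters are narrow at low frequencies and wider at high frequencies. logfbank returns the logarithm of the energy in each such band, giving nfilt values per frame.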