Sometimes we need to extract LFCC (linear-frequency cepstral coefficient) features from audio to build a deep learning model. In this tutorial, we will show how to use Python to extract this feature.
Preliminary
We can use the Python package spafe to extract LFCCs. We need to install it first.
pip install -i https://mirrors.aliyun.com/pypi/simple/ spafe --trusted-host mirrors.aliyun.com
How to use python spafe to extract LFCCs?
First, we can import some libraries.
from spafe.features.lfcc import lfcc
import spafe.utils.vis as vis
import librosa
In this tutorial, we will use librosa to read audio data. However, you can also use scipy.io.wavfile.read() to read audio data.
Then we can use the code below to extract LFCCs.
wav_file = r'pop.00084.wav'
# Load the audio as mono, resampled to 22050 Hz
data, sr = librosa.load(wav_file, sr=22050, mono=True)
print(data.shape)
# Extract 60 LFCCs per frame using 128 filters
lfccs = lfcc(data, fs=sr, nfilts=128, num_ceps=60)
print(lfccs.shape)
print(lfccs)
# Plot the coefficients frame by frame
vis.show_features(lfccs, "LFCCs", 'LFCC Coefficient Index', 'Frame Index')
Running this code, we get:
The shape of the LFCCs is (3005, 60), which means the audio is split into 3005 frames and each frame has a 60-dimensional LFCC feature.
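To see where a frame count of this size can come from, here is a minimal sketch of the usual framing arithmetic. It is not spafe's own code: the function name estimate_num_frames is hypothetical, the 25 ms window and 10 ms hop are assumed defaults, the 30-second clip length is an assumption about the audio file, and the exact count may differ between spafe versions.

```python
def estimate_num_frames(num_samples, fs, win_len=0.025, win_hop=0.01):
    """Estimate how many analysis frames fit into a signal.

    win_len and win_hop are given in seconds; the defaults are assumed
    values, not read from spafe.
    """
    frame_length = int(win_len * fs)  # window size in samples
    frame_step = int(win_hop * fs)    # hop size in samples
    if num_samples < frame_length:
        return 0
    # One frame at offset 0, plus one for every full hop that still fits
    return 1 + (num_samples - frame_length) // frame_step


fs = 22050
num_samples = 30 * fs  # hypothetical 30-second clip
num_frames = estimate_num_frames(num_samples, fs)
print(num_frames, "frames x 60 coefficients")  # 3005 frames x 60 coefficients
```

The second dimension does not depend on the audio length: it is set by num_ceps, so only the number of frames changes from file to file.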
We can also use scipy.io.wavfile.read() to read the audio data and extract LFCCs.
For example:
import scipy.io.wavfile
fs, data = scipy.io.wavfile.read(wav_file)
print(data.shape)
lfccs = lfcc(data, fs = fs, nfilts = 128, num_ceps = 60)
print(lfccs.shape, fs)
print(lfccs)
vis.show_features(lfccs, "LFCCs",'LMFCC Coefficient Index','Frame Index')
We can see that the shape of the LFCCs is the same, (3005, 60), whether we use librosa.load() or scipy.io.wavfile.read(). However, the values are a little different.
For example:
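One likely contributor to the difference is sample scaling: librosa.load() returns floating-point samples scaled to roughly [-1, 1], while scipy.io.wavfile.read() returns the raw integer samples for a 16-bit PCM file. The sketch below shows how int16 samples could be rescaled before feature extraction. It assumes the file is 16-bit PCM, the helper name int16_to_float32 is hypothetical, and it has not been checked that this makes the two LFCC outputs match exactly.

```python
import numpy as np


def int16_to_float32(samples):
    """Scale 16-bit PCM samples to float32 values in [-1.0, 1.0)."""
    samples = np.asarray(samples)
    if samples.dtype != np.int16:
        raise TypeError("expected int16 samples, got %s" % samples.dtype)
    # 32768 is the magnitude of the most negative int16 value
    return samples.astype(np.float32) / 32768.0


# Hypothetical raw samples, as scipy.io.wavfile.read() might return them
raw = np.array([-32768, -16384, 0, 16384, 32767], dtype=np.int16)
scaled = int16_to_float32(raw)
print(scaled.dtype, scaled.min(), scaled.max())
```

In the tutorial code, this would mean passing int16_to_float32(data) instead of data to lfcc() after reading the file with scipy.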