Python Extract and Display Audio Linear-frequency Cepstral Coefficients (LFCCs) Feature – A Step Guide

By | September 9, 2022

Sometimes, we need to extract the audio LFCCs feature to build a deep learning model. In this tutorial, we will show you how to use Python to extract this feature.

Preliminary

We can use the Python library spafe to extract LFCCs. We should install it first.

pip install -i https://mirrors.aliyun.com/pypi/simple/ spafe --trusted-host mirrors.aliyun.com

How to use python spafe to extract LFCCs?

First, we can import some libraries.

from spafe.features.lfcc import lfcc
import spafe.utils.vis as vis
import librosa

In this tutorial, we will use librosa to read audio data. However, you can also use scipy.io.wavfile.read() to read it.

Then we can use the code below to extract LFCCs.

wav_file = r'pop.00084.wav'
# load the audio as mono and resample it to 22050 Hz
data, sr = librosa.load(wav_file, sr=22050, mono=True)
print(data.shape)
# extract 60 LFCCs per frame using 128 linear filters
lfccs = lfcc(data, fs=sr, nfilts=128, num_ceps=60)
print(lfccs.shape)
print(lfccs)
# plot the LFCC matrix
vis.show_features(lfccs, "LFCCs", 'LFCC Coefficient Index', 'Frame Index')

Run this code, and we will get:

The shape of the LFCCs is (3005, 60), which means there are 3005 frames and each frame has a 60-dimensional LFCCs feature.
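To make the (3005, 60) shape more concrete, here is a minimal sketch of how we can index such a matrix. It uses a random NumPy array as a stand-in for the real LFCCs, so the values are not audio features. The 10 ms hop is an assumption (we believe it is the spafe default, win_hop = 0.01), so the duration it gives is only an estimate.

```python
import numpy as np

# Synthetic stand-in with the same shape as the LFCCs above (not real audio features)
rng = np.random.default_rng(0)
lfccs = rng.standard_normal((3005, 60))

# rows are frames, columns are cepstral coefficients
n_frames, n_ceps = lfccs.shape

first_frame = lfccs[0]      # all 60 coefficients of frame 0
coef0_track = lfccs[:, 0]   # coefficient 0 across all frames

# Assumption: 10 ms hop between frames (believed to be the spafe default)
assumed_hop_seconds = 0.01
approx_duration = n_frames * assumed_hop_seconds

print(n_frames, n_ceps)
print(first_frame.shape, coef0_track.shape)
print(round(approx_duration, 2))
```

Under this assumption, 3005 frames correspond to about 30 seconds of audio.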

Meanwhile, we can also use scipy.io.wavfile.read() to read the audio data and extract LFCCs.

For example:

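import scipy.io.wavfile

# read the raw samples and the sample rate stored in the wav file
fs, data = scipy.io.wavfile.read(wav_file)
print(data.shape)
# extract 60 LFCCs per frame using 128 linear filters
lfccs = lfcc(data, fs=fs, nfilts=128, num_ceps=60)
print(lfccs.shape, fs)
print(lfccs)
# plot the LFCC matrix
vis.show_features(lfccs, "LFCCs", 'LFCC Coefficient Index', 'Frame Index')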

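We can see that the shape of the LFCCs is the same, (3005, 60), whether we use librosa.load() or scipy.io.wavfile.read(). However, the values are a little different. One likely reason is that librosa.load() returns floating-point samples scaled to [-1, 1], while scipy.io.wavfile.read() returns the raw sample values stored in the file (for example, 16-bit integers).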

For example:

The difference between LFCCs extracted by using librosa and scipy
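If we want the scipy input to look more like the librosa input, we can try scaling the integer samples to floats before extracting LFCCs. Below is a minimal sketch using only NumPy. The helper name pcm_to_float is hypothetical (it is not part of scipy, librosa or spafe), and we have not verified that it makes the two LFCC results identical.

```python
import numpy as np

def pcm_to_float(data):
    """Scale integer PCM samples to float32 in [-1, 1) and mix to mono.

    Hypothetical helper for illustration; floating-point input is
    returned as float32 without rescaling.
    """
    data = np.asarray(data)
    if data.dtype == np.uint8:
        # 8-bit wav data is unsigned and centred on 128
        out = (data.astype(np.float32) - 128.0) / 128.0
    elif np.issubdtype(data.dtype, np.integer):
        # e.g. int16 samples are divided by 32768
        scale = float(np.iinfo(data.dtype).max) + 1.0
        out = data.astype(np.float32) / scale
    else:
        out = data.astype(np.float32)
    if out.ndim == 2:
        # scipy returns (samples, channels); average the channels to get mono
        out = out.mean(axis=1)
    return out

# small synthetic examples instead of a real wav file
mono = pcm_to_float(np.array([0, 16384, -32768], dtype=np.int16))
stereo = pcm_to_float(np.array([[16384, 0], [-32768, -32768]], dtype=np.int16))
print(mono)
print(stereo)
```

With real audio, we could pass pcm_to_float(data) instead of data to lfcc() in the scipy example above. Note that librosa.load() also resamples the audio when the file's sample rate differs from sr, which this sketch does not do.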
