Machine Learning for Audio Classification
Machine learning is an AI technique that teaches computers to learn from experience. Machine learning algorithms use computational methods to “learn” information directly from data without relying on a predetermined equation as a model. In this blog post, you will learn about machine learning for audio classification.
Project Overview
A client from the medical sciences works closely with patients who have different types of speech disorders. He wanted to classify these disorder types using machine learning.
Audio features such as Mel-frequency cepstral coefficients (MFCCs) have been used extensively to differentiate between people’s voices. Proglabhelper trained models based on algorithms such as k-nearest neighbors and Support Vector Machine. A bag-of-words approach was used to represent the audio features.
Audio Features Extraction for Audio Classification
Audio files of people with speech disorders were imported and stored in an audio dataset in MATLAB. Features such as MFCCs, pitch, energy entropy, zero-crossing rate (ZCR), spectral rolloff, spectral centroid, and energy were extracted for each audio file and stored in structures.
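To make a few of these features concrete, here is a minimal sketch of frame-level zero-crossing rate, short-time energy, and energy entropy. It is written in plain Python purely for illustration; the project itself used MATLAB, and the function names, frame sizes, and the number of sub-blocks are assumptions, not the scripts described above.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

def energy_entropy(frame, n_blocks=4):
    """Shannon entropy (in bits) of how energy is spread over equal sub-blocks."""
    block_len = len(frame) // n_blocks
    energies = [sum(x * x for x in frame[b * block_len:(b + 1) * block_len])
                for b in range(n_blocks)]
    total = sum(energies)
    if total == 0:
        return 0.0  # a silent frame carries no energy distribution
    probs = [e / total for e in energies]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Synthetic stand-in for a recording: a 100 Hz sine sampled at 8 kHz (assumed values).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(sample_rate)]

# One feature vector per 25 ms frame with a 10 ms hop (assumed framing).
frames = frame_signal(signal, frame_len=200, hop=80)
features = [[zero_crossing_rate(f), short_time_energy(f), energy_entropy(f)]
            for f in frames]
```

Each recording then yields one short feature vector per frame, which is the kind of per-file structure that later steps can summarize or classify.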
AlgorithmMinds programmed automated scripts that accommodate changes in audio data entities, sections, or disorder types in future research.
Machine Learning Algorithms for Audio Classification
k-nearest neighbors & Support Vector Machine
We split the data 80/20, i.e., 80% for training and 20% for testing. The k-nearest neighbors and Support Vector Machine algorithms were used to train the models.
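The split and the k-nearest neighbors step can be sketched as follows. This is a minimal plain-Python illustration, not the MATLAB code used in the project; the toy feature vectors, the labels "disorder_a" and "disorder_b", and the choice of k = 3 are all assumptions made for the example.

```python
import math
import random
from collections import Counter

def train_test_split(features, labels, test_ratio=0.2, seed=0):
    """Shuffle sample indices and hold out test_ratio of them for testing."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([features[i] for i in train_idx], [labels[i] for i in train_idx],
            [features[i] for i in test_idx], [labels[i] for i in test_idx])

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to query (Euclidean)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Share of test samples whose predicted label matches the true label."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_x)

# Hypothetical two-dimensional feature vectors for two made-up classes.
features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4],
            [10.0, 10.1], [10.2, 10.0], [10.1, 10.3], [10.3, 10.2], [10.0, 10.4]]
labels = ["disorder_a"] * 5 + ["disorder_b"] * 5

train_x, train_y, test_x, test_y = train_test_split(features, labels, test_ratio=0.2)
print(accuracy(train_x, train_y, test_x, test_y, k=3))
```

Fixing the shuffle seed keeps the split reproducible, which matters when comparing classifiers on the same held-out 20%.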
These algorithms adaptively improve their performance as the number of samples available for learning increases. Each classifier was cross-validated and its accuracy was measured.
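Cross-validation can be illustrated with a small k-fold sketch. The source does not state how many folds or which validation scheme was used, so the five folds here are an assumption, and a 1-nearest-neighbor classifier stands in for the real models to keep the example self-contained in plain Python.

```python
import math
import random

def nearest_neighbor_predict(train_x, train_y, query):
    """Label of the single closest training sample (1-NN, Euclidean distance)."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]

def k_fold_splits(n_samples, n_folds, seed=0):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train_idx, folds[held_out]

def cross_validate(features, labels, n_folds=5, seed=0):
    """Return the validation accuracy of each fold."""
    scores = []
    for train_idx, val_idx in k_fold_splits(len(features), n_folds, seed):
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        correct = sum(nearest_neighbor_predict(train_x, train_y, features[i]) == labels[i]
                      for i in val_idx)
        scores.append(correct / len(val_idx))
    return scores

# Hypothetical feature vectors and class names, for illustration only.
features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4],
            [10.0, 10.1], [10.2, 10.0], [10.1, 10.3], [10.3, 10.2], [10.0, 10.4]]
labels = ["disorder_a"] * 5 + ["disorder_b"] * 5

fold_scores = cross_validate(features, labels, n_folds=5)
mean_accuracy = sum(fold_scores) / len(fold_scores)
```

Reporting the mean over folds gives a steadier accuracy estimate than a single train/test split, especially with small clinical datasets.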
Bag of Audio Features: XBOW Java Integration
A Java toolkit, XBOW, made by Maximilian Schmitt, was used to generate bag-of-audio-words (BOW) features. BOW reduces the computational time required to train classifiers by reducing the size of the audio feature data.
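The general bag-of-words idea can be sketched in a few lines: each frame-level feature vector is assigned to its nearest codebook entry, and the recording is represented by the histogram of those assignments. This is a plain-Python illustration of the concept only, not the toolkit's implementation; the hand-picked codebook and the function names are assumptions (in practice a codebook is typically learned from training data).

```python
import math

def nearest_codeword(frame, codebook):
    """Index of the codebook vector closest to this frame (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda j: math.dist(frame, codebook[j]))

def bag_of_words(frames, codebook, normalize=True):
    """Histogram of codeword assignments over all frames of one recording."""
    counts = [0] * len(codebook)
    for frame in frames:
        counts[nearest_codeword(frame, codebook)] += 1
    if normalize and frames:
        return [c / len(frames) for c in counts]
    return counts

# Hypothetical two-entry codebook and three frame-level feature vectors.
codebook = [[0.0, 0.0], [5.0, 5.0]]
frames = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]

histogram = bag_of_words(frames, codebook, normalize=False)
```

However many frames a recording has, the result always has one value per codebook entry, which is why this representation shrinks the data handed to the classifier.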
Because this toolkit must be run through Java, we programmed automated, user-friendly scripts. These scripts allow users to execute the Java toolkit directly from MATLAB, saving them time and frustration.
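A wrapper of this kind can be sketched as a function that assembles the Java command line and runs it as a subprocess. The example below is in Python rather than MATLAB and is only an assumption about how such a script might look; the jar path and the arguments are placeholders, since the toolkit's actual command-line options are not given here.

```python
import subprocess

def build_java_command(jar_path, tool_args, java_executable="java"):
    """Assemble the argument list for launching a jar with the Java launcher."""
    return [java_executable, "-jar", str(jar_path), *[str(a) for a in tool_args]]

def run_java_tool(jar_path, tool_args):
    """Run the jar and return its standard output; raises if Java exits with an error."""
    command = build_java_command(jar_path, tool_args)
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout

# Placeholder jar name and arguments: substitute the toolkit's documented options.
command = build_java_command("toolkit.jar", ["ARG1", "ARG2"])
```

Passing the arguments as a list, rather than one shell string, avoids quoting problems with file paths that contain spaces.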