Commit c487e2b

Update paper
1 parent c91fef0 commit c487e2b

1 file changed

+8
-2
lines changed

paper/paper.md

Lines changed: 8 additions & 2 deletions
@@ -24,7 +24,7 @@ bibliography: paper.bib

 # Summary

-PyAudioProcessing is a Python based library for processing audio data, forming and extracting numerical features from audio and further building and testing machine learning models. This library allows you to extract features such as MFCC, GFCC, spectral features, chroma features and other beat based and cepstrum based features from audio to use with one's own classification backend or popular scikit-learn classifiers.
+PyAudioProcessing is a Python based library for processing audio data, forming and extracting numerical features from audio and further building and testing machine learning models. This library allows you to extract features such as MFCC, GFCC, spectral features, chroma features and other beat based and cepstrum based features from audio to use with one's own classification backend or popular scikit-learn classifiers. This software contributes to the available open-source software by enabling users to pair a Python based machine learning backend with well-researched audio features such as GFCC, which are actively studied for many applications but have not been available in Python because that research has focused primarily on MATLAB. It aims to support research efforts that use GFCC and other researched audio features from Python.

 # Statement of need

@@ -34,7 +34,13 @@ The library lets the user extract aggregated data features calculated per audio

 Some other popular libraries for the domain of audio processing include librosa [@mcfee2015librosa] and pyAudioAnalysis [@giannakopoulos2015pyaudioanalysis]. Librosa is a python package for music and audio analysis. It provides the building blocks necessary to create music information retrieval systems. PyAudioAnalysis is a python library for audio feature extraction, classification, segmentation and applications. It allows the user to train scikit-learn models for mfcc, spectral and chroma features.

-PyAudioProcessing adds multiple additional features. The library includes the implementation of GFCC features converted from MATLAB code to allow users to leverage features for speech classification and speaker identification tasks in addition to MFCC and spectral features that are useful for music and other audio classification tasks. It allows the user to choose from the different feature options and use single or combinations of different audio features. The features can be run through a variety of scikit-learn models including a grid search for best model and Hyperparameters, along with a final confusion matrix and cross validation performance statistics. It further allows for saving and exporting the different audio features per audio file for the user to be able to leverage those while using a different custom classifier backend that is not a part of scikit-learn's models.
+PyAudioProcessing adds multiple additional features. The library includes the implementation of GFCC features converted from MATLAB based research to allow users to leverage Python with features for speech classification and speaker identification tasks in addition to MFCC and spectral features that are useful for music and other audio classification tasks. It allows the user to choose from the different feature options and use single or combinations of different audio features. The features can be run through a variety of scikit-learn models including a grid search for best model and hyperparameters, along with a final confusion matrix and cross validation performance statistics. It further allows for saving and exporting the different audio features per audio file for the user to be able to leverage those while using a different custom classifier backend that is not a part of scikit-learn's models.
+
+The library further provides some pre-built audio classification models, such as a `speechVSmusic` classifier and a `music genre` classifier, to give users a baseline of pre-trained models for common audio classification tasks. The user can use the library to build custom classifiers with the help of the instructions in the README.
+
+The use of this software in the community today motivates its continued development. It is referenced in a textbook titled `Artificial Intelligence with Python Cookbook`, published by Packt Publishing in October 2020 [@packt]. Additionally, pyAudioProcessing is part of a specific admissions requirement for a funded PhD project at the University of Portsmouth <sup id="portsmouth">[1](#footnote_portsmouth)</sup>. It is further referenced in a thesis titled "Master Thesis AI Methodologies for Processing Acoustic Signals AI Usage for Processing Acoustic Signals" [@phdthesis].
+
+<b id="footnote_portsmouth">1</b> https://www.port.ac.uk/study/postgraduate-research/research-degrees/phd/explore-our-projects/detection-of-emotional-states-from-speech-and-text [](#portsmouth)

 # Audio features

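Reviewer note: the changed paragraph mentions cross validation performance statistics and a final confusion matrix. The sketch below illustrates what those two outputs contain. It is not PyAudioProcessing or scikit-learn code: the feature vectors are synthetic stand-ins for extracted audio features, the `speech` and `music` labels are placeholders, and the nearest-centroid classifier and all function names are assumptions chosen to keep the example dependency-free.

```python
import random


def make_synthetic_features(n_per_class=30, dim=4, seed=0):
    """Return (features, labels) for two well-separated synthetic classes.

    These vectors only stand in for per-file audio features; they are
    not produced by any audio library.
    """
    rng = random.Random(seed)
    features, labels = [], []
    for label, center in (("speech", 0.0), ("music", 5.0)):
        for _ in range(n_per_class):
            features.append([rng.gauss(center, 1.0) for _ in range(dim)])
            labels.append(label)
    return features, labels


def fit_centroids(features, labels):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(center):
        return sum((a - b) ** 2 for a, b in zip(vec, center))
    return min(centroids, key=lambda label: dist(centroids[label]))


def cross_validate(features, labels, k=5, seed=0):
    """Run k-fold cross-validation.

    Returns the per-fold accuracies and a confusion matrix stored as
    confusion[true_label][predicted_label] -> count.
    """
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    classes = sorted(set(labels))
    confusion = {t: {p: 0 for p in classes} for t in classes}
    accuracies = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        centroids = fit_centroids(
            [features[i] for i in train], [labels[i] for i in train]
        )
        correct = 0
        for i in fold:
            predicted = predict(centroids, features[i])
            confusion[labels[i]][predicted] += 1
            correct += predicted == labels[i]
        accuracies.append(correct / len(fold))
    return accuracies, confusion


features, labels = make_synthetic_features()
accuracies, confusion = cross_validate(features, labels)
print("per-fold accuracy:", accuracies)
print("confusion matrix:", confusion)
```

Each sample is held out exactly once, so every row of the confusion matrix sums to the number of samples in that class. With real data, a grid search would repeat this loop once per candidate model and hyperparameter setting and keep the best-scoring one.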