paper/paper.md: 21 additions & 18 deletions
@@ -74,30 +74,33 @@ This software offer pre-trained models. This is an evolving feature as new datas
1. Audio type classifier to determine speech versus music: A trained SVM classifier that classifies audio into two possible classes - music and speech. This classifier was trained using MFCC, spectral and chroma features. The cross-validation confusion matrix is as follows.
- | music | speech
- music | 48.80 | 1.20
- speech | 0.60 | 49.40
+ || music | speech |
+ | --- | --- | --- |
+ | music | 48.80 | 1.20 |
+ | speech | 0.60 | 49.40 |
2. Audio type classifier to determine speech versus music versus bird sounds: A trained SVM classifier that classifies audio into three possible classes - music, speech and birds. This classifier was trained using MFCC, spectral and chroma features.
- | music | speech | birds
- music | 31.53 | 0.73 | 1.07
- speech | 1.00 | 32.33 | 0.00
- birds | 0.00 | 0.00 | 33.33
+ || music | speech | birds |
+ | --- | --- | --- | --- |
+ | music | 31.53 | 0.73 | 1.07 |
+ | speech | 1.00 | 32.33 | 0.00 |
+ | birds | 0.00 | 0.00 | 33.33 |
3. Music genre classifier using the GTZAN [@tzanetakis_essl_cook_2001] dataset: A trained SVM classifier that uses GFCC, MFCC, spectral and chroma features to classify music into 10 genre classes - blues, classical, country, disco, hiphop, jazz, metal, pop, reggae, rock.
- | pop | met | dis | blu | reg | cla | rock | hip | cou | jazz
These baseline models aim to demonstrate the capability of the audio feature generation algorithms to extract meaningful numeric patterns from audio data. Users can train their own classifiers using similar features and a different machine learning backend to research and explore improvements.
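
As a reading aid for the two confusion matrices shown above: the entries in each table sum to roughly 100, which suggests they are percentages of all cross-validation samples. Under that assumption, and further assuming that rows are true classes and columns are predicted classes (the text does not state the orientation), overall accuracy is the diagonal sum divided by the total, and per-class recall is each diagonal entry divided by its row sum. The sketch below is a minimal illustration; the function names are hypothetical and are not part of the software described in the paper.

```python
def accuracy_percent(matrix):
    """Overall accuracy: diagonal mass as a percentage of all mass."""
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * diagonal / total


def recall_percent(matrix):
    """Per-class recall, ASSUMING rows are true classes (not stated in the paper)."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]


# Values copied from the tables in the diff above.
speech_music = [
    [48.80, 1.20],   # music
    [0.60, 49.40],   # speech
]
speech_music_birds = [
    [31.53, 0.73, 1.07],  # music
    [1.00, 32.33, 0.00],  # speech
    [0.00, 0.00, 33.33],  # birds
]

for name, matrix in [
    ("speech/music", speech_music),
    ("speech/music/birds", speech_music_birds),
]:
    print(name, "accuracy: %.2f%%" % accuracy_percent(matrix))
    print(name, "recall:", ["%.2f%%" % r for r in recall_percent(matrix)])
```

Under these assumptions, the two-class table corresponds to about 98.2% overall accuracy and the three-class table to about 97.2%. If the tables are instead oriented with predicted classes as rows, the per-row figures would be precision rather than recall, while the overall accuracy is unchanged.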