- Imports essential libraries and modules for subsequent machine learning tasks.
- Includes tools for file and data handling (`os`, `pandas`), audio processing (`torchaudio`), model training (`torch`, `transformers`), progress reporting (`tqdm`), and visualization (`seaborn`, `matplotlib`).
- Imports the `Wav2Vec2` model from Hugging Face's Transformers library.
- Loads a TSV file containing information about audio files.
- Defines a function to load audio files using the `torchaudio` library.
- Example usage loads an audio file for demonstration.
- Balances the dataset by selecting specific accents and resampling the data.
- Prints the class distribution of the balanced dataset.
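
The balancing step above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper name `balance_by_locale`, the toy data, and the choice to downsample every selected class to the size of the smallest one are all assumptions, since the summary does not say whether the notebook upsamples or downsamples.

```python
import pandas as pd


def balance_by_locale(df, accents, column="locale", seed=0):
    """Keep only the chosen accents and downsample each to the smallest class size.

    Hypothetical helper: the downsampling strategy is an assumption.
    """
    subset = df[df[column].isin(accents)]
    # Size of the rarest selected class; every class is reduced to this size.
    n_per_class = int(subset[column].value_counts().min())
    balanced = subset.groupby(column).sample(n=n_per_class, random_state=seed)
    return balanced.reset_index(drop=True)


# Toy stand-in for the TSV contents; file names and locale codes are made up.
toy_df = pd.DataFrame({
    "path": [f"clip_{i}.mp3" for i in range(9)],
    "locale": ["en"] * 5 + ["de"] * 3 + ["fr"] * 1,
})

balanced_df = balance_by_locale(toy_df, accents=["en", "de"])
print(balanced_df["locale"].value_counts())
```

Fixing `random_state` keeps the sampled subset reproducible between runs, which makes later comparisons of the class distribution easier to interpret.
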
- Explores the dataset by examining the class distribution and listening to audio samples.
- Provides insights into the distribution of accents and allows for a qualitative understanding of the dataset.
- Splits the dataset into training and testing sets.
- Drops rows with missing values.
- Loads a pre-trained `Wav2Vec2` model and configures it for the classification task.
- Sets the number of output classes based on the selected accents.
- Examines various data-related aspects, including dataset head, shape, and the presence of missing values.
- Checks the distribution of the "locale" column for insights into potential dataset imbalances.
- Creates a distribution plot using `seaborn` to visualize the distribution of the "locale" column in the balanced dataset.
- Creates a second distribution plot, this time for the original, unbalanced dataset (`validated_df`).
- Allows for a comparison of the locale distribution before and after balancing.
- Prints all columns in a table format for the filtered dataset.
- Provides a comprehensive overview of the dataset's structure and content.
- Defines a custom dataset class (`CustomAudioDataset`) to handle audio data.
- Uses the `LabelEncoder` to convert string labels to numerical values, facilitating model training.
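
As a small illustration of the label-encoding step, the sketch below uses scikit-learn's `LabelEncoder` on a few made-up accent strings. The label values are hypothetical; the summary does not list the accents the notebook actually selects.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical accent labels, used only for illustration.
accent_labels = ["us", "england", "us", "indian"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(accent_labels)

# Classes are stored in sorted order, so the integer ids follow that order.
print(list(encoder.classes_))  # ['england', 'indian', 'us']
print(encoded.tolist())        # [2, 0, 2, 1]

# The same encoder maps predictions back to readable accent names.
decoded = encoder.inverse_transform(encoded)
```

Keeping the fitted encoder around is useful at evaluation time, because model outputs are integer class ids that need to be mapped back to accent names.
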
- Creates a training dataset (`train_dataset`) using the `CustomAudioDataset` class and the loaded processor.
- Creates a DataLoader for the training dataset using a custom collate function.
- The function pads input tensors to the same size and stacks labels, preparing the data for model training.
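
A collate function of the kind described above might look like the following sketch. It assumes each dataset item is a dict with a 1-D waveform tensor under `input_values` and an integer under `label`; those key names, and the use of zero padding, are assumptions rather than details taken from the notebook.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def collate_fn(batch):
    """Pad variable-length 1-D waveforms to the longest one and stack the labels.

    Assumes items shaped like {"input_values": Tensor[T], "label": int}.
    """
    waveforms = [item["input_values"] for item in batch]
    labels = torch.tensor([item["label"] for item in batch], dtype=torch.long)
    # pad_sequence right-pads shorter tensors with padding_value.
    padded = pad_sequence(waveforms, batch_first=True, padding_value=0.0)
    return {"input_values": padded, "labels": labels}


# Two fake clips of different lengths stand in for real audio.
batch = [
    {"input_values": torch.randn(16000), "label": 0},
    {"input_values": torch.randn(12000), "label": 2},
]
out = collate_fn(batch)
print(out["input_values"].shape)  # torch.Size([2, 16000])
```

A function like this would be passed to `torch.utils.data.DataLoader` through its `collate_fn` argument, so that each batch is padded only to the length of its own longest clip.
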
- Splits the data again into training and testing sets without stratification.
- Stores the resulting sets in `train_df` and `test_df`.
- Creates a test dataset (`test_dataset`) from the testing dataframe (`test_df`).
- Creates a DataLoader for the test dataset using the same custom collate function.
- Configures training arguments for fine-tuning.
- Specifies the output directory, batch size, number of epochs, and saving options.
- Creates a `Trainer` instance, incorporating the `Wav2Vec2` model, training arguments, and the training dataset.
- Sets the stage for model fine-tuning.
- Initiates the training loop, iterating through epochs and batches.
- Calculates training metrics, including accuracy, precision, and F1 score.
- Executes the final training step, completing the fine-tuning of the `Wav2Vec2` model.
- Evaluates the trained model on the test set.
- Collects predictions for further metric calculation and analysis.
- Calculates metrics, including accuracy, precision, and F1 score based on the model's performance on the test set.
- Generates a confusion matrix using `seaborn` and `matplotlib`.
- Visually represents the model's classification performance on the test set.
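
To make the evaluation metrics concrete, the sketch below computes a confusion matrix, accuracy, precision, and F1 score in plain Python on made-up predictions. The notebook most likely relies on a library for this; the function names here are hypothetical, and macro averaging over classes is an assumption, since the summary does not state which averaging mode is used.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_id, pred_id in zip(y_true, y_pred):
        matrix[true_id][pred_id] += 1
    return matrix


def macro_metrics(matrix):
    """Return (accuracy, macro precision, macro F1) from a confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[i][i] for i in range(n)) / total
    precisions, f1_scores = [], []
    for i in range(n):
        true_pos = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        actual = sum(matrix[i])                          # row sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
        precisions.append(precision)
    return accuracy, sum(precisions) / n, sum(f1_scores) / n


# Made-up labels for three classes, for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
accuracy, macro_precision, macro_f1 = macro_metrics(cm)
print(cm)  # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(round(accuracy, 3), round(macro_precision, 3), round(macro_f1, 3))
```

A nested list like `cm` can be handed to `seaborn.heatmap` with `annot=True` to obtain the kind of confusion-matrix plot described above.
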