@KobaKhit KobaKhit commented Sep 9, 2020

Currently, extract_features_and_train takes a list of folder paths. It would be useful to be able to set the maximum number of files read per folder, so I added a max_files parameter with a default of 1000. Choosing those files randomly could be a further addition.

I tested it in a Kaggle notebook and it worked fine.

The motivation was the Birdcall Kaggle competition, which has 264 classes (folders) and ~100 files per class (folder). Training a model took longer than 9 hours and the Kaggle notebook timed out, so I decided to train on a smaller number of files per folder, i.e. undersample the classes.

import os  # needed for os.listdir and os.path.exists below

from pyAudioAnalysis import audioTrainTest as aT

# train classifier 
train_folder = '../input/birdsong-recognition/train_audio/'
mid_term_window_length = 2
mid_term_window_step = 1

# get audio file folders
class_paths = [train_folder + x for x in sorted(os.listdir(train_folder))]

model_name = "../input/bird-call-classification/svmSMtemp"
model_type = 'svm'

if not os.path.exists(model_name):
    model_name = "svmSMtemp"
    # train classifier using folders of audio files
    aT.extract_features_and_train(class_paths, 
                                  mid_term_window_length, 
                                  mid_term_window_step, 
                                  aT.shortTermWindow, 
                                  aT.shortTermStep, 
                                  model_type, 
                                  model_name, 
                                  False,
                                  max_files = 5)
Analyzing file 1 of 5: ../input/birdsong-recognition/train_audio/aldfly/XC134874.mp3
Analyzing file 2 of 5: ../input/birdsong-recognition/train_audio/aldfly/XC135454.mp3
Analyzing file 3 of 5: ../input/birdsong-recognition/train_audio/aldfly/XC135455.mp3
Analyzing file 4 of 5: ../input/birdsong-recognition/train_audio/aldfly/XC135456.mp3
Analyzing file 5 of 5: ../input/birdsong-recognition/train_audio/aldfly/XC135457.mp3
Feature extraction complexity ratio: 28.2 x realtime
Analyzing file 1 of 5: ../input/birdsong-recognition/train_audio/ameavo/XC133080.mp3
Analyzing file 2 of 5: ../input/birdsong-recognition/train_audio/ameavo/XC139829.mp3
Analyzing file 3 of 5: ../input/birdsong-recognition/train_audio/ameavo/XC139921.mp3
Analyzing file 4 of 5: ../input/birdsong-recognition/train_audio/ameavo/XC155039.mp3
Analyzing file 5 of 5: ../input/birdsong-recognition/train_audio/ameavo/XC166076.mp3
Feature extraction complexity ratio: 27.5 x realtime

@tyiannak
Owner

Thanks for the PR @KobaKhit.
It would be nice if:
(a) the default value were -1, indicating that no file limit is applied during feature extraction
(b) random shuffling were also parametrized (not enabled by default, since in many cases feature extraction needs to follow the file path order)
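
Points (a) and (b) could be sketched roughly as follows. This is only an illustration: select_files and its shuffle and seed parameters are hypothetical names, not part of pyAudioAnalysis or of this PR.

```python
import random


def select_files(file_paths, max_files=-1, shuffle=False, seed=None):
    """Hypothetical helper: pick the files to process for one class folder."""
    # Keep file path order by default so feature extraction stays deterministic.
    paths = sorted(file_paths)
    if shuffle:
        # Opt-in shuffling; a seed keeps the random selection reproducible.
        random.Random(seed).shuffle(paths)
    if max_files is None or max_files < 0:
        return paths  # -1 means "no limit"
    return paths[:max_files]


files = ["c.wav", "a.wav", "b.wav", "d.wav"]
print(select_files(files))               # all four files, in path order
print(select_files(files, max_files=2))  # ['a.wav', 'b.wav']
```

With shuffle left at False the behaviour matches the current path-ordered extraction, and max_files=-1 keeps every file, so existing callers would not be affected.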
