Audio Transformations Based Explanations (ATBE) for Deep Learning Models Trained on Musical Data

Name
Chenghan Chung
Abstract
Explaining deep learning model behaviour is difficult. For computer vision models, a number of methods exist that can highlight the areas of an image that the network focuses on. For music classification models, these methods do not usually produce satisfactory results, because interpreting models trained on audio has to be based not on visual but on musical concepts, related to acoustic characteristics important to humans, such as pitch, tempo, melody, and harmony. In this thesis we propose a concept of audio transformations based explanations (ATBE) for deep learning models for music. We release a Python package called ATBE, which can explain which acoustic properties are important for predicting certain classes, using error analysis on transformed inputs and LIME on surrogate features created in the process of augmentation.
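The transformation-based error analysis described above can be sketched in a few lines: apply musically meaningful transformations to an input, re-run the model, and rank transformations by how much the prediction changes. The sketch below is a minimal illustration only, not the ATBE package API; the toy model, the `pitch_shift` resampling trick, and all function names are assumptions made for the example.

```python
import numpy as np

def toy_model(signal, sr=8000):
    """Stand-in for a trained music classifier (assumption for this sketch).

    Returns a pseudo-probability of a "high-pitch" class that grows
    with the dominant frequency of the signal.
    """
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    dominant = freqs[np.argmax(spectrum)]
    return 1.0 / (1.0 + np.exp(-(dominant - 440.0) / 100.0))

def pitch_shift(signal, factor):
    """Crude pitch shift by resampling; factor > 1 raises the pitch."""
    idx = np.arange(0, len(signal), factor)
    return np.interp(idx, np.arange(len(signal)), signal)

def transformation_sensitivity(model, signal, transforms):
    """Score each transformation by how much it moves the model's output."""
    base = model(signal)
    return {name: abs(model(t(signal)) - base) for name, t in transforms.items()}

# One second of a 440 Hz sine tone as a toy input.
sr = 8000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440.0 * t)

transforms = {
    "pitch_up_octave": lambda s: pitch_shift(s, 2.0),
    "add_noise": lambda s: s + 0.01 * np.random.default_rng(0).normal(size=len(s)),
}

scores = transformation_sensitivity(toy_model, signal, transforms)
```

For this toy model the pitch transformation dominates the sensitivity ranking while low-level noise barely registers, which is the kind of signal ATBE-style error analysis uses to attribute a class prediction to an acoustic property such as pitch.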
Graduation Thesis language
English
Graduation Thesis type
Master - Computer Science
Supervisor(s)
Anna Aljanaki
Defence year
2024