A Comparative Evaluation of Explainability Techniques for Image Data

Name
Mykyta Skliarov
Abstract
The use of machine learning has increased dramatically in the last decade across many domains, especially in computer vision, where high-performing deep convolutional neural networks have reached and even surpassed human performance in many areas. To meet the growing need for transparency in these black-box models, the eXplainable AI (XAI) community has produced various techniques to explain their predictions. A popular way to do so for image data is via saliency maps. At the same time, objectively evaluating the quality of these techniques is not easy, due to the multifaceted nature of interpretability. In this work we perform a thorough comparative evaluation of six popular saliency map explainability techniques, namely LIME, SHAP, GradCAM, GradCAM++, IntGrad and SmoothGrad, using five quantitative functionally-grounded metrics from the literature, specifically fidelity, stability, identity, separability and time, on three commonly used benchmark datasets and three well-known model architectures, to determine the pros and cons of each technique. Although we find that no single technique dominates on all metrics, the obtained results show that IntGrad and SmoothGrad performed well on our fidelity and stability tests, with SHAP also achieving high fidelity scores. All techniques except LIME and SmoothGrad score highly on the identity metric, and all except LIME on separability, while GradCAM and GradCAM++ were by far the fastest. We also note the caveats we identified in the metrics, suggesting that more work is needed to gain a full picture of the quality of the different XAI techniques.
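To make the identity and separability metrics mentioned above concrete, the following is a minimal sketch, assuming a generic explain(model, x) callable that returns a saliency map as a NumPy array; all names here (explain, identity_check, separability_check, the toy stand-ins) are illustrative assumptions, not the thesis's actual code. Identity requires that identical inputs receive identical explanations; separability requires that distinct inputs do not share one explanation.

import numpy as np

def identity_check(explain, model, x, atol=1e-6):
    """Identity: explaining the same input twice should give the same map."""
    e1 = explain(model, x)
    e2 = explain(model, x)
    return np.allclose(e1, e2, atol=atol)

def separability_check(explain, model, x1, x2, atol=1e-6):
    """Separability: two distinct inputs should not share one explanation."""
    e1 = explain(model, x1)
    e2 = explain(model, x2)
    return not np.allclose(e1, e2, atol=atol)

if __name__ == "__main__":
    # Toy stand-ins: the "model" is unused and the "explanation" is a
    # deterministic function of the input, so both checks pass here.
    def toy_explain(model, x):
        return x ** 2

    a = np.arange(4.0)
    b = a + 1.0
    print(identity_check(toy_explain, None, a))         # True
    print(separability_check(toy_explain, None, a, b))  # True

In this framing, a non-deterministic explainer (e.g. one that perturbs the input with random noise, as SmoothGrad does) can fail the identity check unless its random seed is fixed, which matches the abstract's observation that SmoothGrad and LIME score lower on identity.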
Graduation Thesis language
English
Graduation Thesis type
Master - Software Engineering
Supervisor(s)
Radwa Mohamed El Emam El Shawi
Defence year
2024
PDF