Analysing Model Attacks in the Machine Learning Pipeline
Name
Toluwani Mathew Elemosho
Abstract
Machine learning models have evolved significantly and are now integral to various sectors, including healthcare, finance, and transportation. However, their adoption has introduced vulnerabilities, particularly adversarial attacks that manipulate data to deceive these models. This thesis investigates the robustness of machine learning models against such attacks and explores the application of Explainable AI (XAI) techniques to enhance model transparency and security. Through a comprehensive literature review, the research identifies critical principles of trustworthy AI, including explainability, technical robustness, and human oversight. The methodology systematically analyses adversarial attacks, employing techniques such as SHAP and LIME to evaluate model behaviour under different attack scenarios. The study also introduces a human-oversight dashboard that provides intuitive visualizations of these attacks, aiding in understanding and mitigating vulnerabilities.

Experimental results highlight the effectiveness of XAI in identifying and explaining adversarial manipulations, thereby improving the resilience of AI systems. User studies reveal significant findings regarding the role of explanations in AI systems: compared with no explanations, short explanations significantly enhance user engagement, preference for information, satisfaction, textual clarity, and trustworthiness, whereas lengthening explanations from short to long yields minimal additional benefit. These results suggest that concise explanations are highly effective in fostering user trust and engagement with AI systems. This research contributes to the field by proposing robust defence mechanisms against adversarial attacks and by emphasizing the role of human oversight in AI systems. It underscores the necessity for transparent, explainable, and resilient AI models to ensure their safe and ethical deployment in critical applications.
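To make the described methodology concrete, the following is a minimal, hypothetical sketch of how SHAP attributions can be compared for a clean versus an adversarially perturbed input. The synthetic dataset, logistic-regression model, FGSM-style perturbation, and epsilon value are illustrative assumptions for demonstration only, not the thesis's actual experimental setup.

    # Illustrative sketch: compare SHAP attributions for a clean vs. an
    # adversarially perturbed input. All modelling choices here are assumptions,
    # not the thesis's experiments.
    import numpy as np
    import shap
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Train a simple binary classifier on synthetic tabular data.
    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X, y)

    # Craft an FGSM-style perturbation for one sample. For logistic regression,
    # the gradient of the cross-entropy loss w.r.t. the input is (p - y) * w.
    x = X[0]
    p = model.predict_proba(x.reshape(1, -1))[0, 1]
    grad = (p - y[0]) * model.coef_[0]
    epsilon = 0.5  # perturbation budget (illustrative value)
    x_adv = x + epsilon * np.sign(grad)

    # Explain both inputs with model-agnostic KernelSHAP and compare how the
    # feature attributions shift under attack.
    explainer = shap.KernelExplainer(model.predict_proba, shap.sample(X, 100))
    shap_clean = explainer.shap_values(x.reshape(1, -1))
    shap_adv = explainer.shap_values(x_adv.reshape(1, -1))

    print("clean prediction:", model.predict(x.reshape(1, -1))[0])
    print("adversarial prediction:", model.predict(x_adv.reshape(1, -1))[0])
    print("total attribution shift:",
          np.abs(np.array(shap_adv) - np.array(shap_clean)).sum())

Because KernelSHAP is model-agnostic, the same comparison would apply to more complex classifiers, and a LIME tabular explainer could be substituted analogously to contrast the two attribution techniques under the same attack.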
Graduation Thesis language
English
Graduation Thesis type
Master - Software Engineering
Supervisor(s)
Huber Raul Flores Macario, Abdul-Rasheed Olatunji Ottun
Defence year
2024