Assessing the Quality of Counterfactual Explanations with Large Language Models

Name
Julius Välja
Abstract
With the accelerating spread of machine learning models, the complexity and lack of transparency of current models have become a major source of concern. The field of Explainable AI focuses on finding methods that can uncover the inner logic of these models. One such method is counterfactual explanations, which seek to answer the question "How would the original situation need to be different to achieve a different prediction from the model?". However, the qualities that make a counterfactual explanation good are not fully understood and are difficult to quantify. In this thesis, a survey was used to gather a dataset of human-evaluated counterfactual explanations, with an array of qualities defined based on previous literature. This dataset was used to explore the ability of Large Language Models (LLMs) to evaluate subjective qualities of counterfactual explanations with and without fine-tuning. The results showed that large LLMs achieve 70% to 95% accuracy at this task, depending on the specific model and testing dataset. While smaller LLMs could be fine-tuned to achieve acceptable accuracy, they were generally significantly less capable. In addition, the effect of correlations between metrics was tested, and experiments were performed to assess the feasibility of predicting user satisfaction as well as modelling individual preferences. These results pave the way for future research on the automatic evaluation of counterfactual explanations and the development of new search algorithms.
Graduation Thesis language
English
Graduation Thesis type
Bachelor - Computer Science
Supervisor(s)
Marharyta Domnich, Raul Vicente, Eduard Barbu
Defence year
2024
PDF