Visualizing the Contribution of Datapoints in a Binary Classifier's Loss

Name
Magnus Paal
Abstract
A decrease in the output of a classifier's loss function is one way to tell that the classifier is improving. However, that output, also known as the loss, is a single value and does not give a complete overview of the classifier and the dataset. The aim of this thesis was to find a way to interpret the loss in terms of individual datapoints and to visualize it. The resulting visualizations help to show how each datapoint contributes to the total loss. First, they can be used to find out which sets of datapoints contribute most to the loss: the smaller number of points whose predicted value is far from their actual value, or the larger number of points whose predicted value is close to it. Second, they can be used to compare the losses of two classifiers and find out which datapoints contribute most to the difference. Lastly, they can be used to find out which datapoints, with which features, contribute most to the loss.
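As a rough illustration of the idea of splitting a loss into per-datapoint contributions, the sketch below assumes log loss (binary cross-entropy), which the abstract does not name, and uses hypothetical toy labels and predictions. The function names are illustrative and are not taken from the thesis.

```python
import math

def per_point_log_loss(y_true, p_pred, eps=1e-15):
    """Return the log-loss term of each datapoint (log loss is an assumption)."""
    losses = []
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1 - p)))
    return losses

def contribution_shares(losses):
    """Return each datapoint's fraction of the summed loss."""
    total = sum(losses)
    return [loss / total for loss in losses]

# Hypothetical toy data: true labels and predicted probabilities of class 1.
y_true = [1, 0, 1, 1, 0]
p_pred = [0.9, 0.2, 0.6, 0.1, 0.3]

losses = per_point_log_loss(y_true, p_pred)
mean_loss = sum(losses) / len(losses)   # the single value usually reported
shares = contribution_shares(losses)    # how that value splits across points
worst = max(range(len(losses)), key=losses.__getitem__)

for i, (loss, share) in enumerate(zip(losses, shares)):
    print(f"point {i}: loss={loss:.3f}, share={share:.1%}")
print(f"mean loss={mean_loss:.3f}, largest contributor: point {worst}")
```

In this toy case the single confidently wrong prediction (the point with label 1 and predicted probability 0.1) accounts for most of the loss. Per-point terms like these could then be plotted, or subtracted between two classifiers to see which points drive the difference in their losses.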
Graduation Thesis language
Estonian
Graduation Thesis type
Bachelor - Computer Science
Supervisor(s)
Meelis Kull
Defence year
2021
 