Calibration of Multi-Class Probabilistic Classifiers

Name
Kaspar Valk
Abstract
Classifiers, machine learning models that predict probability distributions over classes, are not guaranteed to produce realistic output. A classifier is considered calibrated if its output is consistent with the actual class distribution. Calibration is essential in safety-critical tasks, where small deviations between the predicted probabilities and the actual class distribution can incur large costs. A common approach to improving the calibration of a classifier is to use a hold-out data set and a post-hoc calibration method to learn a correcting transformation for the classifier's output. This thesis explores the field of post-hoc calibration methods for classification tasks with multiple output classes: several existing methods are visualized and compared, and three new non-parametric post-hoc calibration methods are proposed. The proposed methods are shown to work well on data sets with fewer classes, improving on the state of the art in some cases. The three suggested algorithms rest on the assumption that calibration errors are similar in close neighborhoods on the probability simplex, an assumption that has been used before but never clearly stated in the calibration literature. Overall, the thesis offers additional insight into the field of multi-class calibration and enables the construction of more trustworthy classifiers.
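The post-hoc setup described in the abstract (a hold-out set plus a learned correcting transformation) can be illustrated with temperature scaling, one of the simplest parametric calibration methods; the thesis's own proposed methods are non-parametric and are not reproduced here. The function names, the synthetic data, and the grid-search fitting procedure below are illustrative assumptions, not the thesis's implementation:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the class axis."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_temperature(logits, labels, grid=None):
    """Fit a single temperature T on hold-out data by minimizing
    negative log-likelihood (simple grid search; illustrative only)."""
    if grid is None:
        grid = np.linspace(0.25, 5.0, 200)
    n = len(labels)
    def nll(T):
        p = softmax(logits, T)[np.arange(n), labels]
        return -np.mean(np.log(p + 1e-12))
    return min(grid, key=nll)

# Synthetic hold-out set: a model whose logits are 3x too sharp,
# i.e. systematically overconfident.
rng = np.random.default_rng(0)
true_logits = rng.normal(size=(2000, 4))
labels = np.array([rng.choice(4, p=p) for p in softmax(true_logits)])
overconfident_logits = 3.0 * true_logits

T = fit_temperature(overconfident_logits, labels)
calibrated = softmax(overconfident_logits, T)
```

Dividing the logits by the fitted temperature (here expected to land above 1) softens the overconfident distribution toward the class frequencies actually observed on the hold-out set. The thesis's non-parametric methods instead learn the correction locally, exploiting the assumption of similar calibration errors in close neighborhoods on the probability simplex.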
Graduation Thesis language
English
Graduation Thesis type
Master - Computer Science
Supervisor(s)
Meelis Kull
Defence year
2022