Institute of Computer Science - Graduation Theses Registry

Automated Cognitive Distortion Detection and Classification of Reddit Posts Using Machine Learning

Name

Stanislav Sochynskyi

Abstract

A vicious circle of exaggerated thinking patterns, also known as cognitive distortions, can lead a person to anxiety and major depression. Automatic detection and classification of cognitive distortions can benefit initial mental health screening, make better use of counselling time, and improve the accessibility of mental healthcare services. In this work, we apply logistic regression, Support Vector Machine (SVM), and fastText classifiers to identify cognitive distortions in real-world data from Reddit. For binary classification, the best F-score of 0.71 was achieved with the fastText classifier. For the multiclass classification task, the best F-score of 0.23 was achieved with an SVM using tf-idf vectorisation. However, the metrics for some classes do not exceed the random chance baseline. A possible explanation is that the created dataset is sufficient to build a binary classifier, but more accurate models require more data to distinguish a larger number of classes. Additionally, we experimented with unsupervised clustering and topic modelling algorithms and did not find evidence that unsupervised methods could extract the patterns of cognitive distortions from text. We developed an annotation guideline for manual annotation of cognitive distortions and applied it to annotate 2021 Reddit posts. We achieved a kappa score of 0.569 for the binary case and 0.424 for the multiclass case, indicating moderate agreement between annotators. A higher number of classes leads to lower annotation agreement, mainly due to overlapping definitions of cognitive distortions. Consequently, automated methods cannot be expected to achieve high performance in cognitive distortion classification.
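
The agreement figures reported in the abstract can be made concrete with a small sketch. The abstract does not say which kappa variant was used, so the example below assumes Cohen's kappa for two annotators. The function name `cohen_kappa` and the toy labels are hypothetical and are not taken from the thesis. On the commonly used Landis and Koch scale, values between 0.41 and 0.60 are read as moderate agreement, which matches how the abstract describes both reported scores.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: the thesis may have used a different kappa variant;
    this is only an illustration of the general idea.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items both annotators labelled identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: expected overlap given each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels (1 = distorted, 0 = not distorted), not thesis data.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
annotator_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
print(round(cohen_kappa(annotator_1, annotator_2), 3))
```

Because kappa subtracts the agreement expected by chance, it is typically lower than raw percentage agreement. With more classes whose definitions overlap, observed agreement tends to fall, which is consistent with the lower multiclass score reported above.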

Graduation Thesis language

English

Graduation Thesis type

Master - Innovation and Technology Management

Supervisor(s)

Kairit Sirts

Defence year

2021

