Cross-Lingual Misinformation Detection: Aligning English and Estonian Fake Health News

Li Merila
Health misinformation poses a significant threat: it undermines trust in scientific expertise and reduces compliance with public health measures, ultimately weakening community resilience against preventable diseases. This thesis focuses on identifying fake health news in Estonian by leveraging a pre-labelled English dataset. The primary objective is to develop a reliable system for generating ground-truth labels for Estonian fake health news, contributing to fake news detection in low-resource settings. The proposed approach, Cross-Lingual Alignment and Confident Prediction Sampling (CAPS), uses a hybrid two-phase methodology that combines semantic similarity measurement, manual annotation, classification, and confidence sampling to create a novel dataset of fake health news in Estonian.
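To make two of the ingredients named in the abstract more concrete, the following is a minimal illustrative sketch, not the thesis's implementation. It assumes that texts in both languages have already been embedded into a shared vector space by some multilingual encoder (which encoder is not stated here), and it omits the manual annotation step and classifier training entirely. The function names `align_labels` and `confident_samples`, as well as the thresholds 0.8 and 0.9, are hypothetical choices made for illustration only.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def align_labels(
    target_vecs: Sequence[Sequence[float]],
    source_vecs: Sequence[Sequence[float]],
    source_labels: Sequence[int],
    min_similarity: float = 0.8,  # hypothetical threshold
) -> List[Optional[int]]:
    """Give each target (e.g. Estonian) item the label of its most similar
    source (e.g. English) item, or None if no source item is similar enough."""
    aligned: List[Optional[int]] = []
    for t in target_vecs:
        best_label: Optional[int] = None
        best_score = min_similarity
        for s, label in zip(source_vecs, source_labels):
            score = cosine_similarity(t, s)
            if score >= best_score:
                best_score = score
                best_label = label
        aligned.append(best_label)
    return aligned


def confident_samples(
    fake_probabilities: Sequence[float],
    threshold: float = 0.9,  # hypothetical threshold
) -> List[Tuple[int, int]]:
    """Keep only predictions whose confidence in either class reaches the
    threshold. Returns (index, predicted_label) pairs, with 1 meaning fake."""
    kept: List[Tuple[int, int]] = []
    for i, p_fake in enumerate(fake_probabilities):
        confidence = max(p_fake, 1.0 - p_fake)
        if confidence >= threshold:
            kept.append((i, 1 if p_fake >= 0.5 else 0))
    return kept
```

In a pipeline of this kind, items left unlabelled by the alignment step (returned as `None`) would be natural candidates for manual annotation, and only the confident predictions would be added to the resulting dataset. How CAPS actually orders and combines these steps is described in the thesis itself.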
Graduation Thesis language
Graduation Thesis type
Master - Data Science
Uku Kangur, Roshni Chakraborty
Defence year