Toxicity in Google Play Store: What, Where and Why?

Name
Triin Pohla
Abstract
With an ever-growing user base, more and more people are using mobile applications and actively providing feedback on application stores, influencing app quality and user experiences. Despite ongoing efforts to moderate online content, offensive language remains common in online comments. This thesis presents a large-scale study of the prevalence of toxicity in the Google Play Store, using nearly 60M application reviews from over 5800 applications over the nine years from January 2014 to January 2023. We fine-tune a RoBERTa-based multi-class toxic comment classifier that distinguishes between four types of reviews, including toxic and toxic-critical comments, with an accuracy of 88%. We find that, on average, 3.5% of all reviews contain toxic content. While the share of outright toxic comments has remained around 1% over nearly a decade, the share of toxic-critical reviews shows a subtle increase, from below 2% of all reviews in 2014 to over 3% in January 2023. Major changes to the UX/UI or to policies can increase the share of toxic-critical comments, while external events, such as the COVID-19 pandemic and the Russian invasion of Ukraine, appear to contribute little to toxic content in application reviews. This study contributes to the broader understanding of digital communication and user behavior, of the different facets of toxic content, and of the implications for enhancing online platforms’ content moderation strategies and user engagement policies.
Graduation Thesis language
English
Graduation Thesis type
Master - Data Science
Supervisor(s)
Vigneshwaran Shankaran, Rajesh Sharma
Defence year
2023
 