Understanding Gender Related Discussions in Android Mobile Applications Through Reddit

Name
Dariya Nagashibayeva
Abstract
A data-driven approach to software development aims to enhance the user experience and quality of life for broader groups of users. Taking the problems and needs of diverse communities into consideration is essential for creating safe and inclusive software. For this reason, understanding how topics such as gender are discussed in software communities, and identifying inclusivity violations in software products through data analysis, is crucial for improving software design. The purpose of this thesis is to investigate how content users of the social networking platform Reddit are with popular Android mobile applications in terms of gender inclusiveness, to explore the possibility of automatically detecting gender discussions from preprocessed data, and to evaluate whether the findings are suitable for improving software requirements. The research employed quantitative and qualitative analysis of source data, including data collection and manual annotation of keywords and Reddit posts. Applying these methods resulted in a new textual dataset on the topic of gender that is ready to use for data analysis. In addition, the thesis reports experiments that use the dataset for the automated classification of Reddit posts and suggests which machine learning models, including state-of-the-art deep learning models, can be used to detect gender inclusiveness violations in software applications. The thesis also includes recommendations on how to address the limitations of such an automated classification approach in the future.
Graduation Thesis language
English
Graduation Thesis type
Master - Computer Science
Supervisor(s)
Kuldar Taveter, Tahira Iqbal
Defence year
2024
 