Analysis of Retweeting Behavior Using Topic Models
Name
Jose Santos
Abstract
Social networks are now a constant presence in our lives and play an increasing role in
important social and commercial phenomena. Microblogging services such as Twitter appear to
play an important role in information dissemination on the Internet, making it
possible for messages to spread virally in a matter of minutes. In this research work we study the
mechanism of re-broadcasting information on Twitter (called “retweeting”). Specifically, we use
Latent Dirichlet Allocation (LDA) to analyze users and messages in terms of the topics that compose
their text bodies, and by means of ANOVA we show that the topical distance between
users and messages is shorter for tweets that are retweeted than for those that are not. Using
decision tree learning, we build several models to assess the accuracy and usefulness of
our topic-based model of retweeting. Our results show that the topic-based model slightly
outperforms a baseline prediction measure, so we conclude that such a model is a valid
option for predicting retweet behavior, with room for improvement.
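
To make the notion of "topical distance" concrete, the following is a minimal sketch, not the thesis's actual implementation. It assumes that per-user and per-tweet topic distributions have already been inferred with LDA, and it uses the Hellinger distance as one plausible measure; the abstract does not state which distance was used. The function name and the topic mixtures are hypothetical and chosen only for illustration.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete probability distributions.

    Assumption: p and q are topic mixtures (non-negative, summing to 1)
    of equal length, e.g. as produced by an LDA model.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of topics")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


# Hypothetical topic mixtures over three topics (illustrative values only).
user_topics = [0.7, 0.2, 0.1]      # topics of a user's own past messages
tweet_on_topic = [0.6, 0.3, 0.1]   # a tweet close to the user's interests
tweet_off_topic = [0.1, 0.1, 0.8]  # a tweet far from the user's interests

near = hellinger_distance(user_topics, tweet_on_topic)
far = hellinger_distance(user_topics, tweet_off_topic)

# Under the hypothesis in the abstract, retweeted messages would tend to
# resemble the first case (smaller distance) more than the second.
print(near < far)
```

A distance of this kind yields one number per user-tweet pair. Those numbers could then be compared between retweeted and non-retweeted groups, which is the sort of comparison the ANOVA in the abstract addresses, or supplied as a feature to a decision tree classifier.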
Graduation Thesis language
English
Graduation Thesis type
Master - Information Technology
Supervisor(s)
Marlon Dumas
Defence year
2011