Understanding Segregation in the Baltic States Based on Person-to-Person Financial Transactions

Name
Hannula-Katrin Pandis
Abstract
Segregation between different populations has been studied from several perspectives for more than half a century. Instead of a traditional qualitative approach, this work takes a quantitative approach that explores massive digital traces. We analyse a large dataset of person-to-person financial transactions from the three Baltic states (Estonia, Latvia and Lithuania) to understand segregation as interaction between individuals in these countries. We explore segregation at the country and county levels across several attributes. We cooperated closely with a financial institution that has home markets in all three Baltic countries to analyse a large anonymised dataset covering more than three million customers over five years, from 2017 to 2021. We compare the transaction dataset to Census data and conclude that, for the gender attribute, the transaction dataset represents the population well in each Baltic county, whereas the representativeness of preferred communication language and age group depends on the county. We analyse the transaction networks as social networks, calculating network characteristics on an undirected and unweighted graph to see how transaction behaviour has changed over the five years. We then calculate two segregation indices, the Spectral Segregation Index and Coleman’s Homophily Index, to understand segregation based on the interactions formed. Next, we compute graph embeddings of the transaction network using node2vec and, after reducing the number of dimensions of the embeddings, visualise them in a two-dimensional space using t-SNE. The results indicate that, over the five years, transaction behaviour changed the most in Lithuania and the least in Estonia. In each Baltic state there are a few counties where a minority group is segregated by preferred communication language. In Latvia and Lithuania, there is a tendency towards segregation of the 15-19 age group.
Graduation Thesis language
English
Graduation Thesis type
Master's exam - Data Science
Supervisor(s)
Rajesh Sharma, Jaan Übi
Defence year
2022
 