Detecting Corruption in Public Procurement Through Open Data Analysis

Mart Kevin Põlluste
Corruption is present in all aspects of society and hinders the progress of various sectors of the economy. In this context, corruption is defined as dishonest conduct for personal gain by those in power. One of the largest sectors it affects is public procurement. Previous research has shown that corruption is present in public procurement and that it reduces the transparency of the process. Given the monetary value of the public procurement sector, this is a problem that must be addressed. Various studies have used qualitative analysis to get to the root of the issue, but as corruption still thrives, it is essential that more accurate and precise measures are used.
To tackle this problem, some studies have tried to quantify the likelihood of corruption rather than relying only on qualitative research. This is where data analytics, the core of this study, comes into play. This thesis aims to determine whether open data resources and data analytics can be used to classify corruption in public procurement processes, and therefore to suggest a suitable set of data that makes the detection of corruption easier and quicker. Building on existing work on corruption, it asks: what data could be analysed to classify corruption, and what methods could be used?
Based on a review of the literature on corruption and theories of machine learning, data analytics was used to assess possible corruption in public procurement in Estonia. In the analysis, the author used machine learning approaches that classify a procurement as corrupt or non-corrupt. The results demonstrated that, based on available data, it is possible to predict corruption in public procurement in Estonia. The results also indicate that some features have a bigger impact on corruption in public procurement than others. Taking into account the background, related work and the current results, the author suggests that data analytics is vital in the fight against corruption and that machine learning can yield good results in predicting corruption. Further research is needed to identify other factors that could strengthen the effectiveness of these approaches.
Graduation Thesis language
Graduation Thesis type
Master - Innovation and Technology Management
Rajesh Sharma
Defence year