Software for Clustering Using k-means Algorithms

Name
Joonas Puura
Abstract
In cluster analysis, k-means is a popular method for grouping data objects by their features. The method aims to minimize the within-cluster sum of squared errors between the data objects in each cluster and the corresponding cluster mean. Because solving the k-means optimization problem exactly is NP-hard, several heuristic algorithms have been introduced for finding approximate solutions. As the goal of this thesis, software was developed that provides nine algorithms: five k-means clustering algorithms and four methods for choosing initial centers. Using real-life and synthetic datasets, an overview of the application's capabilities is given by measuring the algorithms' performance, memory use, and approximation quality.
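To illustrate the objective described in the abstract, here is a minimal sketch of Lloyd's algorithm, one widely used k-means heuristic. It is not taken from the thesis software, and the abstract does not say which five algorithms or four initialization methods were implemented. All function names are hypothetical, and the initial centers are simply drawn at random from the data, which is only one possible initialization.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def wcss(points, centers, labels):
    """Within-cluster sum of squared errors: the quantity k-means minimizes."""
    return sum(squared_distance(p, centers[l]) for p, l in zip(points, labels))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]


def update(points, labels, centers):
    """Move each center to the mean of its members; keep empty clusters in place."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, l in zip(points, labels) if l == j]
        if members:
            new_centers.append(
                tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
            )
        else:
            new_centers.append(old)
    return new_centers


def lloyd_kmeans(points, k, max_iter=100, seed=0):
    """Alternate assignment and update steps until the labels stop changing."""
    rng = random.Random(seed)
    # Assumption: random sampling from the data as the initialization method.
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = assign(points, centers)
    for _ in range(max_iter):
        centers = update(points, labels, centers)
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels, wcss(points, centers, labels)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, labels, cost = lloyd_kmeans(points, k=2)
```

Each iteration never increases the within-cluster sum of squared errors, so the procedure converges, but only to a local optimum that depends on the initial centers. This is why the choice of initialization method matters in practice.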
Graduation Thesis language
Estonian
Graduation Thesis type
Bachelor - Computer Science
Supervisor(s)
Jaak Vilo
Defence year
2016
 