Parallel Pattern Discovery

Name
Egon Elbre
Abstract
An interesting research problem in dataset analysis is the discovery of patterns. Patterns can show how a dataset was formed and how it repeats itself. Because data collection is growing quickly, there is a need for algorithms that scale with the data. In this thesis we examine how to take an existing algorithm and make it parallel using three ideas: generalization, decomposition, and reification of the existing algorithm. We apply these ideas to SPEXS, a pattern discovery algorithm, and derive a new algorithm, SPEXS2, which we also implement. We also analyze several problems that arise when implementing a generic algorithm. The ideas described could be used to parallelize other algorithms as well.
Graduation Thesis language
English
Graduation Thesis type
Master - Computer Science
Supervisor(s)
Jaak Vilo
Defence year
2013
 