A novel method for detecting SNV genotypes from personal genome sequencing data

Name
Fanny-Dhelia Pajuste
Abstract
Studies of genome variation are important in many areas, such as personalized medicine, evolutionary analysis, and bacterial strain identification. Single nucleotide variants (SNVs) are the most thoroughly studied variations in the genome and are associated with different traits and diseases. Genomic studies depend greatly on the ability to detect which allele variants of these variations are present in a personal genome. However, the methods currently used for calling SNV genotypes from personal sequencing data are neither very fast nor very reliable. The aim of this master's thesis was to develop a novel method for detecting SNV genotypes quickly and reliably, using a new approach that omits the often error-prone read mapping step of general variant calling pipelines. A k-mer based approach to detecting SNV genotypes was introduced in this study. A method was developed that uses the unique k-mers covering the SNV locations for the different allele variants to identify the genotypes of these SNVs. A program was created for compiling a list of unique k-mers for the allele variants of given SNVs, and the method was tested using a program that detects the genotypes of these SNVs from personal genome sequencing data. The method was tested on both simulated and real sequencing data, and its memory and time usage were measured. Some recommendations were made for future work to reduce the time usage of the program and to improve the detection of SNV genotypes.
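The idea in the abstract (genotyping an SNV from the k-mers that cover it, without mapping reads) can be illustrated with a minimal sketch. This is not the thesis's program: the function names, the toy reference, the value k=5, the `min_count` threshold, and the simple uniqueness filter are all assumptions made for illustration, and the sketch handles forward-strand reads only.

```python
from collections import Counter


def allele_kmers(reference, pos, allele, k):
    """Return the k-mers covering 0-based position `pos` with `allele` substituted in."""
    seq = reference[:pos] + allele + reference[pos + 1:]
    first = max(0, pos - k + 1)
    last = min(pos, len(seq) - k)
    return {seq[i:i + k] for i in range(first, last + 1)}


def count_kmers(sequences, k):
    """Count every k-mer occurring in the given sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def call_genotype(reference, pos, alleles, reads, k=5, min_count=1):
    """Call a genotype from read k-mer counts (hypothetical decision rule).

    An allele counts as present if its unique k-mers are seen at least
    `min_count` times in total. Returns a 2-tuple of alleles, or None
    when the evidence supports no allele or more than two.
    """
    read_counts = count_kmers(reads, k)
    background = count_kmers([reference], k)
    present = []
    for allele in alleles:
        # Assumed uniqueness filter: a reference-allele k-mer must occur
        # exactly once in the reference, any other allele's k-mer not at all.
        expected = 1 if allele == reference[pos] else 0
        unique = [km for km in allele_kmers(reference, pos, allele, k)
                  if background[km] == expected]
        support = sum(read_counts[km] for km in unique)
        if support >= min_count:
            present.append(allele)
    present.sort()
    if len(present) == 1:
        return (present[0], present[0])  # homozygous
    if len(present) == 2:
        return (present[0], present[1])  # heterozygous
    return None


# Toy data: the SNV sits at 0-based position 8 (reference allele "A").
reference = "ACGTTGCAAGGCTTACGA"
ref_read = "GTTGCAAGGCT"  # carries allele A
alt_read = "GTTGCATGGCT"  # carries allele T
heterozygous = call_genotype(reference, 8, ["A", "T"], [ref_read, alt_read])
homozygous_alt = call_genotype(reference, 8, ["A", "T"], [alt_read])
```

A practical tool would also have to handle reverse-complement k-mers, sequencing errors, and coverage-dependent thresholds, none of which are modelled here.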
Graduation Thesis language
English
Graduation Thesis type
Master - Computer Science
Supervisor(s)
Maido Remm
Defence year
2015
 