Designing a Robust and Portable Workflow for Detecting Genetic Variants Associated with Molecular Phenotypes Across Multiple Studies

Nurlan Kerimov
Quantitative trait locus (QTL) analysis links variation in the levels of molecular phenotypes to genotype variation. It has become standard practice for understanding the molecular mechanisms underlying complex traits and diseases. A typical QTL analysis consists of multiple steps. Although a diverse set of tools is available to perform these individual analyses, the tools have so far not been integrated into a reproducible and scalable workflow that is easy to use across a wide range of computational environments. Our analysis workflow consists of three modules. The analysis starts with quantification of the phenotype of interest, proceeds with normalisation and quality control, and finishes with QTL mapping. For the phenotype quantification and QTL mapping modules, we developed pipelines following the best practices of the nf-core framework. The pipelines are containerized, open-source and extensible, and can be executed in parallel in a variety of computational environments. For the quality control module, we developed a script that automatically computes quality measures and reports them to the user. As a proof of concept, we uniformly processed more than 40 context-specific groups from more than 15 studies and discovered at least one significant eQTL for more than 9000 genes. We believe that adopting our pipelines will increase the reproducibility, portability and robustness of QTL analysis in comparison to existing approaches.
Graduation Thesis language
Graduation Thesis type
Master - Software Engineering
Kaur Alasoo
Defence year