Learning DNA mutational signatures using neural networks

Lauri Tammeveski
All cancers are caused by mutations in the cells of an organism.
These mutations have been found to result from a combination of
specific mutational signatures, many of which have known underlying
processes.
Learning these signatures is therefore important: it can give better
information about the mechanisms of cancers and can also help with
cancer prevention and therapy. The aim of this thesis is to test and
compare different methods for improving the discovery of mutational
signatures. In particular, we compared three new methods, neural
networks (NN), rectified factor networks (RFN) and topic modelling,
with the currently used non-negative matrix factorization (NMF). We
experimented with the methods on three organic and three synthetic
data sets, measuring reconstruction error, sparsity and running time,
and compared the results with those of NMF. The results show that NMF
produces the smallest error on the easier data sets, where the error
of RFN is comparably low, while RFN produces the best result on all
other data sets. NN performs as well as RFN on the more difficult data
sets and overall produces the sparsest results. The advantage of NMF
is its stability functionality, which determines the correct number of
signatures very well. Future work is needed to add this capability to
the RFN and NN methods, which would enable their practical use for
finding mutational signatures.
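
As a rough illustration of the baseline and two of the metrics named in the abstract, the sketch below factorizes a small synthetic mutation catalogue with NMF and reports reconstruction error, sparsity and running time. It is not the thesis implementation: the multiplicative update rule, the toy data with six mutation categories and two signatures, the relative sparsity cutoff, and all function and variable names are assumptions made for this example.

```python
import random
import time


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Reconstruction error: Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)) ** 0.5


def sparsity(m, rel_tol=1e-3):
    """Fraction of entries that are negligible relative to the largest entry.

    The cutoff rel_tol is an arbitrary choice for this sketch.
    """
    flat = [x for row in m for x in row]
    cutoff = rel_tol * max(flat)
    return sum(1 for x in flat if x <= cutoff) / len(flat)


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factorize V (types x samples) into W (types x k) and H (k x samples).

    Uses multiplicative updates for the squared-error objective; both
    factors stay non-negative because every update multiplies by a
    non-negative ratio.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


# Toy catalogue (assumed data): 6 mutation categories, 2 signatures, 8 samples.
data_rng = random.Random(1)
true_signatures = [[0.6, 0.0], [0.3, 0.0], [0.1, 0.1],
                   [0.0, 0.2], [0.0, 0.3], [0.0, 0.4]]
true_exposures = [[data_rng.randint(10, 200) for _ in range(8)] for _ in range(2)]
catalogue = matmul(true_signatures, true_exposures)

start = time.perf_counter()
signatures, exposures = nmf(catalogue, k=2)
elapsed = time.perf_counter() - start

print("reconstruction error:", frobenius_error(catalogue, signatures, exposures))
print("signature sparsity:", sparsity(signatures))
print("seconds taken:", elapsed)
```

The number of signatures `k` is fixed by hand here. Choosing it automatically is the capability the abstract credits to the stability functionality of NMF, and this sketch does not attempt it.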
Graduation Thesis type
Master - Computer Science
Raul Vicente Zafra, Leopold Parts, Tambet Matiisen, Ardi Tampuu