Description and Application of Gene Expression Data Analysis Method Barcode
Name
Sander Tars
Abstract
The main goal of this thesis is to assess whether the gene expression data analysis method
Barcode offers an improvement over the fRMA method and to visualise the difference clearly.
First, the descriptive part of this thesis focuses on the gene expression data analysis
method Barcode. Barcode is explained through an overview of its different versions.
For each version, a description of its functionality and possible uses is given, with
emphasis on functionality that is new compared to older versions.
Second, the practical part of this thesis compares Barcode with the fRMA method (the
output of fRMA is the starting point for Barcode analysis). To compare the two methods,
a human gene expression dataset of DNA microarray experiment results is used. The
dataset E-TAB-145 contains expression data from 158 human tissue samples. The tissue samples
are first clustered manually to serve as a reference for comparing the two methods.
The data is then analysed with both Barcode and fRMA. To visualise and compare the results,
two statistical methods are used separately: principal component analysis and hierarchical
clustering. A detailed analysis is given of the results of both statistical methods.
The analysis concludes that Barcode does offer an improvement over fRMA: it allows
samples to be classified into clusters better, in that samples of the same tissue type
are separated more clearly from other samples than with fRMA.
Graduation Thesis language
English
Graduation Thesis type
Bachelor - Computer Science
Supervisor(s)
Anna Ufliand, Priit Adler
Defence year
2016