Integrative machine learning analysis of multiple gene expression profiles in cervical cancer
Document Type
Article
Publication Date
1-1-2018
Abstract
Although most cervical cancer cases are reported to be closely related to Human Papillomavirus (HPV) infection, there is a need to study the genes that are differentially expressed when cervical cancer ultimately develops following HPV infection. In this study, we propose an integrative machine learning approach to analyse multiple gene expression profiles in cervical cancer in order to identify a set of genetic markers that are associated with cervical cancer and may eventually aid in its diagnosis or prognosis. The proposed integrative analysis is composed of three steps: (i) gene expression analysis of each individual dataset; (ii) meta-analysis of multiple datasets; and (iii) feature selection and machine learning analysis. As a result, 21 gene expressions were identified through the integrative machine learning analysis, which included seven supervised methods and one unsupervised method. A functional analysis with GSEA (Gene Set Enrichment Analysis) was performed on the selected 21-gene expression set and showed significant enrichment in a potential nine-gene expression signature, namely PEG3, SPON1, BTD and RPLP2 (upregulated genes) and PRDX3, COPB2, LSM3, SLC5A3 and AS1B (downregulated genes).
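The three steps named in the abstract can be illustrated with a minimal sketch on synthetic data. This is not the authors' pipeline: the Welch t-test, Fisher's combined p-value method and the nearest-centroid classifier are stand-ins chosen for brevity, and every function name, dataset size and effect size below is a hypothetical assumption.

```python
import numpy as np
from scipy import stats


def simulate_dataset(rng, n_genes, n_per_class, informative, shift=2.0):
    """Synthetic stand-in for one expression dataset: (samples x genes, 0/1 labels)."""
    y = np.repeat([0, 1], n_per_class)
    X = rng.normal(size=(2 * n_per_class, n_genes))
    # Shift the hypothetical marker genes upward in class-1 ("tumour") samples.
    X[np.ix_(y == 1, informative)] += shift
    return X, y


def per_dataset_pvalues(X, y):
    """Step (i): per-gene Welch t-test between the two classes of one dataset."""
    _, p = stats.ttest_ind(X[y == 1], X[y == 0], axis=0, equal_var=False)
    return p


def fisher_meta(pvalue_list):
    """Step (ii): combine per-dataset p-values gene by gene (Fisher's method)."""
    P = np.vstack(pvalue_list)                      # datasets x genes
    statistic = -2.0 * np.log(P).sum(axis=0)        # chi-square with 2k d.o.f.
    return stats.chi2.sf(statistic, df=2 * P.shape[0])


def select_top_genes(meta_p, k):
    """Step (iii), selection: keep the k genes with the smallest combined p-value."""
    return np.argsort(meta_p)[:k]


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, genes):
    """Step (iii), learning: toy nearest-centroid classifier on the selected genes."""
    c0 = X_train[y_train == 0][:, genes].mean(axis=0)
    c1 = X_train[y_train == 1][:, genes].mean(axis=0)
    Z = X_test[:, genes]
    closer_to_c1 = np.linalg.norm(Z - c1, axis=1) < np.linalg.norm(Z - c0, axis=1)
    return float((closer_to_c1.astype(int) == y_test).mean())


rng = np.random.default_rng(0)
informative = np.arange(5)  # hypothetical "true" marker genes (assumption)

# Three synthetic datasets play the role of the multiple expression profiles.
datasets = [simulate_dataset(rng, 200, 20, informative) for _ in range(3)]
meta_p = fisher_meta([per_dataset_pvalues(X, y) for X, y in datasets])
selected = select_top_genes(meta_p, k=5)

# Evaluate on a separate synthetic dataset so that gene selection
# and accuracy estimation do not share samples.
X_train = np.vstack([X for X, _ in datasets])
y_train = np.concatenate([y for _, y in datasets])
X_test, y_test = simulate_dataset(rng, 200, 20, informative)
accuracy = nearest_centroid_accuracy(X_train, y_train, X_test, y_test, selected)

print("selected gene indices:", sorted(selected.tolist()))
print("held-out accuracy:", accuracy)
```

Holding out a separate dataset for evaluation is a design choice of this sketch: selecting genes and estimating accuracy on the same samples would bias the accuracy upward.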
Keywords
Cervical cancer prognosis, Feature selection, Gene expression profiling, Machine learning, Meta-analysis, Potential gene signature
Divisions
fac_eng, fac_med, Institute of Biological Sciences
Funders
University of Malaya research grants (project numbers RP038C-15AET and BK041-2014)
Publication Title
PeerJ
Volume
6
Publisher
PeerJ