Feature extension of gut microbiome data for deep neural network-based colorectal cancer classification
Document Type
Article
Publication Date
1-1-2021
Abstract
Colorectal cancer (CRC) is the third most deadly cancer worldwide. Using the gut microbiome for early detection of the disease has attracted much attention from the research community, mainly because the approach is noninvasive. Recent advances in next-generation sequencing technology have increased the availability of sequence data and fostered the growth of gut microbiome research. Conventional machine learning algorithms for automatic, microbiome-based detection of CRC are limited by factors such as low accuracy and the need for manual feature selection. Despite their success in other fields, deep neural network (DNN) algorithms also face limitations in microbiome-based CRC classification. These include the high dimensionality of microbiome data and other characteristics of sequence data, such as feature dominance. In this paper, we propose a feature augmentation approach that aggregates data normalization methods to extend the existing features of a dataset. The proposed method combines feature extension with data augmentation to improve the CRC classification performance of a DNN model. The proposed model obtained area under the curve (AUC) scores of 0.96 and 0.89 on two publicly available microbiome datasets.
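The idea of extending features by aggregating normalization methods can be illustrated with a minimal sketch. The abstract does not name the normalizations used, so the three chosen here (total-sum scaling, a log transform, and a centered log-ratio transform) are assumptions, as are the function name `extend_features`, the `pseudocount` parameter, and the toy count matrix. This is not the authors' implementation.

```python
import numpy as np


def extend_features(counts, pseudocount=1.0):
    """Concatenate several normalized views of a (samples x taxa) count matrix.

    Hypothetical helper: the choice of normalizations is an assumption,
    not taken from the paper.
    """
    counts = np.asarray(counts, dtype=float)

    # Total-sum scaling: convert each sample's counts to relative abundances.
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty samples
    tss = counts / totals

    # Log transform: compresses the range so dominant taxa weigh less.
    log_counts = np.log1p(counts)

    # Centered log-ratio: log counts relative to each sample's mean log count.
    log_pseudo = np.log(counts + pseudocount)
    clr = log_pseudo - log_pseudo.mean(axis=1, keepdims=True)

    # Feature extension: stack the views side by side, tripling the width.
    return np.hstack([tss, log_counts, clr])


# Toy data: 2 samples x 3 taxa (made up for illustration).
counts = np.array([[10, 0, 90],
                   [5, 5, 0]])
extended = extend_features(counts)
print(extended.shape)
```

Each original feature then appears once per normalization, so a model sees the same taxon on several scales. In a setup like the one described, the extended matrix would be passed to the data augmentation step and then to the DNN classifier.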
Keywords
Classification algorithms, Feature extraction, Microorganisms, Cancer, Neural networks, Data models, Sequential analysis, Colorectal cancer, Deep neural network, Feature dominance, Gut microbiome, Normalization, Feature extension
Divisions
ai,fac_med,InstituteofBiologicalSciences
Funders
Malaysia's Ministry of Higher Education, through the University of Malaya (TR001D-2018A)
Publication Title
IEEE Access
Volume
9
Publisher
Institute of Electrical and Electronics Engineers
Publisher Location
445 HOES LANE, PISCATAWAY, NJ 08855-4141 USA