Stacking and chaining of normalization methods in deep learning-based classification of colorectal cancer using gut microbiome data
Document Type
Article
Publication Date
1-1-2021
Abstract
Machine learning (ML)-based detection of diseases using sequence-based gut microbiome data has been of great interest within the artificial intelligence in medicine (AIM) community. The approach offers a non-invasive alternative for colorectal cancer (CRC) detection, based on stool samples. Given the limitations of existing CRC detection methods, medical research has shown interest in using high-throughput data to identify the disease. Because conventional ML algorithms have several limitations, deep learning (DL) methods are becoming more popular, owing to their outstanding performance in related fields. However, the performance of DL methods is affected by limitations such as dimensionality, sparsity, and feature dominance inherent in microbiome data. This research proposes stacking and chaining of normalization methods to address these limitations. While the stacking technique offers a robust, easy-to-use, and interpretable alternative for augmenting microbiome and other tabular data, the chaining technique is an alternative to data normalization that dynamically adjusts the underlying properties of data towards the normal distribution. The proposed techniques are combined with rank transformation and feature selection to further improve the performance of the model, achieving area under the curve (AUC) values between 0.857 and 0.987 on publicly available datasets.
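The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which normalization methods are used, in what order they are chained, or along which axis stacking is performed. The sketch assumes that chaining means applying normalization steps one after another, and that stacking means concatenating differently normalized copies of the same table feature-wise. The function names (`total_sum_scale`, `log1p_transform`, `zscore`, `chain`, `stack`) and the toy count matrix are hypothetical.

```python
import numpy as np

def total_sum_scale(x):
    # Convert each sample (row) of counts to relative abundances.
    totals = x.sum(axis=1, keepdims=True)
    return x / np.where(totals == 0, 1, totals)

def log1p_transform(x):
    # Compress large values; log1p keeps zeros at zero.
    return np.log1p(x)

def zscore(x):
    # Standardize each feature (column); constant columns stay at zero.
    std = x.std(axis=0, keepdims=True)
    return (x - x.mean(axis=0, keepdims=True)) / np.where(std == 0, 1, std)

def chain(x, steps):
    # Assumed meaning of "chaining": feed the output of one
    # normalization step into the next.
    for step in steps:
        x = step(x)
    return x

def stack(x, methods):
    # Assumed meaning of "stacking": normalize the same table in several
    # ways and concatenate the results along the feature axis.
    return np.hstack([method(x) for method in methods])

# Toy samples-by-taxa count matrix (made-up values, not real data).
counts = np.array([[10, 0, 5],
                   [0, 20, 0],
                   [3, 3, 3],
                   [0, 0, 50]], dtype=float)

chained = chain(counts, [total_sum_scale, log1p_transform, zscore])
stacked = stack(counts, [total_sum_scale, log1p_transform])
```

Under these assumptions, `chained` keeps the original number of features, while `stacked` has one block of columns per normalization method, which is one way a tabular dataset could be augmented before training a classifier.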
Keywords
Data models, Feature extraction, Stacking, Cancer, Prediction algorithms, Classification algorithms, Sensitivity, Deep neural network, Colorectal cancer, Microbiome, Normalization, Augmentation, Chaining
Divisions
fsktm
Funders
Malaysia's Ministry of Higher Education through the Research Grant by the University of Malaya, under the Trans-Discipline Research Grant Scheme (TR001D-2018A)
Publication Title
IEEE Access
Volume
9
Publisher
Institute of Electrical and Electronics Engineers
Publisher Location
445 HOES LANE, PISCATAWAY, NJ 08855-4141 USA