Ridge regression and its applications in genetic studies
Document Type
Article
Publication Date
4-8-2021
Abstract
Advances in technology have made the analysis of large-scale gene expression data feasible, and such analysis has become very popular in the era of machine learning. This paper develops an improved ridge approach for genome regression modeling. When multicollinearity exists in a data set with outliers, we consider a robust ridge estimator, namely the rank ridge regression estimator, for parameter estimation and prediction. However, the efficiency of the rank ridge regression estimator depends strongly on the ridge parameter, and in general it is difficult to give a satisfactory answer on how to select it. Because of the good properties of generalized cross validation (GCV) and its simplicity, we use it to choose the optimum value of the ridge parameter. The GCV function balances the precision of the estimators against the bias caused by ridge estimation. It behaves like an improved estimator of risk and can be used in high-dimensional problems where the number of explanatory variables is larger than the sample size. Finally, some numerical illustrations are given to support our findings.
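As a rough illustration of the GCV idea mentioned in the abstract, the sketch below selects a ridge parameter for ordinary (least-squares) ridge regression, not the rank ridge estimator developed in the paper. It assumes centered `X` and `y` with no intercept, and uses the standard GCV criterion, the mean squared residual divided by (1 - tr(H)/n)^2, where H is the ridge hat matrix. The function name `gcv_ridge`, the candidate grid, and the synthetic data are hypothetical and chosen only for this example.

```python
import numpy as np

def gcv_ridge(X, y, ks):
    """Return the GCV-minimizing ridge parameter and all GCV scores.

    Assumes X and y are centered (no intercept). This is ordinary
    ridge regression, used here only to illustrate GCV; it is not
    the rank ridge estimator from the paper.
    """
    ks = np.asarray(ks, dtype=float)
    n = X.shape[0]
    # Thin SVD: the hat matrix is H(k) = U diag(s^2 / (s^2 + k)) U^T
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y
    scores = np.empty(len(ks))
    for i, k in enumerate(ks):
        d = s**2 / (s**2 + k)          # shrinkage factors; tr(H) = d.sum()
        resid = y - U @ (d * Uty)      # (I - H) y
        scores[i] = (resid @ resid / n) / (1.0 - d.sum() / n) ** 2
    return ks[np.argmin(scores)], scores

# Synthetic example with more variables than observations (p > n)
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 100))
X -= X.mean(axis=0)
beta = np.zeros(100)
beta[:5] = 2.0
y = X @ beta + rng.normal(size=40)
y -= y.mean()

best_k, scores = gcv_ridge(X, y, np.logspace(-2, 3, 30))
print("GCV-selected ridge parameter:", best_k)
```

Working through the singular value decomposition avoids forming and inverting a p-by-p matrix for every candidate value, which matters when the number of explanatory variables exceeds the sample size.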
Keywords
Ridge, Genome regression modeling, Ridge regression, Ridge estimation, Ridge parameter
Divisions
Mathematical Sciences
Funders
National Research Foundation (NRF) of South Africa SARChI Research Chair (IFR170227223754), National Research Foundation (NRF) of South Africa SARChI Research Chair (UID:71199), National Research Foundation (NRF) of South Africa SARChI Research Chair (109214), Universiti Malaya (RP009B-13AFR), Universiti Malaya (IIRG009C-19FNW)
Publication Title
PLoS ONE
Volume
16
Issue
4
Publisher
Public Library of Science
Publisher Location
1160 BATTERY STREET, STE 100, SAN FRANCISCO, CA 94111 USA