Ridge regression and its applications in genetic studies

Document Type

Article

Publication Date

4-8-2021

Abstract

With advances in technology, the analysis of large-scale gene expression data has become feasible and very popular in the era of machine learning. This paper develops an improved ridge approach for genome regression modeling. When multicollinearity and outliers are both present in the data set, we consider a robust ridge estimator, namely the rank ridge regression estimator, for parameter estimation and prediction. The efficiency of the rank ridge regression estimator, however, depends strongly on the ridge parameter, and in general it is difficult to give a satisfactory answer about how to select it. Because of the good properties of generalized cross validation (GCV) and its simplicity, we use it to choose the optimum value of the ridge parameter. The GCV function creates a balance between the precision of the estimators and the bias caused by the ridge estimation. It behaves like an improved estimator of risk and can be used in high-dimensional problems where the number of explanatory variables is larger than the sample size. Finally, some numerical illustrations are given to support our findings.
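
As a rough illustration of how GCV can select a ridge parameter, the sketch below scores a grid of candidate values and keeps the minimizer. It is only a sketch under stated assumptions: it uses the ordinary least-squares ridge estimator, not the rank ridge regression estimator studied in the article, and the function name `ridge_gcv`, the candidate grid `ks`, and the textbook GCV criterion it evaluates are illustrative choices that do not come from the paper.

```python
import numpy as np


def ridge_gcv(X, y, ks):
    """Pick a ridge parameter by generalized cross validation (GCV).

    Assumption: ordinary (least-squares) ridge, NOT the rank-based
    estimator of the article. `ks` is a hypothetical grid of positive
    candidate ridge parameters.
    """
    n = X.shape[0]
    # Thin SVD, X = U diag(s) Vt; this also works when p > n.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y

    scores = []
    for k in ks:
        d = s**2 / (s**2 + k)          # shrinkage factors of the hat matrix
        resid = y - U @ (d * Uty)      # residuals y - H(k) y
        df = d.sum()                   # effective degrees of freedom, tr H(k)
        # GCV(k) = (RSS / n) / (1 - tr H(k) / n)^2
        scores.append((resid @ resid / n) / (1.0 - df / n) ** 2)

    scores = np.asarray(scores)
    best_k = ks[int(np.argmin(scores))]
    # Ridge coefficients at the selected k: V diag(s / (s^2 + k)) U'y
    beta = Vt.T @ (s / (s**2 + best_k) * Uty)
    return best_k, beta, scores


# Synthetic high-dimensional example (p > n); the data are made up.
rng = np.random.default_rng(0)
n, p = 20, 50
X = rng.standard_normal((n, p))
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.standard_normal(n)
ks = np.logspace(-2, 3, 30)
best_k, beta, scores = ridge_gcv(X, y, ks)
```

Working through the singular value decomposition avoids forming and inverting a p x p matrix for every candidate value, which matters when the number of explanatory variables is much larger than the sample size.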

Keywords

Ridge, Genome regression modeling, Ridge regression, Ridge estimation, Ridge parameter

Divisions

Mathematical Sciences

Funders

National Research Foundation (NRF) of South Africa SARChI Research Chair (IFR170227223754), National Research Foundation (NRF) of South Africa SARChI Research Chair (UID:71199), National Research Foundation (NRF) of South Africa SARChI Research Chair (109214), Universiti Malaya (RP009B-13AFR), Universiti Malaya (IIRG009C-19FNW)

Publication Title

PLoS ONE

Volume

16

Issue

4

Publisher

Public Library of Science

Publisher Location

1160 BATTERY STREET, STE 100, SAN FRANCISCO, CA 94111 USA

This document is currently not available here.
