![]() Hence, the goal in ridge regression is to find the value of λ that balances the model between underfitting and overfitting, producing generalizable results. According to the idea of the bias-variance trade-off, the expected prediction error can be decomposed into the three components bias, variance and irreducible error. The multiplier λ controls the strength of the penalty, i.e. A common way of shrinkage is by ridge logistic regression where the penalty is defined as minus the square of the Euclidean norm of the coefficients multiplied by a non-negative complexity parameter λ. In theory, a straightforward approach to alleviate the problem would be to apply penalized maximum likelihood logistic regression: a penalty term that is added to the log likelihood function provides shrinkage of the coefficients towards zero, hereby decreasing the variance of the maximum likelihood estimates and stabilizing the predictions by pulling them towards the observed event rate. These attractive properties of the maximum likelihood logistic regression, however, vanish when the sample size is small or the prevalence of one of the two outcome levels (for some combination of exposure) is low, yielding coefficient estimates biased away from zero and very unstable predictions that generalize poorly on a new dataset from the same population. ![]() Thus, maximum likelihood logistic regression may be used for explanation or prediction, depending on context. interpretability of effect estimates, as well as accuracy of predictions given the covariates. For a dataset with similar prevalence of the two outcome levels and sufficient sample size, the maximum likelihood estimation of the regression coefficients facilitates inference, i.e. In medical research, logistic regression is commonly used to study the relationship between a binary outcome and a set of covariates. In contrast, determining the degree of shrinkage according to some meaningful prior assumptions about true effects has the potential to reduce bias and stabilize the estimates. ConclusionsĪpplying tuned ridge regression in small or sparse datasets is problematic as it results in unstable coefficients and predictions. In contrast, in our simulations pre-specifying the degree of shrinkage prior to fitting led to accurate coefficients and predictions even in non-ideal settings such as encountered in the context of rare outcomes or sparse predictors. As shown in our simulation and illustrated by a data example, values optimized in small or sparse datasets are negatively correlated with optimal values and suffer from substantial variability which translates into large MSE of coefficients and large variability of calibration slopes. Performance of ridge regression strongly depends on the choice of complexity parameter. We included ‘oracle’ models in the simulation study in which the complexity parameter was chosen based on the true event probabilities (prediction oracle) or regression coefficients (explanation oracle) to demonstrate the capability of ridge regression if truth was known. ![]() In addition to tuned ridge regression where the penalty strength is estimated from the data by minimizing some measure of the out-of-sample prediction error or information criterion, we also considered ridge regression with pre-specified degree of shrinkage. In this paper, we elaborate this issue further by performing a comprehensive simulation study, investigating the performance of ridge logistic regression in terms of coefficients and predictions and comparing it to Firth’s correction that has been shown to perform well in low-dimensional settings. There is evidence, however, that ridge logistic regression can result in highly variable calibration slopes in small or sparse data situations. For finite samples with binary outcomes penalized logistic regression such as ridge logistic regression has the potential of achieving smaller mean squared errors (MSE) of coefficients and predictions than maximum likelihood estimation.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |