Many regression problems exhibit a natural grouping among predictor variables. Examples include groups of dummy variables representing categorical variables, and present and lagged values of time series data. Since model selection in such cases typically aims to select groups of variables rather than individual covariates, an extension of the popular least angle regression (LARS) procedure to groupwise variable selection is considered. Data sets occurring in applied statistics frequently contain outliers that do not follow the model or the majority of the data. Therefore, a modification of the groupwise LARS algorithm is introduced that reduces the influence of outlying data points. Simulation studies and a real data example demonstrate the excellent performance of groupwise LARS and, when outliers are present, of its robustification.

Keywords: Categorical variables, Model selection, Outliers, Time series
Persistent URL: dx.doi.org/10.1016/j.csda.2015.02.007, hdl.handle.net/1765/87806
Series: ERIM Top-Core Articles
Journal: Computational Statistics & Data Analysis
Organisation: Erasmus School of Economics

Alfons, A., Croux, C., & Gelper, S.E.C. (2016). Robust groupwise least angle regression. Computational Statistics & Data Analysis, 93, 421–435. doi:10.1016/j.csda.2015.02.007