TFRE: A Tuning-free Robust and Efficient Approach to High-dimensional Regression
[Wang2020] proposed the TFRE Lasso estimator for high-dimensional linear regression with heavy-tailed errors: $$\widehat{\boldsymbol{\beta}}(\lambda^{*}) = \arg\min_{\boldsymbol{\beta}}\frac{1}{n(n-1)}{\sum\sum}_{i\neq j}\left|(Y_i-\boldsymbol{x}_i^T\boldsymbol{\beta})-(Y_j-\boldsymbol{x}_j^T\boldsymbol{\beta})\right| + \lambda^{*}\sum_{k=1}^p|\beta_k|,$$ where \(\lambda^{*}\) is a tuning parameter that can be chosen independently of the error distribution. [Wang2020] suggests the choice $$\lambda^{*} = \mathrm{const}_{\lambda} \cdot G^{-1}_{\|\boldsymbol{S}_n\|_\infty}(1-\alpha_0),$$ where \(\boldsymbol{S}_n = -2[n(n-1)]^{-1}\sum_{j=1}^n\boldsymbol{x}_j[2r_j-(n+1)]\), \((r_1,\ldots,r_n)\) is uniformly distributed over the permutations of the integers \(\{1,\ldots,n\}\), and \(G^{-1}_{\|\boldsymbol{S}_n\|_\infty}(1-\alpha_0)\) denotes the \((1-\alpha_0)\)-quantile of the distribution of \(\|\boldsymbol{S}_n\|_\infty\).
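Because \(\boldsymbol{S}_n\) depends only on the design matrix and a random permutation of \(\{1,\ldots,n\}\), not on the errors, the quantile \(G^{-1}_{\|\boldsymbol{S}_n\|_\infty}(1-\alpha_0)\) can be approximated by Monte Carlo simulation. Below is a minimal NumPy sketch of that simulation; the function name `estimate_lambda_star` and the default values `const_lambda=1.01`, `alpha0=0.1`, `B=1000` are illustrative assumptions, not the package's API.

```python
import numpy as np

def estimate_lambda_star(X, const_lambda=1.01, alpha0=0.1, B=1000, rng=None):
    """Approximate the tuning parameter lambda* by Monte Carlo.

    S_n = -2 / (n(n-1)) * sum_j x_j * (2 r_j - (n+1)), where (r_1, ..., r_n)
    is a uniformly random permutation of {1, ..., n}.  lambda* is const_lambda
    times the (1 - alpha0)-quantile of ||S_n||_inf over B simulated permutations.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    sup_norms = np.empty(B)
    for b in range(B):
        r = rng.permutation(n) + 1                 # random ranks r_1, ..., r_n
        s_n = -2.0 / (n * (n - 1)) * (X.T @ (2 * r - (n + 1)))
        sup_norms[b] = np.max(np.abs(s_n))         # ||S_n||_inf
    return const_lambda * np.quantile(sup_norms, 1 - alpha0)
```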
In this package, the TFRE Lasso model is fitted via the QICD algorithm proposed in [PengWang2015]. To overcome the computational burden arising from the U-statistic structure of the loss function above, we apply the incomplete U-statistics resampling technique first proposed in [Clemencon2016].
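The idea of the incomplete U-statistic approximation is to evaluate the pairwise loss on a random subsample of index pairs rather than on all \(n(n-1)\) ordered pairs. The sketch below illustrates this idea only; it is not the package's internal implementation, and the pair budget `n_pairs` is a hypothetical argument.

```python
import numpy as np

def incomplete_rank_loss(beta, X, y, n_pairs=10_000, rng=None):
    """Approximate the pairwise absolute-difference loss with an incomplete
    U-statistic: average |e_i - e_j| over a random sample of pairs (i, j)
    instead of all n(n-1) ordered pairs."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    resid = y - X @ beta
    i = rng.integers(0, n, size=n_pairs)
    j = rng.integers(0, n, size=n_pairs)
    keep = i != j                                  # drop accidental ties i == j
    return np.mean(np.abs(resid[i[keep]] - resid[j[keep]]))
```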
[Wang2020] also proposed a second-stage enhancement that uses the TFRE Lasso estimator \(\widehat{\boldsymbol{\beta}}(\lambda^{*})\) as an initial estimator. It is defined as $$\widetilde{\boldsymbol{\beta}}^{(1)} = \arg\min_{\boldsymbol{\beta}}\frac{1}{n(n-1)}{\sum\sum}_{i\neq j}\left|(Y_i-\boldsymbol{x}_i^T\boldsymbol{\beta})-(Y_j-\boldsymbol{x}_j^T\boldsymbol{\beta})\right| + \sum_{k=1}^p p'_{\eta}\big(|\widehat{\beta}_{k}(\lambda^{*})|\big)|\beta_k|,$$ where \(p'_{\eta}(\cdot)\) denotes the derivative of a nonconvex penalty function \(p_{\eta}(\cdot)\) and \(\eta > 0\) is a tuning parameter. The package implements the second-stage enhancement with two popular nonconvex penalty functions, SCAD and MCP. The modified high-dimensional BIC criterion of [Wang2020] is employed for selecting \(\eta\). Define $$\mathrm{HBIC}(\eta) = \log\left\{{\sum\sum}_{i\neq j}\left|(Y_i-\boldsymbol{x}_i^T\widetilde{\boldsymbol{\beta}}_{\eta})-(Y_j-\boldsymbol{x}_j^T\widetilde{\boldsymbol{\beta}}_{\eta})\right|\right\} + |A_{\eta}|\,\frac{\log\log n}{n\cdot \mathrm{const\_hbic}}\,\log p,$$ where \(\widetilde{\boldsymbol{\beta}}_{\eta}\) denotes the second-stage estimator obtained with tuning parameter value \(\eta\), and \(|A_{\eta}|\) denotes the cardinality of the index set of the selected model. In this package, we select the value of \(\eta\) that minimizes \(\mathrm{HBIC}(\eta)\).
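For concreteness, the penalty derivatives used as second-stage weights take the standard forms \(p'_{\eta}(t)=\eta\,1\{t\le\eta\}+\frac{(a\eta-t)_+}{a-1}1\{t>\eta\}\) for SCAD (commonly \(a=3.7\)) and \(p'_{\eta}(t)=(\eta-t/a)_+\) for MCP (commonly \(a=3\)). The sketch below shows one way to compute these weights and the HBIC score for a candidate \(\eta\); the function names and the default `const_hbic=6.0` are illustrative assumptions rather than the package's interface.

```python
import numpy as np

def penalty_derivative(t, eta, method="SCAD", a=None):
    """Derivative p'_eta(t) of the SCAD or MCP penalty, evaluated elementwise."""
    t = np.abs(np.asarray(t, dtype=float))
    if method == "SCAD":
        a = 3.7 if a is None else a
        return np.where(t <= eta, eta, np.maximum(a * eta - t, 0.0) / (a - 1))
    if method == "MCP":
        a = 3.0 if a is None else a
        return np.maximum(eta - t / a, 0.0)
    raise ValueError("method must be 'SCAD' or 'MCP'")

def hbic(beta_eta, X, y, const_hbic=6.0, tol=1e-8):
    """Modified high-dimensional BIC for a second-stage estimate beta_eta."""
    n, p = X.shape
    resid = y - X @ beta_eta
    # full pairwise loss sum_{i != j} |e_i - e_j| (O(n^2); fine for moderate n)
    pairwise = np.abs(resid[:, None] - resid[None, :]).sum()
    model_size = np.count_nonzero(np.abs(beta_eta) > tol)   # |A_eta|
    return np.log(pairwise) + model_size * np.log(np.log(n)) / (n * const_hbic) * np.log(p)
```

The value of \(\eta\) would then be chosen by evaluating `hbic` over a grid of candidate \(\eta\) values and keeping the minimizer.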
References
Lan Wang, Bo Peng, Jelena Bradic, Runze Li & Yunan Wu (2020) A Tuning-free Robust and Efficient Approach to High-dimensional Regression, Journal of the American Statistical Association, 115:532, 1700-1714, DOI: 10.1080/01621459.2020.1840989.
Bo Peng & Lan Wang (2015) An Iterative Coordinate Descent Algorithm for High-Dimensional Nonconvex Penalized Quantile Regression, Journal of Computational and Graphical Statistics, 24:3, 676-694, DOI: 10.1080/10618600.2014.913516.
Stephan Clémençon, Igor Colin & Aurélien Bellet (2016) Scaling-up Empirical Risk Minimization: Optimization of Incomplete U-statistics, The Journal of Machine Learning Research, 17(1), 2682-2717.