TFRE: A Tuning-free Robust and Efficient Approach to High-dimensional Regression

[Wang2020] proposed the TFRE Lasso estimator for high-dimensional linear regression with heavy-tailed errors: $$\widehat{\boldsymbol{\beta}}(\lambda^{*}) = \arg\min_{\boldsymbol{\beta}}\frac{1}{n(n-1)}{\sum\sum}_{i\neq j}\left|(Y_i-\boldsymbol{x}_i^T\boldsymbol{\beta})-(Y_j-\boldsymbol{x}_j^T\boldsymbol{\beta})\right| + \lambda^{*}\sum_{k=1}^p|\beta_k|,$$ where \(\lambda^{*}\) is a tuning parameter whose value can be simulated without knowledge of the error distribution. [Wang2020] suggests the choice $$\lambda^{*} = \text{const}_{\lambda} \cdot G^{-1}_{\|\boldsymbol{S}_n\|_\infty}(1-\alpha_0),$$ where \(\boldsymbol{S}_n = -2[n(n-1)]^{-1}\sum_{j=1}^n\boldsymbol{x}_j[2r_j-(n+1)]\), \((r_1,\ldots,r_n)\) is uniformly distributed on the set of permutations of the integers \(\{1,\ldots,n\}\), and \(G^{-1}_{\|\boldsymbol{S}_n\|_\infty}(1-\alpha_0)\) denotes the \((1-\alpha_0)\)-quantile of the distribution of \(\|\boldsymbol{S}_n\|_\infty\).
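To illustrate the tuning-free step, here is a minimal sketch that approximates \(G^{-1}_{\|\boldsymbol{S}_n\|_\infty}(1-\alpha_0)\) by drawing random permutations and taking the empirical \((1-\alpha_0)\)-quantile of \(\|\boldsymbol{S}_n\|_\infty\). The function name `simulate_lambda_star` and the defaults for `alpha0`, `const_lambda`, and `n_sim` are illustrative assumptions, not the package's API:

```python
import numpy as np

def simulate_lambda_star(X, alpha0=0.1, const_lambda=1.01, n_sim=1000, rng=None):
    """Approximate lambda* = const_lambda * G^{-1}_{||S_n||_inf}(1 - alpha0)
    by Monte Carlo over random permutations (all defaults are illustrative)."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    sup_norms = np.empty(n_sim)
    for b in range(n_sim):
        r = rng.permutation(n) + 1                        # ranks r_1..r_n in {1,...,n}
        s_n = -2.0 / (n * (n - 1)) * (X.T @ (2 * r - (n + 1)))
        sup_norms[b] = np.max(np.abs(s_n))                # ||S_n||_inf for this draw
    return const_lambda * np.quantile(sup_norms, 1 - alpha0)
```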

In this package, the TFRE Lasso model is fitted via the QICD algorithm proposed in [PengWang2015]. To overcome the computational burden arising from the U-statistic structure of the loss function above, we apply the incomplete U-statistics resampling technique of [Clemencon2016].
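The sketch below shows the idea behind the incomplete U-statistic approximation: rather than summing \(|(Y_i-\boldsymbol{x}_i^T\boldsymbol{\beta})-(Y_j-\boldsymbol{x}_j^T\boldsymbol{\beta})|\) over all \(n(n-1)\) pairs, the loss is averaged over a random subsample of pairs. The helper `rank_lasso_loss` and its arguments are hypothetical, for exposition only:

```python
import numpy as np

def rank_lasso_loss(beta, X, y, lam, n_pairs=None, rng=None):
    """Rank-based Lasso objective. When n_pairs is set, the double sum over
    all i != j is replaced by an average over n_pairs randomly sampled pairs
    (an incomplete U-statistic); otherwise all pairs are used."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    resid = y - X @ beta
    if n_pairs is None:
        # complete U-statistic: diagonal terms (i == j) are zero, so they
        # contribute nothing to the sum over all ordered pairs
        diffs = resid[:, None] - resid[None, :]
        loss = np.abs(diffs).sum() / (n * (n - 1))
    else:
        # incomplete U-statistic: sample pairs with replacement, drop i == j
        i = rng.integers(0, n, size=n_pairs)
        j = rng.integers(0, n, size=n_pairs)
        keep = i != j
        loss = np.abs(resid[i[keep]] - resid[j[keep]]).mean()
    return loss + lam * np.abs(beta).sum()
```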

[Wang2020] also proposed a second-stage enhancement that uses the TFRE Lasso estimator \(\widehat{\boldsymbol{\beta}}(\lambda^{*})\) as an initial estimator: $$\widetilde{\boldsymbol{\beta}}_{\eta} = \arg\min_{\boldsymbol{\beta}}\frac{1}{n(n-1)}{\sum\sum}_{i\neq j}\left|(Y_i-\boldsymbol{x}_i^T\boldsymbol{\beta})-(Y_j-\boldsymbol{x}_j^T\boldsymbol{\beta})\right| + \sum_{k=1}^p p'_{\eta}(|\widehat{\beta}_{k}(\lambda^{*})|)|\beta_k|,$$ where \(p'_{\eta}(\cdot)\) denotes the derivative of a nonconvex penalty function \(p_{\eta}(\cdot)\) and \(\eta > 0\) is a tuning parameter. The package implements this second-stage enhancement with two popular nonconvex penalty functions: SCAD and MCP. The modified high-dimensional BIC criterion of [Wang2020] is employed to select \(\eta\): $$\mathrm{HBIC}(\eta) = \log\left\{{\sum\sum}_{i\neq j}\left|(Y_i-\boldsymbol{x}_i^T\widetilde{\boldsymbol{\beta}}_{\eta})-(Y_j-\boldsymbol{x}_j^T\widetilde{\boldsymbol{\beta}}_{\eta})\right|\right\} + |A_{\eta}| \frac{\log\log n}{n \cdot \text{const\_hbic}}\log p,$$ where \(\widetilde{\boldsymbol{\beta}}_{\eta}\) denotes the second-stage estimator obtained with tuning parameter value \(\eta\), and \(|A_{\eta}|\) denotes the cardinality of the index set of the selected model. The package selects the value of \(\eta\) that minimizes \(\mathrm{HBIC}(\eta)\).
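As a concrete illustration, the sketch below writes out the SCAD and MCP derivatives \(p'_{\eta}(\cdot)\) (with the common defaults \(a = 3.7\) for SCAD and \(a = 3\) for MCP) and evaluates HBIC for a candidate fit. The helper names and the default value of `const_hbic` are assumptions for exposition, not the package's exported interface:

```python
import numpy as np

def scad_deriv(t, eta, a=3.7):
    """Derivative of the SCAD penalty at |t|; a = 3.7 is the usual default."""
    t = np.abs(t)
    return eta * ((t <= eta)
                  + np.maximum(a * eta - t, 0) / ((a - 1) * eta) * (t > eta))

def mcp_deriv(t, eta, a=3.0):
    """Derivative of the MCP penalty at |t|: (eta - |t|/a)_+."""
    return np.maximum(eta - np.abs(t) / a, 0)

def hbic(beta, X, y, const_hbic=6.0):
    """Modified high-dimensional BIC for a candidate second-stage fit
    (const_hbic default is an illustrative assumption)."""
    n, p = X.shape
    resid = y - X @ beta
    # double sum over i != j; diagonal terms are zero, so including them is harmless
    rank_loss = np.abs(resid[:, None] - resid[None, :]).sum()
    df = np.count_nonzero(beta)                 # |A_eta|, size of the selected model
    return np.log(rank_loss) + df * np.log(np.log(n)) / (n * const_hbic) * np.log(p)
```

In practice one would evaluate `hbic` on the second-stage fit for each candidate \(\eta\) on a grid and keep the minimizer.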

References

[Wang2020]

Lan Wang, Bo Peng, Jelena Bradic, Runze Li & Yunan Wu (2020) A Tuning-free Robust and Efficient Approach to High-dimensional Regression, Journal of the American Statistical Association, 115:532, 1700-1714, DOI: 10.1080/01621459.2020.1840989.

[PengWang2015]

Bo Peng & Lan Wang (2015) An Iterative Coordinate Descent Algorithm for High-Dimensional Nonconvex Penalized Quantile Regression, Journal of Computational and Graphical Statistics, 24:3, 676-694, DOI: 10.1080/10618600.2014.913516.

[Clemencon2016]

Stephan Clemencon, Igor Colin & Aurelien Bellet (2016) Scaling-up Empirical Risk Minimization: Optimization of Incomplete U-statistics, The Journal of Machine Learning Research, 17(1), 2682-2717.