*********************************
There is now a CONTENT FREEZE for Mercury while we switch to a new platform. It began on Friday, March 10 at 6pm and will end on Wednesday, March 15 at noon. No new content can be created during this time, but all material in the system as of the beginning of the freeze will be migrated to the new platform, including users and groups. Functionally the new site is identical to the old one. webteam@gatech.edu
*********************************
Abstract: Sparse statistical estimators are increasingly prevalent due to their ease of interpretability and superior out-of-sample performance. However, sparse estimation problems with an L0 constraint, restricting the support of the estimators, are challenging (typically NP-hard, but not always) non-convex optimization problems. Consequently, academicians and practitioners commonly turn to convex L1 proxies, such as Lasso and its variants, as a remedy. Although the L1 models are solved fast, they may lead to biased and/or dense estimators and require substantial cross-validation for calibration.
In this talk, we focus on two estimation problems: i) sparse regression and ii) sparse and smooth signal recovery. The first one is known to be NP-hard; we show that the second one is equivalent to a submodular minimization problem and, hence, is polynomially solvable. For both problems, we derive a sequence of strong convex relaxations. These relaxations are based on the ideal (convex-hull) formulations for rank-one/pairwise quadratic terms with indicator variables. The new relaxations can be formulated as conic quadratic or semidefinite optimization problems in an extended space; they are stronger and more general than the state-of-the-art models with the reverse Huber penalty and the minimax concave penalty functions. Furthermore, the proposed rank-one strengthening can be interpreted as a non-separable, non-convex, unbiased sparsity-inducing regularizer, which dynamically adjusts its penalty according to the shape of the estimation error function without inducing bias for the sparse solutions. Computational experiments with benchmark datasets show that the new conic formulations are solved fast and result in near-optimal estimators for non-convex L0-problems. Moreover, the resulting estimators also outperform alternative convex approaches from a statistical perspective, achieving high prediction accuracy and good interpretability.
The talk is based on the following recent papers with Andres Gomez and Shaoning Han:
https://atamturk.ieor.berkeley.edu/pubs/rank-one.pdf
https://atamturk.ieor.berkeley.edu/pubs/signal-estimation.pdf