This is one of a series of talks given by Professor Chen. The full list of his talks is as follows:
Wednesday, August 28, 2019; 11:00 am - 12:00 pm; Groseclose 402
Thursday, August 29, 2019; 11:00 am - 12:00 pm; Groseclose 402
Tuesday, September 3, 2019; 11:00 am - 12:00 pm; Main - Executive Education Room 228
Wednesday, September 4, 2019; 11:00 am - 12:00 pm; Main - Executive Education Room 228
Thursday, September 5, 2019; 11:00 am - 12:00 pm; Groseclose 402
Check https://triad.gatech.edu/events for more information.
For location information, please check https://isye.gatech.edu/about/maps-directions/isye-building-complex
Title of this talk: Random initialization and implicit regularization in nonconvex statistical estimation
Abstract: Recent years have seen a flurry of activity in designing provably efficient nonconvex procedures for solving statistical estimation/learning problems. Due to the highly nonconvex nature of the empirical loss, state-of-the-art procedures often require suitable initialization and proper regularization (e.g., trimming, regularized cost, projection) in order to guarantee fast convergence. For vanilla procedures such as gradient descent, however, the prior theory is often either far from optimal or entirely lacking in theoretical guarantees.
This talk is concerned with a striking phenomenon arising in two nonconvex problems (phase retrieval and matrix completion): even in the absence of careful initialization, proper saddle escaping, and/or explicit regularization, gradient descent converges to the optimal solution within a logarithmic number of iterations, thus achieving near-optimal statistical and computational guarantees at once. All of this is achieved by exploiting the statistical models in analyzing optimization algorithms, via a leave-one-out approach that enables the decoupling of certain statistical dependencies between the gradient descent iterates and the data. As a byproduct, for noisy matrix completion, we demonstrate that gradient descent achieves near-optimal entrywise error control.
This is joint work with Cong Ma, Kaizheng Wang, Yuejie Chi, and Jianqing Fan.
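As a rough illustration of the kind of vanilla procedure discussed in the abstract, the sketch below runs plain gradient descent on the real-valued phase retrieval least-squares loss starting from a random initialization, with no spectral initialization or explicit regularization. The problem sizes, step size, and iteration count are illustrative assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 100, 1000                         # signal dimension, number of measurements (illustrative)
x_star = rng.standard_normal(n)
x_star /= np.linalg.norm(x_star)         # ground-truth signal, normalized to unit norm

A = rng.standard_normal((m, n))          # Gaussian sensing vectors a_i, stored as rows
y = (A @ x_star) ** 2                    # phaseless measurements y_i = (a_i^T x)^2

def grad(z):
    """Gradient of the loss f(z) = (1/4m) * sum_i ((a_i^T z)^2 - y_i)^2."""
    Az = A @ z
    return A.T @ ((Az ** 2 - y) * Az) / m

z = rng.standard_normal(n) / np.sqrt(n)  # random initialization, no spectral step
eta = 0.1                                # constant step size (illustrative choice)

for _ in range(500):
    z -= eta * grad(z)

# The global sign of x_star is unidentifiable from the measurements,
# so report the error up to a sign flip.
err = min(np.linalg.norm(z - x_star), np.linalg.norm(z + x_star))
print(f"relative estimation error: {err:.2e}")
```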