TITLE: Practicable robust Markov decision processes
ABSTRACT:
Markov decision processes (MDPs) are a standard modeling tool for sequential decision making in a dynamic and stochastic environment. When the model parameters are subject to uncertainty, the "optimal strategy" obtained from an MDP can perform significantly worse than the model predicts. To address this, robust MDPs, which are based on worst-case analysis, have been developed. However, several restrictions of the robust MDP model prevent it from practical success, and I will address them in this talk. The first restriction of the standard robust MDP is that its modeling of uncertainty is not flexible and can lead to conservative solutions. In particular, it requires that the uncertainty set be "rectangular", i.e., a Cartesian product of the uncertainty sets of the individual states. To lift this assumption, we propose an uncertainty model that we call "k-rectangular", which generalizes the concept of rectangularity, and we show that the resulting problem can be solved efficiently via state augmentation. The second restriction is that the model does not take learning into account, i.e., how to adapt the model efficiently to reduce the uncertainty. To address this, we devise an algorithm inspired by reinforcement learning that, without knowing the true uncertainty model, is able to adapt its level of protection to uncertainty, and in the long run performs as well as the minimax policy would if the true uncertainty model were known. Indeed, the algorithm achieves regret bounds similar to those of standard MDPs, where no parameter is adversarial, which shows that with virtually no extra cost we can adapt robust learning to handle uncertainty in MDPs.
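As a rough illustration of the worst-case analysis under a rectangular uncertainty set mentioned in the abstract, the following Python sketch runs robust value iteration on a toy problem. It is not the speaker's method: the finite scenario sets, the function name robust_value_iteration, and all numbers are assumptions made for illustration. It covers only the standard rectangular case, not the k-rectangular model or the learning algorithm.

```python
import numpy as np


def robust_value_iteration(rewards, scenarios, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Robust value iteration with a finite, rectangular uncertainty set.

    rewards:   array of shape (S, A), reward for action a in state s.
    scenarios: array of shape (S, A, K, S); scenarios[s, a, k] is the k-th
               candidate distribution over next states for the pair (s, a).
    """
    n_states, _ = rewards.shape
    values = np.zeros(n_states)
    for _ in range(max_iter):
        # Expected next-state value under every candidate: shape (S, A, K).
        expected = scenarios @ values
        # Rectangularity: the adversary picks the worst candidate
        # independently for each (state, action) pair.
        worst_case = expected.min(axis=2)
        new_values = (rewards + gamma * worst_case).max(axis=1)
        converged = np.max(np.abs(new_values - values)) < tol
        values = new_values
        if converged:
            break
    q_values = rewards + gamma * (scenarios @ values).min(axis=2)
    return values, q_values.argmax(axis=1)


# Hypothetical toy problem: 2 states, 2 actions, 2 candidate models.
rewards = np.array([[1.0, 0.0],
                    [0.0, 2.0]])
nominal = np.array([[[0.9, 0.1], [0.2, 0.8]],
                    [[0.5, 0.5], [0.1, 0.9]]])
perturbed = np.array([[[0.6, 0.4], [0.5, 0.5]],
                      [[0.8, 0.2], [0.4, 0.6]]])
scenarios = np.stack([nominal, perturbed], axis=2)  # shape (2, 2, 2, 2)

robust_values, robust_policy = robust_value_iteration(rewards, scenarios)
# With a single candidate per (s, a), the same routine is ordinary value iteration.
nominal_values, nominal_policy = robust_value_iteration(rewards, scenarios[:, :, :1, :])
```

Because the adversary can only lower the expected continuation value, robust_values never exceeds nominal_values in any state; the gap is one way to see how conservative the rectangular model is on a given problem.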
Bio: Huan Xu received the B.Eng. degree in automation from Shanghai Jiaotong University, Shanghai, China in 1997, the M.Eng. degree in electrical engineering from the National University of Singapore in 2003, and the Ph.D. degree in electrical engineering from McGill University in 2009. From 2009 to 2010, he was a postdoctoral associate at The University of Texas at Austin. He was an assistant professor in the Department of Mechanical Engineering at the National University of Singapore from 2011 to 2015, and has been an assistant professor in the Department of Industrial and Systems Engineering since 2016. His research interests include machine learning, robust optimization, planning and control, and statistics. He has published in premier venues of operations research and machine learning, including Operations Research, Mathematical Programming, Mathematics of Operations Research, Journal of Machine Learning Research, IEEE Transactions on Information Theory, ICML, and NIPS. He is an associate editor of IEEE Transactions on Pattern Analysis and Machine Intelligence and is on the editorial board of Computational Management Science.