Title: Cooperation in Multi-Agent Reinforcement Learning
Date: Thursday, December 3rd, 2020
Time: 1:00 pm - 2:30 pm Eastern time
Location: https://bluejeans.com/684552748
Jiachen Yang
Machine Learning PhD Student
Computational Science and Engineering
Georgia Institute of Technology
As progress in deep reinforcement learning (RL) gives rise to increasingly general and powerful artificial intelligence, there is a possible future in which multiple RL agents must learn and interact in a shared environment. When a single principal has oversight of the multi-agent system, how should agents learn to cooperate via centralized training to achieve individual and global objectives? Alternatively, when agents belong to many self-interested principals with imperfectly aligned objectives, how can cooperation emerge from fully decentralized learning?
In the first part of the thesis, we propose new algorithms for fully cooperative multi-agent reinforcement learning (MARL) in the paradigm of centralized training with decentralized execution. First, we propose a method based on multi-agent curriculum learning and multi-agent credit assignment for the setting where global optimality is defined as the attainment of all individual goals. Second, we propose a hierarchical MARL algorithm that learns interpretable and useful skills for a multi-agent team to optimize a single shared reward.
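As a rough illustration of the centralized-training-with-decentralized-execution paradigm (not of the specific algorithms proposed in the thesis), the following minimal PyTorch sketch pairs decentralized actors, each conditioning only on its own local observation, with a centralized critic that sees the joint observation during training. All network sizes, dimensions, and the fake transition are hypothetical.

    import torch
    import torch.nn as nn

    class Actor(nn.Module):
        """Decentralized policy: acts on its own local observation only."""
        def __init__(self, obs_dim, n_actions):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                     nn.Linear(64, n_actions))

        def forward(self, obs):
            return torch.distributions.Categorical(logits=self.net(obs))

    class CentralCritic(nn.Module):
        """Centralized value: sees the joint observation, training only."""
        def __init__(self, joint_obs_dim):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(joint_obs_dim, 64), nn.ReLU(),
                                     nn.Linear(64, 1))

        def forward(self, joint_obs):
            return self.net(joint_obs).squeeze(-1)

    # Hypothetical sizes, for illustration only.
    n_agents, obs_dim, n_actions = 2, 4, 3
    actors = [Actor(obs_dim, n_actions) for _ in range(n_agents)]
    critic = CentralCritic(n_agents * obs_dim)
    params = [p for a in actors for p in a.parameters()] + list(critic.parameters())
    opt = torch.optim.Adam(params, lr=1e-3)

    # One update on a fake transition with a single shared team reward.
    obs = torch.randn(n_agents, obs_dim)   # stand-in for env observations
    dists = [actor(o) for actor, o in zip(actors, obs)]
    actions = [d.sample() for d in dists]
    reward = torch.tensor(1.0)             # stand-in for the team reward

    value = critic(obs.flatten())          # centralized: joint observation
    advantage = reward - value
    policy_loss = -sum(d.log_prob(a) for d, a in zip(dists, actions)) * advantage.detach()
    critic_loss = advantage.pow(2)

    opt.zero_grad()
    (policy_loss + critic_loss).backward()
    opt.step()

At execution time only the actors are needed, so each agent can act from its local observation alone; the critic, which requires the joint observation, is discarded after training.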
In the second part, we propose learning algorithms to attain cooperation within a population of self-interested RL agents. We show that an agent equipped with the ability to incentivize other RL agents, and which explicitly accounts for those agents' learning processes, can overcome a key limitation of fully decentralized training and generate emergent cooperation. Building on successful techniques from the completed work, we propose in the remaining work to address two complex applications of MARL: 1) incentive design for in silico experimental economics, where one wishes to optimize a global objective solely by intervening on the rewards of a population of independent RL agents; and 2) adaptive mesh refinement in the finite element method for solving large-scale physical simulations of complex dynamics.
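To make the incentivization idea concrete, here is a minimal single-state sketch in the spirit of opponent-shaping and incentive-learning methods, not the thesis algorithm itself: an incentivizer pays action-dependent rewards to a recipient and differentiates its own objective through one step of the recipient's policy-gradient update. The payoff vectors, learning rates, and step count are all hypothetical, chosen only to create a conflict of interest.

    import torch

    # Two-action "bandit" stand-in for the environment. Hypothetical payoffs:
    # the recipient's own payoff favors action 1, while the incentivizer
    # benefits when the recipient chooses action 0.
    r_recipient = torch.tensor([0.0, 1.0])
    r_incentivizer = torch.tensor([2.0, 0.0])

    lr_theta = 1.0                                 # recipient's learning rate
    theta = torch.zeros(2, requires_grad=True)     # recipient's policy logits
    eta = torch.zeros(2, requires_grad=True)       # incentive paid per action
    opt = torch.optim.Adam([eta], lr=0.1)

    for step in range(300):
        probs = torch.softmax(theta, dim=0)
        # The recipient takes one policy-gradient step on its payoff PLUS the
        # incentive; create_graph=True keeps this step differentiable in eta.
        recipient_obj = (probs * (r_recipient + eta)).sum()
        grad_theta, = torch.autograd.grad(recipient_obj, theta, create_graph=True)
        theta_new = theta + lr_theta * grad_theta

        # The incentivizer evaluates its payoff under the recipient's UPDATED
        # policy, minus the expected cost of the incentives it pays, and
        # backpropagates through the recipient's learning step.
        probs_new = torch.softmax(theta_new, dim=0)
        incentivizer_obj = (probs_new * (r_incentivizer - eta)).sum()
        opt.zero_grad()
        (-incentivizer_obj).backward()
        opt.step()

        theta = theta_new.detach().requires_grad_(True)  # recipient commits its update

    print(torch.softmax(theta, dim=0))  # inspect the recipient's learned policy

Because create_graph=True retains the graph of the recipient's update, the incentivizer's gradient accounts for how its incentives change what the recipient learns, which is the key departure from naive fully decentralized training.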