PhD Defense by Erik Wijmans


Event Details
  • Date/Time:
    • Wednesday June 15, 2022
      12:00 pm - 2:00 pm
  • Location: Remote
  • Phone:
  • URL: Zoom
  • Email:
  • Fee(s):
    N/A
  • Extras:
Contact
No contact information submitted.
Summaries

Summary Sentence: Emergence of Intelligent Navigation Behavior in Embodied Agents from Massive-Scale Simulation

Full Summary: No summary paragraph submitted.

Title: Emergence of Intelligent Navigation Behavior in Embodied Agents from Massive-Scale Simulation

Erik Wijmans
Ph.D. Candidate in Computer Science
School of Interactive Computing
Georgia Institute of Technology
https://wijmans.xyz

Date and Time: 6/15 at 12:00pm ET
Location (virtual): https://gatech.zoom.us/j/98696852400?pwd=bmFxUWdIRUt6TEtId1hld3RrOGYxZz09
Committee:
Dhruv Batra (Advisor), School of Interactive Computing, Georgia Institute of Technology
Irfan Essa (Advisor), School of Interactive Computing, Georgia Institute of Technology
Sonia Chernova, School of Interactive Computing, Georgia Institute of Technology
Vladlen Koltun, Distinguished Scientist, Apple
Vincent Vanhoucke, Distinguished Scientist, Google
Gregory Wayne, Research Scientist, DeepMind

Summary
The goal of Artificial Intelligence (AI) is to build ‘thinking machines’ that ‘use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves.’ In this dissertation, we argue that the intelligence required for this goal emerges from massive-scale simulation. Specifically, we show that intelligent navigation behavior emerges from massive-scale simulation and deep reinforcement learning.

We introduce Decentralized Distributed PPO (DD-PPO), a method that scales reinforcement learning to multiple GPUs and machines. We use DD-PPO to train agents for PointGoal navigation for the equivalent of 80 years of human experience. This massive-scale training results in near-perfect autonomous navigation in unseen environments without access to a map. We then examine the inner workings of a special case of PointGoalNav agents. We find that (1) their memory enables shortcuts, i.e., efficient travel through previously unexplored parts of the environment; and (2) maps emerge in their memory, i.e., a detailed occupancy grid of the environment can be decoded from it. We then introduce Variable Experience Rollout (VER), a method that efficiently scales reinforcement learning on a single GPU or machine. We use VER to train chained skills for mobile manipulation. We find a surprising emergence of navigation in skills that do not ostensibly require any navigation. Specifically, the pick skill involves a robot picking an object from a table. During training, the robot was always spawned close to the table and never needed to navigate. However, we find that if navigation actions are part of the action space, the robot learns to navigate and then pick an object in new environments with 50% success, demonstrating surprisingly high out-of-distribution generalization.
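The core idea behind DD-PPO is synchronous, decentralized scaling: each GPU worker collects its own rollouts in simulation, computes the PPO update on that local experience, and synchronizes with the other workers by averaging gradients via an all-reduce, so there is no central parameter server. The sketch below is a minimal illustration of that update pattern in PyTorch; the tiny actor-critic network, the random "rollout" tensors, and the placeholder loss are hypothetical stand-ins, not the actual Habitat training code, and the published method's preemption of straggler workers during rollout collection is omitted here.

# Minimal sketch of the decentralized, synchronous update pattern behind DD-PPO.
# Placeholder network, rollouts, and loss; real training uses Habitat-Sim and a full PPO loss.
import torch
import torch.distributed as dist
import torch.nn as nn

class TinyActorCritic(nn.Module):
    # Stand-in for the agent network (the real agent uses a visual encoder + recurrent policy).
    def __init__(self, obs_dim=64, num_actions=4):
        super().__init__()
        self.body = nn.Linear(obs_dim, 32)
        self.actor = nn.Linear(32, num_actions)
        self.critic = nn.Linear(32, 1)

    def forward(self, obs):
        h = torch.relu(self.body(obs))
        return self.actor(h), self.critic(h)

def main():
    # One process per GPU/worker; each owns its own copy of the simulator.
    dist.init_process_group(backend="gloo")  # "nccl" for multi-GPU training
    world_size = dist.get_world_size()

    policy = TinyActorCritic()
    optimizer = torch.optim.Adam(policy.parameters(), lr=2.5e-4)

    for update in range(10):
        # 1) Collect experience locally -- no parameter server, no shared replay buffer.
        obs = torch.randn(128, 64)      # stand-in for a rollout of observations
        returns = torch.randn(128, 1)   # stand-in for computed returns

        # 2) Compute a (placeholder) loss on this worker's rollout only.
        logits, values = policy(obs)
        loss = (values - returns).pow(2).mean() - 0.01 * torch.log_softmax(logits, -1).mean()

        optimizer.zero_grad()
        loss.backward()

        # 3) Synchronize: average gradients across all workers with an all-reduce.
        for p in policy.parameters():
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world_size

        # 4) Every worker applies the same averaged update, so the policies stay identical.
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()

Launched with one process per GPU (e.g., via torchrun), each worker scales experience collection and learning together, which is what allows training to reach the equivalent of decades of navigation experience.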

Additional Information

In Campus Calendar
No
Groups

Graduate Studies

Invited Audience
Faculty/Staff, Public, Undergraduate students
Categories
Other/Miscellaneous
Keywords
Phd Defense
Status
  • Created By: Tatianna Richardson
  • Workflow Status: Published
  • Created On: Jun 1, 2022 - 9:11am
  • Last Updated: Jun 1, 2022 - 9:11am