Title: Emulation and Imitation via Perceptual Goal Specifications
Ashley D. Edwards
Ph.D. Student
School of Interactive Computing
College of Computing
Georgia Institute of Technology
Date: Monday, January 14th, 2019
Time: 12:30 PM to 2:30 PM (EST)
Location: TBA, College of Computing Building
Committee:
---------------
Dr. Charles Isbell (Advisor), School of Interactive Computing, Georgia Institute of Technology
Dr. Tucker Balch, School of Interactive Computing, Georgia Institute of Technology
Dr. Sonia Chernova, School of Interactive Computing, Georgia Institute of Technology
Dr. Mark Riedl, School of Interactive Computing, Georgia Institute of Technology
Dr. Pieter Abbeel, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley
Summary:
---------------
Much of the power of reinforcement learning comes from the fact that a single signal, known as the reward, can indicate desired behavior. Defining these rewards, however, is often difficult. This dissertation introduces an alternative to the typical reward design mechanism. In particular, we introduce four methods that allow one to focus on specifying perceptual goals rather than scalar rewards. By removing domain-specific aspects of the problem, we demonstrate that goals can be expressed while remaining agnostic to the reward function, action space, or state space of the agent's environment.
First, we will introduce perceptual reward functions and describe how a hand-defined similarity metric enables learning from goals expressed through observations that differ from the agent's own. We show how we can use this method to train a simulated robot to learn from videos of humans.
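As an illustration of the idea (our own sketch, not the dissertation's implementation), a perceptual reward can be computed by comparing the agent's current observation against a goal image with a hand-defined similarity metric. The Gaussian over pixel distance, the array shapes, and the sigma parameter below are all assumptions:

    import numpy as np

    def perceptual_reward(observation, goal_image, sigma=1.0):
        # Score how closely the agent's current observation matches a
        # goal image; both are equally sized pixel arrays. The metric
        # (a Gaussian over RMS pixel distance) is a hand-defined
        # stand-in for whatever similarity measure fits the domain.
        diff = observation.astype(np.float64) - goal_image.astype(np.float64)
        distance = np.sqrt(np.mean(diff ** 2))            # RMS pixel distance
        return np.exp(-distance ** 2 / (2 * sigma ** 2))  # in (0, 1], 1 = exact match

    # Usage: score a rollout frame against a goal frame taken, e.g.,
    # from a video of a human demonstration (arrays here are dummies).
    goal = np.zeros((64, 64, 3))
    frame = np.full((64, 64, 3), 0.1)
    print(perceptual_reward(frame, goal))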
Next, we will introduce cross-domain perceptual reward functions and describe how a reward function can be learned for goals specified in a domain different from the agent's. We show how we can use this method to train an agent in a maze to reach goals specified through speech and hand gestures.
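One plausible construction for such a cross-domain reward (a sketch under our own assumptions, not necessarily the model the dissertation uses) trains two encoders that map goal specifications, e.g. gesture or speech features, and agent states into a shared embedding space, then scores a state by its learned similarity to the specification. All layer sizes and feature dimensions below are made up:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Encoders for the two domains, mapping into a shared 16-d space.
    goal_enc = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
    state_enc = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 16))
    opt = torch.optim.Adam(
        list(goal_enc.parameters()) + list(state_enc.parameters()), lr=1e-3)

    def train_step(goal_specs, matching_states):
        # One gradient step pulling paired (specification, state)
        # embeddings together; pairs are assumed to come from examples
        # where the state is known to satisfy the specification.
        g = F.normalize(goal_enc(goal_specs), dim=-1)
        s = F.normalize(state_enc(matching_states), dim=-1)
        loss = (1 - (g * s).sum(dim=-1)).mean()   # cosine-distance loss
        opt.zero_grad(); loss.backward(); opt.step()
        return loss.item()

    def cross_domain_reward(goal_spec, state):
        # Reward = learned similarity between specification and state.
        with torch.no_grad():
            g = F.normalize(goal_enc(goal_spec), dim=-1)
            s = F.normalize(state_enc(state), dim=-1)
            return (g * s).sum(dim=-1)

    # Usage on a dummy batch of paired examples:
    specs, states = torch.randn(32, 32), torch.randn(32, 8)
    print(train_step(specs, states))
    print(cross_domain_reward(specs[:1], states[:1]))

Normalizing the embeddings keeps the similarity bounded, so it can be used directly as a reward signal.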
Next, we will introduce perceptual value functions and describe how we can learn a value function from sequences of expert observations without access to ground-truth actions. We show how we can use this method to infer values from observation for a maze task and a pouring task, and to train an agent to solve unseen levels of a platform game.
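One common way to obtain such values, sketched below under the assumption of a sparse reward earned at the end of each expert demonstration (our simplification, not necessarily the dissertation's exact objective), is to label the observation at step t of a length-T demonstration with the discounted return gamma^(T - t) and fit a regressor to those targets:

    import numpy as np

    GAMMA = 0.99

    def value_targets(demo_length, gamma=GAMMA):
        # With a sparse reward of 1 at the final step of an expert
        # demonstration, the value at step t is gamma^(T - t): a
        # discounted measure of how close the expert is to the goal.
        T = demo_length - 1
        return np.array([gamma ** (T - t) for t in range(demo_length)])

    # Fit a value function V(o) by regression on (observation, target)
    # pairs pooled over demonstrations -- here a linear model via least
    # squares on made-up observation features.
    rng = np.random.default_rng(0)
    demos = [rng.normal(size=(20, 5)) for _ in range(10)]   # 10 demos, 5-dim obs
    X = np.vstack(demos)
    y = np.concatenate([value_targets(len(d)) for d in demos])
    w, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)

    def value(obs):
        return np.append(obs, 1.0) @ w

    print(value(demos[0][-1]))   # estimated value of a final expert observation

For image observations, a neural regressor would typically replace the linear model.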
Finally, we will introduce latent policy networks and describe how we can learn a policy from sequences of expert observations without access to ground-truth actions. We show how we can use this method to infer a policy from observation and train an agent to solve classic control tasks and a platform game.
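A rough sketch of the idea, with every detail (latent-action count, network sizes, loss weighting) assumed by us rather than taken from the dissertation: a latent forward model predicts the next observation under each of a handful of latent actions, and a latent policy is trained so that its expected prediction matches the observed transition. Penalizing only the best-matching latent action lets distinct latent actions specialize to distinct transitions. In the full approach the latent actions would still have to be aligned with the environment's real actions before the agent can act, which this sketch omits:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    N_LATENT = 4   # number of latent (unobserved) actions, a guess
    OBS_DIM = 8    # illustrative observation dimension

    # Latent forward model: predicts the next observation for every
    # latent action at once, shape (batch, N_LATENT, OBS_DIM).
    dynamics = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                             nn.Linear(64, N_LATENT * OBS_DIM))
    # Latent policy: a distribution over latent actions given the state.
    policy = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                           nn.Linear(64, N_LATENT))
    opt = torch.optim.Adam(
        list(dynamics.parameters()) + list(policy.parameters()), lr=1e-3)

    def train_step(obs, next_obs):
        # One step on a batch of consecutive expert observations;
        # no actions are used anywhere.
        preds = dynamics(obs).view(-1, N_LATENT, OBS_DIM)
        err = ((preds - next_obs.unsqueeze(1)) ** 2).mean(dim=-1)
        # Penalize only the best-matching latent action, so distinct
        # latent actions can specialize to distinct transitions.
        model_loss = err.min(dim=1).values.mean()
        # Train the policy so its expected prediction over latent
        # actions matches the actual next observation.
        probs = F.softmax(policy(obs), dim=-1)
        expected = (probs.unsqueeze(-1) * preds.detach()).sum(dim=1)
        policy_loss = F.mse_loss(expected, next_obs)
        loss = model_loss + policy_loss
        opt.zero_grad(); loss.backward(); opt.step()
        return loss.item()

    # Usage on a dummy batch of consecutive expert observations:
    obs, next_obs = torch.randn(32, OBS_DIM), torch.randn(32, OBS_DIM)
    print(train_step(obs, next_obs))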