You are cordially invited to my thesis defense, scheduled for the 21st of November.
Title: Leveraging 3D information for controllable and interpretable image synthesis
Date: Mon, Nov 21st 2022
Time: 10:00 - 11:30 AM (EST)
Meeting Link:
https://gatech.zoom.us/j/99260310440
Amit Raj
Machine Learning PhD Student
School of Electrical and Computer Engineering
Georgia Institute of Technology
Committee
Abstract:
Neural image synthesis has seen enormous advances in recent years, led by innovations in GANs, which generate high-resolution, photo-realistic images. However, a major limitation of these methods is that they tend to capture the texture statistics of an image with no explicit understanding of geometry. Additionally, GAN-only pipelines are notoriously hard to train. In contrast, recent trends in neural and volumetric rendering have demonstrated compelling results by incorporating 3D information into the synthesis pipeline using classical rendering techniques.
We leverage ideas from both classical graphics rendering and neural image synthesis to design 3D-guided image generation pipelines that are photo-realistic, controllable, and easy to train. In this thesis, we discuss three sets of models that incorporate geometric information for controllable image synthesis.
1. Static geometries: We leverage class-specific shape priors to present generative models that allow for 3D-consistent novel view synthesis. To that end, we propose the first framework that allows implicit representations to generalize to novel identities in the context of facial avatars.
2. Articulated geometries: In the second section, we extend controllable synthesis to articulated geometries. We present two frameworks (with explicit and implicit geometric representations) for the synthesis of pose- and viewpoint-controllable full-body digital avatars.
3. Scenes: In the final section, we present a framework for generating driving scenes with both static and dynamic elements. In particular, the proposed model allows fine-grained control over local elements of the scene without needing to resynthesize the entire scene, which we posit should reduce both the memory footprint of the model and inference times.