Title: The Use of 3D Matrix Multiplication in Chebyshev-Filtered Subspace Iteration
Date: Wednesday, May 11th
Time: 10:00 AM - 12:00 PM Eastern
Location (Virtual): https://bluejeans.com/124346323/7951
Location (Physical): Coda C1315 "Grant Park" (Coda access is required, contact Luke if you need guest access)
Lucas "Luke" Erlandson
School of Computational Science and Engineering
College of Computing
Georgia Institute of Technology
Committee:
Dr. Edmond Chow (Advisor, School of Computational Science and Engineering, Georgia Institute of Technology)
Dr. Felix Herrmann (School of Earth and Atmospheric Sciences, Georgia Institute of Technology)
Dr. Tobin Isaac (School of Computational Science and Engineering, Georgia Institute of Technology)
Dr. Ruipeng Li (Center for Applied Scientific Computing, Lawrence Livermore National Lab)
Dr. Yuanzhe Xi (Department of Mathematics, Emory University)
Abstract:
Electronic structure calculations can be used to accurately compute the motion and properties of electrons. These calculations require (approximate) solutions to the Schrödinger equation H Psi = E Psi, where H is the Hamiltonian operator, Psi is a wave function, and E is the energy. Many domains require electronic structure calculations, including chemistry, materials science, and physics. However, such calculations are prohibitively expensive unless approximations are made. One commonly used method is Kohn-Sham Density Functional Theory (DFT), chosen for its high accuracy-to-cost ratio. However, at the core of Kohn-Sham Density Functional Theory is the solution of an eigenvalue problem, which becomes intractable for large problems. Chebyshev-filtered subspace iteration (ChebFSI) is a method that reduces the cost associated with the eigensolve by instead refining a subspace via Chebyshev polynomials.
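As a rough illustration of the idea behind ChebFSI, the following minimal sketch filters a block of vectors with a Chebyshev polynomial and then applies a Rayleigh-Ritz step on the filtered subspace. It is a dense NumPy toy, not the libPCE or SPARC implementation; the function names (chebyshev_filter, chebfsi), the filter bounds a and b, and the default degree and iteration count are assumptions made for this example.

```python
import numpy as np


def chebyshev_filter(H, X, degree, a, b):
    """Apply T_degree((H - c I) / e) to X via the three-term recurrence.

    Eigencomponents with eigenvalues in [a, b] are mapped into [-1, 1] and
    stay bounded, while components below a are amplified.
    Assumes degree >= 1 (hypothetical helper for illustration).
    """
    e = (b - a) / 2.0  # half-width of the damped interval
    c = (b + a) / 2.0  # center of the damped interval
    Y = (H @ X - c * X) / e  # degree-1 term
    for _ in range(2, degree + 1):
        Y_next = 2.0 * (H @ Y - c * Y) / e - X
        X, Y = Y, Y_next
    return Y


def chebfsi(H, k, a, b, degree=10, iterations=8, seed=0):
    """Approximate the k lowest eigenpairs of a symmetric matrix H.

    a is an assumed cutoff just above the wanted eigenvalues and b an
    assumed upper bound on the spectrum of H.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((H.shape[0], k))
    for _ in range(iterations):
        Y = chebyshev_filter(H, X, degree, a, b)  # refine the subspace
        Q, _ = np.linalg.qr(Y)                    # orthonormalize the basis
        Hs = Q.T @ H @ Q                          # small k-by-k projected problem
        evals, V = np.linalg.eigh(Hs)             # Rayleigh-Ritz step
        X = Q @ V                                 # rotate basis to Ritz vectors
    return evals, X
```

In this sketch the only eigensolve is on the small k-by-k projected matrix; the dominant cost is the repeated product of H with a tall, skinny block of vectors, plus the projection products.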
In this dissertation, an investigation into the computationally expensive kernels is conducted. These kernels include the matrix-matrix products, the Hamiltonian, and the eigensolve, and the investigation culminates in the high-performance parallel computation engine (libPCE), with particular focus on a distributed GPU implementation. Many of the matrices encountered within ChebFSI are highly non-square, and as such traditional distributed matrix-matrix products tend to perform inefficiently. Thus, we investigate the use of state-of-the-art matrix-matrix products, which aim to achieve higher efficiency in such cases. Furthermore, we investigate what is required to provide a high-performance distributed GPU code for the Hamiltonian and eigensolve. These routines are packaged so that they can replace the computation routines currently used in DFT codes, including the SPARC package (Simulation Package for Ab-initio Real-space Calculations).
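To make the non-square case concrete, the sketch below emulates, on a single machine, how a 3D-partitioned product splits C = A B over the row, column, and inner dimensions and then sums partial products along the inner dimension. It only illustrates the data decomposition; the function name blocked_3d_matmul and the grid argument are hypothetical, no communication is modeled, and this is not the algorithm as implemented in libPCE.

```python
import numpy as np


def blocked_3d_matmul(A, B, grid):
    """Emulate a 3D-partitioned product C = A @ B on a logical pi x pj x pk grid.

    Each (row block, column block, inner block) triple stands for the work of
    one logical process; the sum over inner blocks stands in for a reduction.
    """
    pi, pj, pk = grid
    m, n = A.shape[0], B.shape[1]
    rows = np.array_split(np.arange(m), pi)
    cols = np.array_split(np.arange(n), pj)
    inner = np.array_split(np.arange(A.shape[1]), pk)
    C = np.zeros((m, n), dtype=np.result_type(A, B))
    for r in rows:
        for c in cols:
            # Partial products over slices of the inner (contracted) dimension.
            partials = [A[np.ix_(r, s)] @ B[np.ix_(s, c)] for s in inner]
            # Summing the partials plays the role of a reduction along k.
            C[np.ix_(r, c)] = sum(partials)
    return C
```

For a product such as X^T X with a tall, skinny X, the output is tiny and the inner dimension is long, so a grid like (1, 1, 8) places all of the parallelism along the inner dimension, which a purely 2D partition of the output cannot do.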
The contributions of this dissertation are as follows: first, we investigate and justify which of the available eigensolvers are useful in different cases, depending on the problems faced and the hardware available. Second, we investigate the use of state-of-the-art matrix-matrix products compared to traditional ones on both CPU and GPU for the problems faced. Third, we combine these investigations with a high-performance Hamiltonian implementation to provide a distributed GPU package. Finally, we demonstrate the efficacy of these developments through numerical experiments.