PhD Defense by Zhibo Dai


Event Details

Date/Time:
- Thursday April 16, 2020 - Friday April 17, 2020
  2:00 pm - 2:59 pm
Location: REMOTE: BLUE JEANS
Phone:
URL: BlueJeans Link
Email:
Fee(s):
N/A
Extras:

Contact

No contact information submitted.

Summaries

Summary Sentence: Spectrum Reconstruction Technique and Improved Naive Bayes Models for Text Classification Problems

Full Summary: No summary paragraph submitted.

I am Zhibo Dai, a fifth-year PhD student in the School of Mathematics at Georgia Tech. I will defend my thesis on the afternoon of April 16, between 2 pm and 3 pm ET, in BlueJeans meeting 866242745.

My thesis title is Spectrum Reconstruction Technique and Improved Naive Bayes Models for Text Classification Problems. The abstract and committee information are as follows:

Abstract
This thesis studies two topics. In the first part, we study the spectrum reconstruction technique. Eigenvalues play an important role in many research fields and are the foundation of many practical techniques, such as PCA (Principal Component Analysis). We believe that related algorithms should perform better with more accurate spectrum estimation. Prof. Matzinger proposed an approximation formula, but no proof was given. In our research, we show why the formula works. Moreover, when both the number of features and the dimension of the space go to infinity, we find the order of the error of the approximation formula, which is related to a constant C, the ratio of the dimension of the space to the number of features.
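As a rough illustration of why spectrum estimation is delicate when the dimension grows in proportion to the sample size, the sketch below draws data whose true covariance is the identity (every true eigenvalue equals 1) and approximates the largest eigenvalue of the sample covariance. This is not the approximation formula studied in the thesis. The names `p`, `n`, `sample_covariance`, `largest_eigenvalue`, and `demo` are assumptions made for this example, and the reference value `(1 + sqrt(p/n))^2` is the classical Marchenko-Pastur upper edge from random matrix theory.

```python
import math
import random


def sample_covariance(samples):
    """Return the p x p matrix (1/n) * X^T X for zero-mean samples (rows of X)."""
    n = len(samples)
    p = len(samples[0])
    cov = [[0.0] * p for _ in range(p)]
    for x in samples:
        for i in range(p):
            xi = x[i]
            row = cov[i]
            for j in range(p):
                row[j] += xi * x[j]
    return [[v / n for v in row] for row in cov]


def largest_eigenvalue(matrix, iterations=300):
    """Approximate the top eigenvalue of a symmetric PSD matrix by power iteration."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p  # unit-norm starting vector
    value = 0.0
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(p)) for row in matrix]
        norm = math.sqrt(sum(t * t for t in w))
        v = [t / norm for t in w]
        value = norm  # ||M v|| for unit v; approaches the top eigenvalue
    return value


def demo(p=40, n=80, seed=0):
    """Hypothetical setup: n samples in dimension p, true covariance = identity."""
    rng = random.Random(seed)
    samples = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    top = largest_eigenvalue(sample_covariance(samples))
    edge = (1.0 + math.sqrt(p / n)) ** 2  # Marchenko-Pastur upper edge
    return top, edge


if __name__ == "__main__":
    top, edge = demo()
    print(f"largest sample eigenvalue: {top:.3f}")
    print(f"Marchenko-Pastur edge:     {edge:.3f}")
    print("true eigenvalues are all 1.0")
```

With `p / n` held fixed, the largest sample eigenvalue stays well above the true value of 1 however large `n` becomes, which is the kind of bias a spectrum reconstruction method aims to correct.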

In the second part, we focus on applications of Naive Bayes models to text classification problems. In particular, we consider two situations: 1) there is insufficient data for model training; 2) the partial label problem. We choose Naive Bayes as our base model and modify it to achieve better performance in these two situations. To improve model performance and to use as much information as possible, we introduce a correlation factor, which partially relaxes the conditional independence assumption of Naive Bayes. The new estimates are biased compared to the traditional Naive Bayes estimates, but have much smaller variance, which gives a better prediction result.
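For readers unfamiliar with the base model, here is a minimal sketch of a standard multinomial Naive Bayes text classifier with additive (Laplace) smoothing. It shows only the traditional baseline and the conditional independence assumption it relies on; it does not implement the correlation factor introduced in the thesis. The function names `train` and `predict`, the smoothing parameter `alpha`, and the toy documents are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train(documents, alpha=1.0):
    """Fit word and class counts from a list of (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in documents:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_counts": class_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,
    }


def predict(model, tokens):
    """Return the label with the highest log posterior for the given tokens."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, doc_count in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        # Log prior of the class.
        score = math.log(doc_count / total_docs)
        # Conditional independence: per-word log likelihoods simply add up.
        for tok in tokens:
            if tok not in model["vocab"]:
                continue  # ignore words never seen in training
            score += math.log(
                (counts[tok] + alpha) / (total_words + alpha * vocab_size)
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Toy training set, invented for this example.
    docs = [
        ("cheap pills buy now".split(), "spam"),
        ("buy cheap watches".split(), "spam"),
        ("meeting schedule for thesis defense".split(), "ham"),
        ("committee meeting notes".split(), "ham"),
    ]
    model = train(docs)
    print(predict(model, "buy cheap pills".split()))
    print(predict(model, "thesis committee meeting".split()))
```

With very little training data, the per-word estimates in `predict` are noisy, which is the setting in which a lower-variance estimate can improve predictions even at the cost of some bias.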

Committee

Prof. Heinrich Matzinger – School of Mathematics (advisor)
Prof. Federico Bonetto – School of Mathematics
Prof. Wenjing Liao – School of Mathematics
Prof. Tuo Zhao – School of Industrial and Systems Engineering
Prof. Ionel Popescu – School of Mathematics

Additional Information

In Campus Calendar

Groups

Graduate Studies

Invited Audience

Faculty/Staff, Public, Graduate students, Undergraduate students

Categories

Other/Miscellaneous

Keywords

PhD Defense

Status

Created By: Tatianna Richardson
Workflow Status: Published
Created On: Apr 3, 2020 - 9:36am
Last Updated: Apr 3, 2020 - 9:36am
