Thesis Title: Healthcare Data Analytics for Social Good
Advisor:
Dr. Turgay Ayer, School of Industrial and Systems Engineering, Georgia Tech
Committee members (ordered alphabetically):
Dr. Atalay Atasu, Technology and Operations Management Area, INSEAD
Dr. Bilal Gokpinar, School of Management, UCL
Dr. Daniel Montanera, Seidman College of Business, GVSU
Dr. Beril Toktay, Scheller College of Business, Georgia Tech
Dr. He Wang, School of Industrial and Systems Engineering, Georgia Tech
Date and Time: Monday, October 18, 2021, 12:00 pm (ET)
Location: Groseclose 304
Meeting URL: https://bluejeans.com/358777355/8544
Meeting ID: 358777355 (BlueJeans)
Abstract:
Healthcare problems, ranging from soaring medical costs to the COVID-19 pandemic, present major challenges to our society. Better solutions to these problems can potentially improve the lives and livelihoods of tens of millions of people. This thesis consists of three essays on using healthcare data analytics to address pressing social challenges. Specifically, the first two essays focus on evaluating and improving risk adjustment designs in healthcare capitation programs, while the third essay develops a machine learning algorithm to detect county-level COVID-19 outbreaks.
In Chapter 2, we analyze a market design problem in Medicare Advantage (MA), the largest risk-adjusted capitation payment program in the U.S. healthcare market. There is evidence that MA unintentionally incentivizes health plans to cherry-pick profitable patient types, a practice referred to as “risk selection”. The existing literature primarily attributes the observed risk selection in the MA market to data limitations and the low explanatory power (e.g., low R^2) of the current risk adjustment design. With the availability of big data and advancements in machine learning (ML) techniques, it is commonly believed that risk selection due to imperfect risk adjustment will gradually disappear from the MA market. To examine this belief, we construct a game-theoretical model. Surprisingly, our study shows that big data and ML alone cannot cure risk selection in the MA capitation program. More specifically, we show that even if the current MA risk adjustment design becomes informationally perfect (e.g., R^2=1) through the availability of big data and advanced ML algorithms, health plans still have incentives to conduct risk selection by strategically subsidizing some subgroups of patients using capitation payments collected from other subgroups, which we call “risk selection induced by cross subsidization”.
In Chapter 3, we empirically examine the theoretical model presented in Chapter 2. Specifically, we are interested in the following two empirical questions. First, can the cross subsidization practice in MA be empirically identified? Second, is there an association between the cross subsidization practice and the risk selection problem in MA? To answer these questions, we gained access to a large commercial insurance database containing claims from more than 2 million MA enrollees. By exploiting an exogenous policy shock on MA capitation payments through a difference-in-differences (DID) design, we identify, for the first time in the literature, a reverse cross subsidization practice in MA. Furthermore, we show that the reverse cross subsidization practice is associated with the risk selection problem in MA, where low-risk patients are more likely than high-risk patients to enroll in MA.
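For readers unfamiliar with the DID design mentioned above, the basic two-group, two-period estimator can be sketched as follows. This is a minimal illustration only, not the specification used in the thesis: the function name `did_estimate`, the row layout, and the toy numbers are all hypothetical, and a real analysis would add covariates, fixed effects, and standard errors.

```python
from statistics import mean

def did_estimate(rows):
    """Two-by-two difference-in-differences on (treated, post, outcome) rows.

    Returns (treated post - treated pre) - (control post - control pre).
    """
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    treated_change = m[(True, True)] - m[(True, False)]
    control_change = m[(False, True)] - m[(False, False)]
    return treated_change - control_change

# Hypothetical toy data: (exposed to the policy shock?, after the shock?, outcome)
toy_rows = [
    (True, False, 10.0), (True, False, 12.0),
    (True, True, 15.0), (True, True, 17.0),
    (False, False, 8.0), (False, False, 10.0),
    (False, True, 9.0), (False, True, 11.0),
]
effect = did_estimate(toy_rows)
```

In the toy data the exposed group changes by 5 and the comparison group by 1, so the estimate attributes the remaining difference of 4 to the shock, under the usual parallel-trends assumption.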
In Chapter 4, we develop a machine learning model to detect county-level COVID-19 outbreaks. Specifically, we address a practical challenge in outbreak detection: balancing the tradeoff between detection speed and accuracy. In particular, while estimation accuracy improves with longer fitting windows, detection speed degrades. This chapter presents a machine learning framework that balances this tradeoff using generalized random forests (GRF) and applies it to detect county-level COVID-19 outbreaks. The algorithm chooses an adaptive fitting window size for each county based on relevant features affecting the disease spread, such as changes in social distancing policies. Experimental results show that our method outperforms every non-adaptive window size choice in 7-day-ahead predictions of COVID-19 outbreak case numbers.
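The window-size tradeoff described above can be illustrated with a simple backtest. The sketch below does not use generalized random forests and is not the thesis's algorithm: it fits a straight line to the last `window` days of a single series and picks the window with the lowest 7-day-ahead backtest error. The function names, the linear-trend forecaster, and the toy series are all assumptions made for illustration.

```python
def linear_forecast(series, window, horizon):
    """Fit a least-squares line to the last `window` points and extrapolate."""
    ys = series[-window:]
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    # Predict `horizon` steps past the last observed point of the window.
    return y_mean + slope * ((n - 1 + horizon) - x_mean)

def pick_window(series, candidates, horizon=7):
    """Return the candidate window with the lowest mean absolute backtest error.

    Every candidate is scored on the same forecast origins so the comparison is fair.
    """
    start = max(candidates)
    best, best_err = None, float("inf")
    for w in candidates:
        errors = []
        for t in range(start, len(series) - horizon + 1):
            pred = linear_forecast(series[:t], w, horizon)
            errors.append(abs(pred - series[t + horizon - 1]))
        err = sum(errors) / len(errors)
        if err < best_err:
            best, best_err = w, err
    return best

# Hypothetical toy series: flat for 20 days, then rising by 5 cases per day.
toy_series = [10.0] * 20 + [10.0 + 5.0 * k for k in range(1, 21)]
chosen = pick_window(toy_series, candidates=[3, 14])
```

On this noise-free toy series the short window adapts to the change in trend sooner and is selected; with noisy counts a longer window would typically give steadier estimates, which is the tension an adaptive, per-county choice is meant to resolve.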