Date: Friday, Aug. 12, 2022
Time: 2:30 pm (EDT)
Location: Zoom link
Enpeng Yuan
Machine Learning PhD Student
School of Industrial and Systems Engineering
Georgia Institute of Technology
Committee:
ABSTRACT
Dynamic pricing and idle vehicle relocation are important tools for addressing the demand-supply imbalance that frequently arises in ride-hailing markets. Although interrelated, pricing and relocation have largely been studied independently in the literature. Moreover, the current mainstream methodologies, optimization and reinforcement learning (RL), suffer from significant computational limitations. Optimization models must be solved in real time and often trade off model fidelity (and hence solution quality) for computational efficiency. Reinforcement learning requires a large number of samples for offline training and often struggles to achieve full coordination among the fleet. This thesis studies pricing and relocation jointly and addresses these limitations of existing approaches.
Chapter 1 designs an optimization model for computing both pricing and relocation decisions. The model ensures reasonable waiting times for riders by reducing or postponing the demand that exceeds the service capacity. Postponement is achieved by offering discounts to riders who are willing to wait longer in the system, thus leveling off the peak without pricing out riders. Experiments show that the model ensures short waiting times for riders without compromising the benefits (revenue and total rides served) of the platform. Postponement helps serve more riders during mild imbalances, when there are enough vehicles to serve postponed riders after the peak.
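The peak-leveling effect of postponement can be illustrated with a toy calculation. This is a minimal sketch and not the thesis model: it assumes a fixed per-period service capacity, assumes every rider beyond capacity accepts the postponement, and ignores discounts, prices, and relocation entirely. The function name `postpone_excess` and the demand figures are hypothetical.

```python
def postpone_excess(demand, capacity):
    """Serve at most `capacity` riders per period and carry the rest forward.

    Toy illustration only (assumption): all excess riders agree to wait,
    and capacity is the same in every period.
    """
    served = []
    backlog = 0  # riders postponed from earlier periods
    for new_riders in demand:
        waiting = backlog + new_riders
        served_now = min(waiting, capacity)
        served.append(served_now)
        backlog = waiting - served_now
    return served, backlog


# Hypothetical demand with a peak in periods 1 and 2.
served, backlog = postpone_excess([3, 8, 9, 2, 1], capacity=5)
print(served, backlog)
```

In this example the peak is spread over the later, quieter periods and no rider is left unserved, which mirrors the mild-imbalance case described above.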
Chapter 2 presents a machine learning framework to tackle the computational complexity of optimization-based approaches. Specifically, it replaces the optimization with an optimization proxy: a machine learning model that predicts the optimal solutions of the optimization. To tackle sparsity and high dimensionality, the proxy first predicts the optimal solutions at an aggregated level and then disaggregates the predictions via a polynomial-time transportation optimization. As a consequence, the typically NP-hard optimization is reduced to a polynomial-time procedure of prediction and disaggregation. This allows the optimization model to be formulated at higher fidelity, since it can be solved and learned offline. Experiments show that the learning-plus-optimization approach is computationally efficient and outperforms the original optimization due to its higher fidelity.
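The disaggregation step can be sketched as a classical balanced transportation problem. The sketch below is an illustration under assumptions, not the thesis implementation: it supposes the proxy has already predicted, per zone, how many vehicles should leave (`supply`) and arrive (`demand`), and it recovers zone-to-zone flows at minimum travel cost with a textbook successive-shortest-path method. All names, the cost matrix, and the numbers are hypothetical.

```python
def solve_transportation(supply, demand, cost):
    """Return flows x[i][j] with row sums = supply and column sums = demand
    that minimize sum(cost[i][j] * x[i][j]).

    Uses successive shortest augmenting paths (Bellman-Ford) on a residual
    graph. Assumes integer, balanced supply and demand.
    """
    assert sum(supply) == sum(demand), "balanced problem assumed"
    m, n = len(supply), len(demand)
    src, snk = m + n, m + n + 1
    # Each edge is [target, residual capacity, cost, index of reverse edge].
    graph = [[] for _ in range(m + n + 2)]

    def add_edge(u, v, cap, c):
        graph[u].append([v, cap, c, len(graph[v])])
        graph[v].append([u, 0, -c, len(graph[u]) - 1])

    for i in range(m):
        add_edge(src, i, supply[i], 0)
        for j in range(n):
            add_edge(i, m + j, supply[i], cost[i][j])
    for j in range(n):
        add_edge(m + j, snk, demand[j], 0)

    remaining = sum(supply)
    inf = float("inf")
    while remaining > 0:
        # Bellman-Ford shortest path from src in the residual graph.
        dist = [inf] * len(graph)
        prev = [None] * len(graph)
        dist[src] = 0
        for _ in range(len(graph) - 1):
            changed = False
            for u in range(len(graph)):
                if dist[u] == inf:
                    continue
                for k, (v, cap, c, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, k)
                        changed = True
            if not changed:
                break
        # Find the bottleneck capacity along the path, then push flow.
        push, v = remaining, snk
        while v != src:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = snk
        while v != src:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        remaining -= push

    # Flow on edge i -> j equals the residual capacity of its reverse edge.
    flows = [[0] * n for _ in range(m)]
    for i in range(m):
        for v, _, _, rev in graph[i]:
            if m <= v < m + n:
                flows[i][v - m] = graph[v][rev][1]
    return flows


# Hypothetical aggregated predictions for two origin and two destination zones.
supply = [3, 2]             # vehicles predicted to leave each origin zone
demand = [1, 4]             # vehicles predicted to arrive at each destination zone
cost = [[1, 4], [2, 1]]     # assumed travel cost between zones
flows = solve_transportation(supply, demand, cost)
total_cost = sum(cost[i][j] * flows[i][j] for i in range(2) for j in range(2))
print(flows, total_cost)
```

A production system would more likely hand this step to a linear programming or network flow solver; the pure-Python version is used here only to keep the example self-contained.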
Chapter 3 goes one step further than Chapter 2, refining the optimization proxy with reinforcement learning (RL). Specifically, RL starts from the optimization proxy and improves its performance by interacting with the system dynamics and capturing long-term effects that are beyond the capabilities of the optimization approach. In addition, RL becomes far easier to train when it starts from a good initial policy. This hybrid approach is computationally efficient in both the online deployment and offline training stages, and it outperforms both optimization and RL by combining the strengths of the two approaches. It is the first Reinforcement Learning from Expert Demonstration (RLED) framework applied to the pricing and relocation problems and one of the few RL models with a fully centralized policy.