KDD 2022

Real-Time Rideshare Driver Supply Values Using Online Reinforcement Learning

Benjamin Han, Hyungjun Lee, Sébastien Martin


Abstract

In this paper, we present Online Supply Values (OSV), a system that estimates the expected return of available rideshare drivers and uses those estimates to match drivers to ride requests at Lyft. Because a driver's future state can be accurately predicted from a request's destination, the expected action value of assigning a ride request to an available driver can be estimated by modeling dispatch as a Markov Decision Process and applying the Bellman equation. These estimates are updated using temporal-difference learning and are shown to adapt to changing marketplace conditions in real time. While reinforcement learning has been studied for rideshare dispatch, fully online approaches without offline priors or other guardrails had never been evaluated in the real world. This work presents the algorithmic changes needed to bridge this gap. OSV is now deployed globally as a core component of Lyft's dispatch matching system. Our A/B user experiments in major US cities measure a +(0.96±0.53)% increase in the request fulfillment rate and a +(0.73±0.22)% increase in profit per passenger session over the previous algorithm.
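
To make the idea of a temporal-difference update on driver values concrete, the following is a minimal tabular sketch and not the authors' implementation. The abstract does not specify the state representation, reward, learning rate, or discounting scheme, so everything named here is an assumption for illustration: the class `SupplyValueTable`, the string zone labels used as states, the fare used as the reward, and the per-minute discount `gamma`. Values start at zero to mirror the "no offline prior" setting.

```python
from collections import defaultdict


class SupplyValueTable:
    """Hypothetical tabular TD(0) estimate of the value of an available driver."""

    def __init__(self, alpha=0.1, gamma=0.99):
        self.alpha = alpha  # learning rate (assumed, not from the paper)
        self.gamma = gamma  # per-minute discount factor (assumed)
        # Unseen states default to 0.0, i.e. no offline prior.
        self.values = defaultdict(float)

    def action_value(self, fare, trip_minutes, dest_state):
        """Bellman-style estimate of assigning a ride to a driver:
        immediate reward plus the discounted value of the driver's
        predicted state at the request destination."""
        return fare + (self.gamma ** trip_minutes) * self.values[dest_state]

    def td_update(self, state, fare, trip_minutes, dest_state):
        """Move the value of `state` toward the observed one-step target
        and return the TD error."""
        target = self.action_value(fare, trip_minutes, dest_state)
        td_error = target - self.values[state]
        self.values[state] += self.alpha * td_error
        return td_error


# Usage: one observed trip from "downtown" to "airport".
table = SupplyValueTable(alpha=0.5, gamma=1.0)
error = table.td_update("downtown", fare=10.0, trip_minutes=15, dest_state="airport")
# A later request ending in "downtown" now scores higher because
# the driver is expected to be valuable there after drop-off.
score = table.action_value(fare=8.0, trip_minutes=5, dest_state="downtown")
```

Because each update uses only the most recent observed transition, estimates of this kind can track changing conditions online; a production system would need the additional safeguards the paper describes.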