Reinforcement Learning-Based Dynamic Order Recommendation for On-Demand Food Delivery
Author affiliations: Department of Automation, Tsinghua University, Beijing 100080, China; School of Mechanical and Automotive Engineering, Qingdao Hengxing University of Science and Technology, Qingdao 266100, China; Meituan, Beijing 100015, China
Publication: Tsinghua Science and Technology (Journal of Tsinghua University, Natural Science Edition, English Edition)
Year, volume, issue: 2024, Vol. 29, No. 2
Pages: 356-367
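Subject classification: 1304 [Art: Fine Arts]; 12 [Management]; 13 [Art]; 1201 [Management: Management Science and Engineering (degrees in management or engineering may be conferred)]; 081104 [Engineering: Pattern Recognition and Intelligent Systems]; 08 [Engineering]; 080203 [Engineering: Mechanical Design and Theory]; 0835 [Engineering: Software Engineering]; 0802 [Engineering: Mechanical Engineering]; 0811 [Engineering: Control Science and Engineering]; 0812 [Engineering: Computer Science and Technology (degrees in engineering or science may be conferred)]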
Funding: Supported in part by the National Natural Science Foundation of China (No. 62273193), the Tsinghua University-Meituan Joint Institute for Digital Life, and the Research and Development Project of CRSC Research & Design Institute Group Co., Ltd.
Keywords: on-demand food delivery; order recommendation; reinforcement learning; actor-critic network; long short term memory
Abstract: On-demand food delivery (OFD) is gaining more and more popularity in modern society. As a core order assignment mode in the OFD scenario, order recommendation directly influences the delivery efficiency of the platform and the delivery experience of riders. This paper addresses the dynamism of the order recommendation problem and proposes a reinforcement learning solution method. An actor-critic network based on a long short term memory (LSTM) unit is designed to deal with the order-grabbing conflict between different riders. In addition, three rider sequencing rules are accordingly proposed to match different time steps of the LSTM unit with different riders. To test the performance of the proposed method, extensive experiments are conducted based on real data from the Meituan delivery platform. The results demonstrate that the proposed reinforcement learning based order recommendation method can significantly increase the number of grabbed orders and reduce the number of order-grabbing conflicts, resulting in better delivery efficiency and experience for the platform and riders.
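The idea of matching LSTM time steps to riders can be illustrated with a toy sketch. This is not the paper's network: the abstract does not give the input features, the output heads, or the three sequencing rules, so everything below is an assumption made for illustration. That includes the class name `LSTMActorCritic`, the helper `sequence_riders`, the feature layout, the sort-by-first-feature rule, and the random, untrained weights. It only shows the general pattern: riders are put in a sequence, each rider is fed to one LSTM time step, and the recurrent state lets the recommendation for a later rider depend on what was computed for earlier ones.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class LSTMActorCritic:
    """Toy LSTM actor-critic with one time step per rider (hypothetical)."""

    def __init__(self, feat_dim, hidden_dim, n_orders, seed=0):
        rng = np.random.default_rng(seed)
        # Stacked weights for the input, forget, output and candidate gates.
        self.W = rng.normal(0.0, 0.1, (4 * hidden_dim, feat_dim + hidden_dim))
        self.b = np.zeros(4 * hidden_dim)
        # Actor head: one score per candidate order. Critic head: scalar value.
        self.W_actor = rng.normal(0.0, 0.1, (n_orders, hidden_dim))
        self.w_critic = rng.normal(0.0, 0.1, hidden_dim)
        self.hidden_dim = hidden_dim

    def step(self, x, h, c):
        # Standard LSTM cell update for a single time step.
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        return h, c

    def forward(self, rider_feats):
        # rider_feats: (n_riders, feat_dim), already in sequence order.
        h = np.zeros(self.hidden_dim)
        c = np.zeros(self.hidden_dim)
        policies, values = [], []
        for x in rider_feats:
            h, c = self.step(x, h, c)
            policies.append(softmax(self.W_actor @ h))  # distribution over orders
            values.append(float(self.w_critic @ h))     # critic estimate
        return np.array(policies), np.array(values)

def sequence_riders(rider_feats, key_col=0):
    # Hypothetical sequencing rule: ascending by one feature column.
    order = np.argsort(rider_feats[:, key_col], kind="stable")
    return order, rider_feats[order]

rng = np.random.default_rng(1)
riders = rng.random((5, 3))            # 5 riders, 3 made-up features each
order, sorted_riders = sequence_riders(riders)
model = LSTMActorCritic(feat_dim=3, hidden_dim=8, n_orders=4)
policies, values = model.forward(sorted_riders)
print("rider sequence:", order)
print("per-rider order distributions:")
print(policies.round(3))
```

Each row of `policies` is a probability distribution over the four candidate orders for the rider at that time step, and `values` holds the corresponding critic estimates. In a trained system the weights would be learned from grabbing outcomes; here they are random, so the numbers carry no meaning beyond showing the data flow.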