ICLR 2023

Graph-based Deterministic Policy Gradient for Repetitive Combinatorial Optimization Problems

Zhongyuan Zhao, Ananthram Swami, Santiago Segarra

Abstract

Characteristics & Challenges

1. The network state at t+1 depends on the decisions made at t
2. The cost vector c changes rapidly compared to the network topology
3. Dynamic network topology
4. Practical restrictions: limited runtime and/or distributed execution

Graph-based Markov decision process (MDP)

[Figure: a weighted graph (nodes, edges, cost vector) evolving over time steps t, t+1, t+2, …]

Applications

Routing & scheduling in communication networks
Multi-object tracking in computer vision
Vehicle routing problems in distribution networks
Resource allocation & job scheduling in cloud, fog, and edge computing
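The graph-based MDP can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names `GraphState`, `greedy_decision`, and `step` are hypothetical, the per-step decision is a simple greedy pick of non-adjacent nodes by cost (a stand-in solver, not the authors' method), and the cost of a node is modeled as a backlog that shrinks by one when the node is chosen and grows with new arrivals. The sketch only shows how the cost vector at t+1 depends on the decision at t while the topology stays fixed.

```python
from dataclasses import dataclass


@dataclass
class GraphState:
    """State of the toy graph-based MDP at one time step."""
    nodes: list   # node identifiers
    edges: list   # (u, v) pairs; assumed to change slowly (fixed here)
    cost: dict    # node -> cost; changes at every step


def greedy_decision(state):
    """Hypothetical per-step solver: greedily pick non-adjacent nodes,
    visiting nodes in order of decreasing cost (ties broken by node id)."""
    neighbors = {v: set() for v in state.nodes}
    for u, v in state.edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    chosen = set()
    for v in sorted(state.nodes, key=lambda n: (-state.cost[n], n)):
        if state.cost[v] > 0 and not (neighbors[v] & chosen):
            chosen.add(v)
    return chosen


def step(state, decision, arrivals):
    """Assumed transition: each chosen node loses one unit of cost,
    then new arrivals are added. The topology is left unchanged."""
    new_cost = {}
    for v in state.nodes:
        served = 1 if v in decision else 0
        new_cost[v] = max(state.cost[v] - served, 0) + arrivals.get(v, 0)
    return GraphState(state.nodes, state.edges, new_cost)


# Usage: a 3-node path graph 0 - 1 - 2.
s0 = GraphState(nodes=[0, 1, 2], edges=[(0, 1), (1, 2)], cost={0: 3, 1: 5, 2: 2})
d0 = greedy_decision(s0)             # decision at time t
s1 = step(s0, d0, arrivals={0: 1})   # state at time t+1
d1 = greedy_decision(s1)             # decision at time t+1
```

In this toy run, choosing node 1 at time t lowers its cost, so the decision at t+1 differs. A solver that treats each step in isolation ignores this coupling between consecutive instances.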