ICLR 2026
GRL-SNAM: Geometric Reinforcement Learning with Differential Hamiltonians for Navigation and Mapping in Unknown Environments
Aditya Sai Ellendula, Yi Wang, Minh Phuong Nguyen, Chandrajit L. Bajaj
Cited by 2
Abstract
We present GRL-SNAM, a geometric reinforcement learning framework for Simultaneous Navigation and Mapping (SNAM) in unknown environments. SNAM is challenging because it requires designing hierarchical or joint policies for multiple agents that together steer a real robot toward a goal in a mapless environment, i.e., one whose map is not available a priori and must be acquired through sensors. The path learner (the navigator) invokes the sensors along the motion path by issuing active queries to sensory agents. GRL-SNAM differs from preemptive navigation algorithms and other reinforcement learning methods by relying exclusively on local sensory observations, without constructing a global map. Our approach formulates path navigation and mapping as a dynamic shortest-path search and discovery process using controlled Hamiltonian optimization: sensory inputs are translated into local energy landscapes that encode reachability, obstacle barriers, and deformation constraints, while policies for sensing, planning, and reconfiguration evolve stagewise by updating Hamiltonians. A reduced Hamiltonian serves as an adaptive score function, updating kinetic and potential terms, embedding barrier constraints, and continuously refining trajectories as new local information arrives. We evaluate GRL-SNAM on two 2D navigation tasks. To show that our geometric RL policies decompose naturally and induce a hierarchy, we build a hyperelastic robot that learns to squeeze through narrow gaps, detour around obstacles, and generalize to unseen environments. To show that GRL-SNAM generalizes to indoor scene layouts, we build a point-navigation system for unseen indoor maze layouts. Compared against local reactive baselines (PF, CBF, staged DWA, staged PPO) and global policy learning references (A*, PPO, SAC) under identical stagewise sensing constraints, GRL-SNAM maintains path quality while using minimal map coverage.
It preserves clearance, generalizes to unseen layouts, and demonstrates that geometric RL via Hamiltonian updates enables high-quality navigation with minimal exploration, through local energy refinement rather than extensive global mapping. The code is publicly available on GitHub.
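
To make the idea of a local energy landscape with barrier terms concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a point robot in 2D, a Hamiltonian H(q, p) = |p|^2 / (2m) + V(q), a quadratic goal potential, a logarithmic barrier around each obstacle inside a fixed sensing radius, and a damped semi-implicit Euler integrator. None of the learned, stagewise Hamiltonian updates described in the abstract are modeled. All names and constants (`SENSE_RADIUS`, `R_SAFE`, `K_GOAL`, `K_BAR`, `DAMPING`, `rollout`, and so on) are hypothetical choices made for this example.

```python
import math

# Hypothetical constants chosen for illustration only (not from the paper).
MASS, DAMPING = 1.0, 3.0          # particle mass and friction coefficient
SENSE_RADIUS, R_SAFE = 1.5, 0.3   # local sensing range and required clearance
K_GOAL, K_BAR = 1.0, 1.0          # goal-attraction and barrier weights

def sensed(q, obstacles):
    """Keep only obstacles inside the local sensing radius (no global map)."""
    return [o for o in obstacles if math.dist(q, o) < SENSE_RADIUS]

def potential(q, goal, local_obs):
    """V(q): quadratic goal term plus a log barrier per sensed obstacle.
    The barrier is zero at the sensing radius and diverges at R_SAFE."""
    v = 0.5 * K_GOAL * math.dist(q, goal) ** 2
    for o in local_obs:
        gap = math.dist(q, o) - R_SAFE
        if gap <= 0.0:
            return math.inf
        v -= K_BAR * math.log(gap / (SENSE_RADIUS - R_SAFE))
    return v

def grad_potential(q, goal, local_obs):
    """Analytic gradient of V(q); assumes every sensed obstacle has gap > 0."""
    gx = K_GOAL * (q[0] - goal[0])
    gy = K_GOAL * (q[1] - goal[1])
    for ox, oy in local_obs:
        dx, dy = q[0] - ox, q[1] - oy
        d = math.hypot(dx, dy)
        coeff = -K_BAR / ((d - R_SAFE) * d)
        gx += coeff * dx
        gy += coeff * dy
    return gx, gy

def hamiltonian(q, p, goal, local_obs):
    """H(q, p) = kinetic energy + local potential energy."""
    kinetic = (p[0] ** 2 + p[1] ** 2) / (2.0 * MASS)
    return kinetic + potential(q, goal, local_obs)

def rollout(start, goal, obstacles, steps=6000, dt=0.005):
    """Damped semi-implicit Euler: update momentum first, then position.
    Returns the final position and the smallest sensed obstacle distance."""
    q, p = list(start), [0.0, 0.0]
    min_clear = math.inf
    for _ in range(steps):
        local_obs = sensed(q, obstacles)   # re-sense locally at every step
        for o in local_obs:
            d = math.dist(q, o)
            if d <= R_SAFE:
                raise RuntimeError("clearance violated; reduce dt")
            min_clear = min(min_clear, d)
        gx, gy = grad_potential(q, goal, local_obs)
        p[0] += dt * (-gx - DAMPING * p[0])
        p[1] += dt * (-gy - DAMPING * p[1])
        q[0] += dt * p[0] / MASS
        q[1] += dt * p[1] / MASS
    return (q[0], q[1]), min_clear

if __name__ == "__main__":
    final_q, min_clear = rollout((0.0, 0.0), (4.0, 0.0), [(2.0, 0.2)])
    print("final position:", final_q, "min clearance:", min_clear)
```

In this toy setup the obstacle sits almost on the straight line to the goal, so the barrier gradient deflects the trajectory around it while the damping term dissipates energy until the particle settles at the goal. A fixed hand-written potential like this one can trap the particle in local minima in cluttered scenes, which is one reason to treat it only as an illustration of the energy terms named in the abstract.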