SoNIC:
Safe Social Navigation with Adaptive Conformal Inference and Constrained Reinforcement Learning


The integration of adaptive conformal inference and constrained reinforcement learning enables social robots to exhibit courteous yielding behavior and make safe decisions in human crowds.



Abstract

Reinforcement learning (RL) enables social robots to generate trajectories without relying on human-designed rules or interventions, making it generally more effective than rule-based systems in adapting to complex, dynamic real-world scenarios. However, social navigation is a safety-critical task that requires robots to avoid collisions with pedestrians, and existing RL-based solutions often fall short of ensuring safety in complex environments. In this paper, we propose SoNIC, which to the best of our knowledge is the first algorithm that integrates adaptive conformal inference (ACI) with constrained reinforcement learning (CRL) to enable safe policy learning for social navigation. Specifically, our method not only augments RL observations with ACI-generated nonconformity scores, which inform the agent of the quantified uncertainty, but also uses these uncertainty estimates to guide the behavior of RL agents through constrained reinforcement learning. This integration regulates the behavior of RL agents and enables them to handle safety-critical situations. On the standard CrowdNav benchmark, our method achieves a success rate of 96.93%, which is 11.67% higher than the previous state-of-the-art RL method. It also results in 4.5 times fewer collisions and 2.8 times fewer intrusions into ground-truth human future trajectories, and shows enhanced robustness in out-of-distribution scenarios. To further validate our approach, we deploy our algorithm on a real robot by developing a ROS2-based navigation system. Our experiments demonstrate that the system makes robust and socially polite decisions when interacting with both sparse and dense crowds.


Key Ideas and Contributions


1) Framework with ACI and CRL: We develop a novel framework that integrates nonconformity scores generated by ACI with CRL, which not only enriches the observations of RL agents but also directly guides their learning process.
2) Spatial Relaxation: We propose a technique to increase the applicability of CRL in the context of social navigation by introducing spatial relaxation. Compared to previous methods, spatial relaxation provides richer cost feedback and facilitates convergence without sacrificing safety.
3) Performance Boost: Our method achieves state-of-the-art (SOTA) performance in social navigation in both safety and adherence to social norms, outperforming baselines by a large margin. Our method also shows much stronger robustness to out-of-distribution (OOD) scenarios.
4) Real Robot Deployment: We integrate our approach into a ROS2-based navigation system and deploy it on real robots with Mecanum kinematics. The robot equipped with our method shows polite and safe behaviors in both sparse and densely crowded environments.
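
For readers unfamiliar with ACI, the sketch below illustrates the generic online update rule from the adaptive conformal inference literature (Gibbs and Candès), not the SoNIC implementation itself. It assumes the nonconformity score is a scalar prediction error, for example the distance between a predicted and an observed pedestrian position. The names `aci_step`, `target_alpha`, and `gamma`, and all numeric values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def aci_step(alpha_t, target_alpha, gamma, past_scores, new_score):
    """One generic ACI update (illustrative; not the paper's code).

    alpha_t      : current working miscoverage level
    target_alpha : desired long-run miscoverage rate
    gamma        : step size of the online update
    past_scores  : previously observed nonconformity scores
    new_score    : nonconformity score observed at the current step
    """
    # Clip so the quantile level stays inside [0, 1] even if alpha_t drifts.
    level = float(np.clip(1.0 - alpha_t, 0.0, 1.0))
    # Uncertainty radius: empirical quantile of the past scores.
    radius = float(np.quantile(past_scores, level))
    # err = 1 when the new observation falls outside the predicted region.
    err = 1.0 if new_score > radius else 0.0
    # Misses lower alpha (wider regions next time); hits raise it slightly.
    alpha_next = alpha_t + gamma * (target_alpha - err)
    return radius, err, alpha_next


if __name__ == "__main__":
    # Synthetic prediction errors standing in for real trajectory errors.
    rng = np.random.default_rng(0)
    scores = list(np.abs(rng.normal(0.0, 0.3, size=50)))
    alpha = 0.1
    for _ in range(200):
        new_score = abs(rng.normal(0.0, 0.3))
        radius, err, alpha = aci_step(alpha, 0.1, 0.05, scores, new_score)
        scores.append(new_score)
    print(f"final radius: {radius:.3f}, working alpha: {alpha:.3f}")
```

In a pipeline like the one described above, a quantity such as `radius` could be appended to the agent's observation and used when computing a safety cost for the constrained learner; how SoNIC does this precisely is described in the paper.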



Test Results in In-Distribution and OOD Settings


Quantitative Analysis: We validate SoNIC in both in-distribution and OOD settings. The experimental results demonstrate that SoNIC achieves SOTA performance in both safety and adherence to social norms and shows strong robustness to OOD scenarios. Please refer to the paper for more details.


Citation