SoNIC:
Safe Social Navigation with Adaptive Conformal Inference and Constrained Reinforcement Learning


A novel algorithm for social navigation that combines adaptive conformal inference and constrained reinforcement learning with spatial relaxation, achieving state-of-the-art results in generating safe robot trajectories that adhere to social norms.


SoNIC Performance in Different Environments

Abstract

Reinforcement Learning (RL) has enabled social robots to generate trajectories without human-designed rules or interventions, which makes it more effective than hardcoded systems at generalizing to complex real-world scenarios. However, social navigation is a safety-critical task that requires robots to avoid collisions with pedestrians, and previous RL-based solutions fall short in safety performance in complex environments. To enhance the safety of RL policies, we propose SoNIC, which is, to the best of our knowledge, the first algorithm that integrates adaptive conformal inference (ACI) with constrained reinforcement learning (CRL) to learn safe policies for social navigation. More specifically, our method augments RL observations with ACI-generated nonconformity scores and provides explicit guidance for agents to leverage the uncertainty metrics to avoid safety-critical areas by incorporating safety constraints with spatial relaxation. Our method outperforms state-of-the-art baselines in terms of both safety and adherence to social norms by a large margin and demonstrates much stronger robustness to out-of-distribution scenarios.


Key Ideas and Contributions


1) Framework with ACI and CRL: We develop a novel framework that integrates nonconformity scores generated by ACI with CRL, which not only enhances the observation of RL agents but also directly guides the learning process of RL agents.
2) Spatial Relaxation: We propose a technique to increase the applicability of CRL in the context of social navigation by introducing spatial relaxation. Compared to previous methods, spatial relaxation provides richer cost feedback and facilitates convergence without sacrificing safety.
3) Performance Boost: Our method achieves state-of-the-art (SOTA) performance in social navigation in both safety and adherence to social norms, outperforming baselines by a large margin. Our method also shows much stronger robustness to out-of-distribution (OOD) scenarios.
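
As a rough illustration of the ACI ingredient named in the contributions above, the following Python sketch applies the standard ACI update rule, alpha_{t+1} = alpha_t + gamma * (target_alpha - err_t), to a stream of nonconformity scores (for example, the distance between a predicted and an observed pedestrian position). It is not taken from the SoNIC codebase: the names `aci_step`, `run_aci`, `target_alpha`, `gamma`, and `window` are hypothetical, and the exact ACI variant and score definition used by SoNIC are those given in the paper.

```python
import numpy as np


def aci_step(alpha_t, past_scores, new_score, target_alpha=0.1, gamma=0.05):
    """One step of the standard ACI update (illustrative, not SoNIC's code).

    Returns (radius, err, alpha_next), where radius is the uncertainty bound
    issued before new_score is observed.
    """
    if alpha_t <= 0.0 or len(past_scores) == 0:
        # Requested coverage of 100% or more (or no history): unbounded radius.
        radius = float("inf")
    elif alpha_t >= 1.0:
        # Requested coverage of 0% or less: empty prediction region.
        radius = float("-inf")
    else:
        # Empirical (1 - alpha_t) quantile of previously observed scores.
        radius = float(np.quantile(past_scores, 1.0 - alpha_t))
    # err = 1 when the new score falls outside the predicted region.
    err = 1.0 if new_score > radius else 0.0
    # Misses lower alpha (wider regions next time); hits raise it slightly.
    alpha_next = alpha_t + gamma * (target_alpha - err)
    return radius, err, alpha_next


def run_aci(scores, target_alpha=0.1, gamma=0.05, window=50):
    """Run ACI over a score stream, using a sliding window of past scores."""
    alpha_t = target_alpha
    radii, errs = [], []
    for t, score in enumerate(scores):
        history = scores[max(0, t - window):t]
        radius, err, alpha_t = aci_step(alpha_t, history, score, target_alpha, gamma)
        radii.append(radius)
        errs.append(err)
    return radii, errs
```

In a navigation setting, the per-step `radius` is the kind of uncertainty quantity that could be appended to the agent's observation or used to inflate the region around a predicted pedestrian position. How SoNIC turns these scores into constraint costs with spatial relaxation is specified in the paper and is not reproduced here.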



Test Results in In-distribution and OOD Settings


Quantitative Analysis: In this paper, we validated SoNIC in both in-distribution and OOD settings. The experimental results demonstrate that SoNIC achieves SOTA performance in both safety and adherence to social norms and shows strong robustness to OOD scenarios. Please refer to the paper for more details.


Citation