2OffRAN: Offline Off-Policy Reinforcement Learning for Safe Handover in O-RAN

Navarro, Annalisa; Botta, Alessio; Canonico, Roberto; Wang, Yizhou; Fitzek, Frank H. P.; Nguyen, Giang T.

doi:10.1109/icmlcn64995.2025.11140172

Handover is an inherent part of mobile communication systems, maintaining the connectivity of every User Equipment (UE). The rapid growth in the number of connected UEs and the trend toward densely deployed Base Stations (BSs) raise significant challenges for handover procedures. The OpenRAN framework, with its open architecture, offers a transformative opportunity to apply data-driven approaches, e.g., Reinforcement Learning (RL), to connection management. However, state-of-the-art RL-based solutions are typically designed to be evaluated directly in running networks, which can degrade network performance. In this paper, we propose the 2OffRAN framework, which combines offline training and Off-Policy Evaluation. 2OffRAN collects Key Performance Metrics and UE-to-BS allocation data from real networks that run well-established handover algorithms. It then uses the collected dataset to train a Deep Q-Learning algorithm for more efficient connection management, and builds a RAN dynamics model based on a Deep Neural Network to evaluate the RL algorithm's performance before deployment. Results show that 2OffRAN outperforms traditional handover strategies, improving throughput, user fairness, and load balancing while making the deployment of RL in the RAN safer.
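The two-stage idea in the abstract (train offline on logged data, then evaluate against a learned dynamics model before deployment) can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the three-level load state, the two actions, the reward values, the `toy_network_step` simulator that stands in for the live network, and the `legacy_policy` rule are all hypothetical. Tabular Q-learning replaces the Deep Q-Learning agent, and a count-based transition table replaces the Deep Neural Network dynamics model.

```python
import random

N_STATES, N_ACTIONS, GAMMA = 3, 2, 0.9  # toy sizes; actions: 0 = stay, 1 = hand over


def toy_network_step(state, action, rng):
    """Hypothetical stand-in for the live RAN; used only to produce the log."""
    if action == 1:                      # handover: fixed cost, UE lands on a lightly loaded BS
        return 0.7, 0
    reward = 1.0 - 0.5 * state           # throughput drops as the serving BS load grows
    next_state = min(state + 1, N_STATES - 1) if rng.random() < 0.7 else state
    return reward, next_state


def legacy_policy(state):
    """Stand-in for a well-established rule: hand over only at the highest load."""
    return 1 if state == N_STATES - 1 else 0


def collect_log(n_steps, rng, explore=0.2):
    """Log (s, a, r, s') tuples while the legacy rule (plus some randomness) runs."""
    log, state = [], 0
    for _ in range(n_steps):
        if rng.random() < explore:
            action = 1 if rng.random() < 0.5 else 0
        else:
            action = legacy_policy(state)
        reward, next_state = toy_network_step(state, action, rng)
        log.append((state, action, reward, next_state))
        state = next_state
    return log


def offline_q_learning(log, epochs=50, alpha=0.1):
    """Q-learning that only replays the fixed log; it never touches the network."""
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(epochs):
        for s, a, r, s2 in log:
            q[s][a] += alpha * (r + GAMMA * max(q[s2]) - q[s][a])
    return q


def fit_dynamics_model(log):
    """Count-based model: transition counts and reward sums per (state, action)."""
    counts = [[[0] * N_STATES for _ in range(N_ACTIONS)] for _ in range(N_STATES)]
    reward_sum = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for s, a, r, s2 in log:
        counts[s][a][s2] += 1
        reward_sum[s][a] += r
    return counts, reward_sum


def evaluate_in_model(policy, model, sweeps=500):
    """Estimate the discounted value of a policy using only the learned model."""
    counts, reward_sum = model
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        new_values = []
        for s in range(N_STATES):
            a = policy[s]
            n = sum(counts[s][a])
            if n == 0:
                raise ValueError("policy uses a (state, action) pair absent from the log")
            expected_next = sum(c * values[s2] for s2, c in enumerate(counts[s][a])) / n
            new_values.append(reward_sum[s][a] / n + GAMMA * expected_next)
        values = new_values
    return values


rng = random.Random(0)
log = collect_log(5000, rng)
q_table = offline_q_learning(log)
learned = [max(range(N_ACTIONS), key=lambda a: q_table[s][a]) for s in range(N_STATES)]
legacy = [legacy_policy(s) for s in range(N_STATES)]
model = fit_dynamics_model(log)
value_learned = evaluate_in_model(learned, model)[0]
value_legacy = evaluate_in_model(legacy, model)[0]
print("learned policy:", learned)
print("model-estimated value from state 0 (learned vs legacy):", value_learned, value_legacy)
```

In this sketch the comparison of `value_learned` against `value_legacy` happens entirely inside the learned model, so a candidate policy that looks worse than the legacy rule could be rejected without ever being run on the network. The estimate is only as reliable as the log's coverage, which is why the sketch raises an error when a policy relies on a state-action pair that was never observed.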

2OffRAN: Offline Off-Policy Reinforcement Learning for Safe Handover in O-RAN / Navarro, Annalisa; Botta, Alessio; Canonico, Roberto; Wang, Yizhou; Fitzek, Frank H. P.; Nguyen, Giang T. - (2025), pp. 1-6. (2nd IEEE International Conference on Machine Learning for Communication and Networking, ICMLCN 2025, Spain, 2025) [10.1109/icmlcn64995.2025.11140172].