Guaranteeing Control Requirements via Reward Shaping in Reinforcement Learning

When control problems such as regulation and tracking are addressed through reinforcement learning (RL), it is often necessary to guarantee, before deployment, that the acquired policy meets essential performance and stability criteria such as a desired settling time and steady-state error. Motivated by this, we present a set of results and a systematic reward-shaping procedure that: 1) ensures the optimal policy generates trajectories that align with specified control requirements and 2) allows one to assess whether any given policy satisfies them. We validate our approach through comprehensive numerical experiments conducted in two representative environments from OpenAI Gym: the Pendulum swing-up problem and the Lunar Lander. Using both tabular and deep RL methods, our experiments consistently confirm that the proposed framework ensures policy adherence to the prescribed control requirements.

Guaranteeing Control Requirements via Reward Shaping in Reinforcement Learning / De Lellis, Francesco; Coraggio, Marco; Russo, Giovanni; Musolesi, Mirco; di Bernardo, Mario. - In: IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY. - ISSN 1063-6536. - 32:6(2024), pp. 2102-2113. [10.1109/tcst.2024.3393210]

De Lellis, Francesco;Coraggio, Marco;Russo, Giovanni;Musolesi, Mirco;di Bernardo, Mario

2024

Year	2024
Journal	IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY
Appears in types	1.1 Journal article

Files in this record:

File	Size	Format
2024_Guaranteeing_Control_Requirements_via_Reward_Shaping_in_Reinforcement_Learning.pdf (authorized users only; license: not specified)	8.23 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/993158
