Marchesano, M. G.; Staiano, L.; Guizzi, G.; Castellano, D.; Popolo, V. (2022). Deep Reinforcement Learning Approach for Maintenance Planning in a Flow-Shop Scheduling Problem. In: Proceedings of the 21st International Conference on New Trends in Intelligent Software Methodologies, Tools and Techniques (SoMeT 2022), Japan, vol. 355, pp. 385-399. DOI: 10.3233/FAIA220268.
Deep Reinforcement Learning Approach for Maintenance Planning in a Flow-Shop Scheduling Problem
Marchesano M. G.; Staiano L.; Guizzi G.; Castellano D.; Popolo V.
2022
Abstract
Deep Reinforcement Learning (DRL) has been applied to production systems for multiple objectives, including control, scheduling, and maintenance planning. Maintenance must be planned sensibly and economically to preserve the useful life of production systems without sacrificing productivity, thereby minimising costs and losses. In this work, a hybrid simulation-based and DRL approach is employed to develop an agent that autonomously determines when to perform preventive maintenance by considering the failure probability at a given instant and the time elapsed since the last maintenance operation. The novelty of this approach lies in the configuration of the DRL setting, in particular the reward function. Compared with a heuristic from the literature, the results are promising: they show that the frequency of machine failures is dramatically reduced.
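To make the described setting concrete, below is a minimal sketch of the kind of environment the abstract implies: the state combines the current failure probability and the time since the last maintenance, and the agent chooses between producing and performing preventive maintenance. The Weibull-style hazard, the cost parameters, and the simple threshold policy used in the demo are illustrative assumptions of ours, not the paper's actual model or its reward function.

```python
import math
import random

class MaintenanceEnv:
    """Illustrative preventive-maintenance environment (not the paper's model).

    State: (failure probability at the current step, time since last
    maintenance). The failure probability follows an assumed discrete-time
    Weibull hazard that grows with time since the last renewal.
    """

    def __init__(self, shape=2.0, scale=50.0, horizon=200,
                 pm_cost=1.0, failure_cost=10.0, production_reward=0.1):
        self.shape = shape                      # Weibull shape (>1: wear-out)
        self.scale = scale                      # Weibull scale (characteristic life)
        self.horizon = horizon                  # episode length in time steps
        self.pm_cost = pm_cost                  # cost of a preventive action
        self.failure_cost = failure_cost        # cost of an unplanned failure
        self.production_reward = production_reward  # reward per productive step

    def _failure_prob(self, t):
        # Probability of failing in the next step, given survival up to t,
        # derived from a Weibull hazard rate (an assumption for this sketch).
        h = (self.shape / self.scale) * ((t / self.scale) ** (self.shape - 1))
        return 1.0 - math.exp(-h)

    def reset(self):
        self.step_count = 0
        self.time_since_pm = 0
        return (self._failure_prob(self.time_since_pm), self.time_since_pm)

    def step(self, action):
        """action: 0 = keep producing, 1 = perform preventive maintenance."""
        reward = 0.0
        if action == 1:
            self.time_since_pm = 0              # maintenance renews the machine
            reward -= self.pm_cost
        elif random.random() < self._failure_prob(self.time_since_pm):
            reward -= self.failure_cost         # unplanned breakdown
            self.time_since_pm = 0              # corrective repair also renews it
        else:
            reward += self.production_reward    # a productive step
            self.time_since_pm += 1
        self.step_count += 1
        done = self.step_count >= self.horizon
        state = (self._failure_prob(self.time_since_pm), self.time_since_pm)
        return state, reward, done

if __name__ == "__main__":
    env = MaintenanceEnv()
    state, total, done = env.reset(), 0.0, False
    while not done:
        # Simple threshold rule as a stand-in for the learned DRL policy.
        action = 1 if state[0] > 0.05 else 0
        state, reward, done = env.step(action)
        total += reward
    print(f"episode return: {total:.2f}")
```

In the paper's approach, a DRL agent trained against a simulation of the flow shop would replace the threshold rule above, with a reward function configured to trade off preventive-maintenance costs against failure losses.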


