Safety through Intrinsically Motivated Imitation Learning - 20èmes Rencontres des Jeunes Chercheurs en Intelligence Artificielle
Conference paper, 2022


Abstract

Deep Reinforcement Learning methods require a large amount of data to achieve good performance, and this becomes even more challenging in real-world domains with high-dimensional state spaces. However, historical interactions with the environment can boost the learning process. With this in mind, we propose an imitation learning strategy that uses previously collected data as a baseline for density-based action selection. We then augment the reward according to the likelihood of the state under a distribution estimated from the demonstrations. The idea is to avoid exhaustive exploration by restricting the set of state-action pairs and to encourage policy convergence toward states that lie in high-density regions. The adopted scenario is pump scheduling for a water distribution system, where both real-world data and a simulator are available. The empirical results show that our strategy produces policies that outperform the behavioral policy and offline methods, and that the proposed reward functions lead to competitive performance compared to the real-world operation.
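The density-based reward augmentation described above can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: it assumes the demonstration-state distribution is modeled with a simple Gaussian kernel density estimate, and the names (`make_density_bonus`, `augmented_reward`, the `bandwidth` and `beta` parameters) are invented for the example.

```python
import numpy as np

def make_density_bonus(demo_states, bandwidth=0.5):
    """Fit a Gaussian kernel density estimate on demonstration states and
    return a function scoring new states by their log-likelihood.
    (Hypothetical sketch: the paper's exact density model is not specified here.)"""
    demo = np.asarray(demo_states, dtype=float)
    n, d = demo.shape
    # Normalizing constant of an isotropic Gaussian kernel mixture.
    norm = 1.0 / (n * (bandwidth * np.sqrt(2 * np.pi)) ** d)

    def log_density(state):
        diff = (demo - np.asarray(state, dtype=float)) / bandwidth
        kernels = np.exp(-0.5 * np.sum(diff ** 2, axis=1))
        # Small epsilon avoids log(0) for states far from every demonstration.
        return np.log(norm * np.sum(kernels) + 1e-12)

    return log_density

def augmented_reward(env_reward, state, log_density, beta=0.1):
    """Extrinsic reward plus a density-based intrinsic term, encouraging the
    policy to stay in regions well covered by the demonstrations."""
    return env_reward + beta * log_density(state)
```

States close to the demonstration data receive a higher intrinsic bonus than states far from it, so the shaped reward steers the policy toward regions the historical operation has already shown to be safe.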
Main file: RJCIA22_paper18.pdf (1.09 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03765564 , version 1 (31-08-2022)

Identifiers

  • HAL Id : hal-03765564 , version 1

Cite

Henrique Donancio, Laurent Vercouter. Safety through Intrinsically Motivated Imitation Learning. 20èmes Rencontres des Jeunes Chercheurs en Intelligence Artificielle, Jun 2022, Saint-Etienne, France. ⟨hal-03765564⟩