Safety through Intrinsically Motivated Imitation Learning - 20èmes Rencontres des Jeunes Chercheurs en Intelligence Artificielle
Conference paper, 2022


Abstract

Deep Reinforcement Learning methods require a large amount of data to achieve good performance, and this becomes even more challenging in real-world domains with high-dimensional state spaces. However, historical interactions with the environment can boost the learning process. With this in mind, we propose an imitation learning strategy that uses previously collected data as a baseline for density-based action selection. We then augment the reward according to the likelihood of the state under a distribution estimated from the demonstrations. The idea is to avoid exhaustive exploration by restricting the set of state-action pairs and to encourage policy convergence toward states that lie in high-density regions. The adopted scenario is pump scheduling for a water distribution system, where both real-world data and a simulator are available. The empirical results show that our strategy produces policies that outperform the behavioral policy and offline methods, and that the proposed reward functions lead to competitive performance compared to the real-world operation.
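The density-based reward augmentation described above can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: it assumes the demonstration-state distribution is modeled with a simple Gaussian kernel density estimate, and the names (`make_density_bonus`, `augmented_reward`, the `bandwidth` and `beta` parameters) are invented for the example.

```python
import numpy as np

def make_density_bonus(demo_states, bandwidth=0.5):
    """Fit a Gaussian kernel density estimate on demonstration states and
    return a function scoring new states by their log-likelihood.
    (Hypothetical sketch: the paper's exact density model is not specified here.)"""
    demo = np.asarray(demo_states, dtype=float)
    n, d = demo.shape
    # Normalizing constant of an isotropic Gaussian kernel mixture.
    norm = 1.0 / (n * (bandwidth * np.sqrt(2 * np.pi)) ** d)

    def log_density(state):
        diff = (demo - np.asarray(state, dtype=float)) / bandwidth
        kernels = np.exp(-0.5 * np.sum(diff ** 2, axis=1))
        # Small epsilon avoids log(0) for states far from every demonstration.
        return np.log(norm * np.sum(kernels) + 1e-12)

    return log_density

def augmented_reward(env_reward, state, log_density, beta=0.1):
    """Extrinsic reward plus a density-based intrinsic term, encouraging the
    policy to stay in regions well covered by the demonstrations."""
    return env_reward + beta * log_density(state)
```

States close to the demonstration data receive a higher intrinsic bonus than states far from it, so the shaped reward steers the policy toward regions the historical operation has already shown to be safe.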
Main file: RJCIA22_paper18.pdf (1.09 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03765564 , version 1 (31-08-2022)

Identifiers

  • HAL Id : hal-03765564 , version 1

Cite

Henrique Donancio, Laurent Vercouter. Safety through Intrinsically Motivated Imitation Learning. 20èmes Rencontres des Jeunes Chercheurs en Intelligence Artificielle, Jun 2022, Saint-Etienne, France. ⟨hal-03765564⟩