Grounding hindsight instructions in multi-goal reinforcement learning for robotics
Publication Type
Conference Paper
Publication Date
2022-09
Language
English
Institute
Start Page
170
End Page
177
Citation
2022 IEEE International Conference on Development and Learning (ICDL) : 12-15 Sept. 2022 : proceedings. - pp. 170-177
Contribution to Conference
Publisher DOI
Scopus ID
ArXiv ID
Publisher
IEEE
Peer Reviewed
true
This paper focuses on robotic reinforcement learning with sparse rewards for natural language goal representations. An open problem is the sample inefficiency that stems from the compositionality of natural language and from the grounding of language in sensory data and actions. We address these issues with three contributions. First, we present a mechanism for hindsight instruction replay that utilizes expert feedback. Second, we propose a seq2seq model to generate linguistic hindsight instructions. Finally, we present a novel class of language-focused learning tasks. We show that hindsight instructions improve the learning performance, as expected. We also report an unexpected result: the learning performance of our agent can be improved by one third if, in a sense, the agent learns to talk to itself in a self-supervised manner. We achieve this by learning to generate linguistic instructions that would have been appropriate as a natural language goal for an originally unintended behavior. Our results indicate that the performance gain increases with task complexity.
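The hindsight instruction replay idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Transition` fields, the `relabel_with_hindsight` helper, the `describe` callback (a stand-in for either expert feedback or a learned seq2seq generator), and the 1.0/0.0 sparse reward convention are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Transition:
    """One step of an episode (hypothetical structure for this sketch)."""
    next_observation: str  # placeholder for the sensory state after the action
    action: str
    instruction: str       # natural language goal the agent was given
    reward: float          # assumed sparse reward: 1.0 on success, else 0.0


def relabel_with_hindsight(
    episode: List[Transition],
    describe: Callable[[str], Optional[str]],
) -> List[Transition]:
    """Return a relabeled copy of the episode, or [] if nothing can be described.

    `describe` maps the final observation to an instruction that would have
    been an appropriate goal for the behavior that actually occurred. It
    returns None when the outcome matches no expressible instruction.
    """
    if not episode:
        return []
    hindsight = describe(episode[-1].next_observation)
    if hindsight is None:
        return []
    last = len(episode) - 1
    # Swap in the hindsight instruction and mark only the final step as a success.
    return [
        replace(t, instruction=hindsight, reward=1.0 if i == last else 0.0)
        for i, t in enumerate(episode)
    ]


def toy_describe(observation: str) -> Optional[str]:
    """Toy lookup standing in for an expert or a learned instruction generator."""
    return {"red button pressed": "press the red button"}.get(observation)


# The agent was told to push the blue cube but pressed the red button instead.
episode = [
    Transition("idle", "reach", "push the blue cube", 0.0),
    Transition("red button pressed", "press", "push the blue cube", 0.0),
]
relabeled = relabel_with_hindsight(episode, toy_describe)
```

In a full training loop, both the original and the relabeled transitions would typically be stored in the replay buffer, so that an otherwise unrewarded episode still yields a learning signal for the instruction it happened to satisfy.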
Keywords
reinforcement learning
language grounding
instruction following
hindsight instruction
human-robot interaction
DDC Class
004: Computer Science
600: Technology
620: Engineering
Funding Organisations