Reinforcement Learning Agents for Interacting with Humans

Ido Shapira, Amos Azaria

Research output: Contribution to conference > Paper > peer-review

2 Scopus citations

Abstract

We tackle the problem of an agent interacting with humans in a general-sum environment, i.e., a non-zero-sum, non-fully-cooperative setting, where the agent's goal is to increase its own utility. We show that when data is limited, building an accurate human model is very challenging, and that a reinforcement learning agent based on this data does not perform well in practice. Therefore, we propose that the agent should try to maximize a linear combination of the human's utility and its own utility, rather than only its own utility. We provide a formula to compute what we believe to be the optimal trade-off between the human's and the agent's utility when attempting to maximize the agent's utility. We evaluate our proposed method in two different domains. Our proposed agent not only maximizes the social welfare of both the human and the autonomous agent, but also performs significantly better, in terms of the agent's own utility, than agents that do not account for the human's utility function.
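As a rough illustration of the linear-combination objective described in the abstract, the following minimal sketch scores each action by the agent's utility plus a weighted share of the human's utility. The abstract does not state the paper's trade-off formula, so `weight` is a hypothetical free parameter here, and the function names and payoff values are made up for illustration only.

```python
def combined_utility(agent_utility, human_utility, weight):
    """Agent's own utility plus a weighted share of the human's utility.

    `weight` is a hypothetical trade-off parameter; the paper's formula
    for choosing it is not reproduced here.
    """
    return agent_utility + weight * human_utility


def best_action(payoffs, weight):
    """Return the action whose (agent, human) payoff pair maximizes
    the combined objective."""
    return max(payoffs, key=lambda action: combined_utility(*payoffs[action], weight))


# Made-up payoffs: action -> (agent_utility, human_utility).
payoffs = {"selfish": (5.0, 0.0), "fair": (4.0, 4.0)}

print(best_action(payoffs, weight=0.0))  # selfish
print(best_action(payoffs, weight=0.5))  # fair
```

With `weight=0.0` the objective reduces to the agent's own utility. This static example does not model how the human responds to the agent's behavior, which is the effect the abstract credits for the improvement in the agent's own utility.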

Original language: English
Pages: 2079-2086
Number of pages: 8
State: Published - 2022
Event: 44th Annual Meeting of the Cognitive Science Society: Cognitive Diversity, CogSci 2022 - Toronto, Canada
Duration: 27 Jul 2022 to 30 Jul 2022

Conference

Conference: 44th Annual Meeting of the Cognitive Science Society: Cognitive Diversity, CogSci 2022
Country/Territory: Canada
City: Toronto
Period: 27/07/22 to 30/07/22

Keywords

  • Human modeling
  • Human-agent
  • Reinforcement Learning
  • interaction
