Keywords

reinforcement learning, Q-learning, target sets

Abstract

Reinforcement learning agents that interact in a common environment frequently affect each other’s perceived transition and reward distributions. This can cause the agents to converge to a suboptimal equilibrium, or even to a solution that is not an equilibrium at all. Several modifications to the Q-learning algorithm have been proposed that enable agents to converge to optimal equilibria under specified conditions. This paper presents the concept of target sets as an aid to understanding why these modifications have been successful and as a tool for developing new modifications that are applicable in a wider range of situations.

Original Publication Citation

Nancy Fulda and Dan Ventura, "Target Sets: A Tool for Understanding and Predicting the Behavior of Interacting Q-learners", Proceedings of the International Joint Conference on Information Sciences, pp. 1549-1552, September 23.