Keywords
Q-learning, reinforcement learning, multiagent systems
Q-learning is a reinforcement learning algorithm that learns expected utilities for state-action transitions through successive interactions with the environment. Its simplicity and convergence properties have made it a popular subject of study. However, its non-parametric (tabular) representation of utilities limits its effectiveness in environments with large amounts of perceptual input. For example, in multiagent systems, each agent may need to consider the action selections of its counterparts in order to learn effective behaviors. This creates a joint action space that grows exponentially with the number of agents in the system, and in such situations the Q-learning algorithm quickly becomes intractable. This paper presents a new algorithm, Dynamic Joint Action Perception, which addresses this problem by allowing each agent to dynamically perceive only those joint action distinctions that are relevant to its own payoffs. The result is a smaller joint action space and improved scalability of Q-learning to systems with many agents.
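The scalability problem the abstract describes can be made concrete with a small sketch. The following is illustrative only, not the paper's implementation: a tabular Q-learner whose table is keyed on the *joint* action of all agents, so the number of action entries grows as |A|^n in the number of agents n. The action names, learning rate, and discount factor are assumed values chosen for the example.

```python
# Illustrative sketch (assumed values, not from the paper): tabular Q-learning
# over joint actions, showing why the table grows exponentially with agents.
from itertools import product

ACTIONS = ["a0", "a1", "a2"]   # hypothetical per-agent action set
ALPHA, GAMMA = 0.1, 0.9        # assumed learning rate and discount factor

def joint_action_space(n_agents):
    """All joint actions for n_agents; size is len(ACTIONS) ** n_agents."""
    return list(product(ACTIONS, repeat=n_agents))

def q_update(q, state, joint_action, reward, next_state, next_joint_actions):
    """Standard tabular Q-learning update, keyed on (state, joint_action)."""
    old = q.get((state, joint_action), 0.0)
    best_next = max(q.get((next_state, ja), 0.0) for ja in next_joint_actions)
    q[(state, joint_action)] = old + ALPHA * (reward + GAMMA * best_next - old)

# With 3 actions per agent, 2 agents already yield 9 joint actions and
# 5 agents yield 243 -- the exponential blow-up the paper targets by
# perceiving only payoff-relevant joint action distinctions.
```

Here each agent must maintain Q-values for every combination of its counterparts' actions; Dynamic Joint Action Perception shrinks this table by merging joint actions that are indistinguishable with respect to the agent's own payoffs.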
Original Publication Citation
Nancy Fulda and Dan Ventura, "Dynamic Joint Action Perception for Q-Learning Agents", Proceedings of the International Conference on Machine Learning and Applications, pp. 73-78, June 2003.
BYU ScholarsArchive Citation
Fulda, Nancy and Ventura, Dan A., "Dynamic Joint Action Perception for Q-Learning Agents" (2003). All Faculty Publications. 493.
Physical and Mathematical Sciences
Copyright Use Information
© 2003 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.