The doctoral dissertations of the former Helsinki University of Technology (TKK) and Aalto University Schools of Technology (CHEM, ELEC, ENG, SCI) published in electronic format are available in the electronic publications archive of Aalto University - Aaltodoc.

Multiagent Reinforcement Learning in Markov Games: Asymmetric and Symmetric Approaches

Ville Könönen

Dissertation for the degree of Doctor of Science in Technology to be presented with due permission of the Department of Computer Science and Engineering for public examination and debate in Auditorium T1 at Helsinki University of Technology (Espoo, Finland) on the 3rd of December, 2004, at 12 o'clock noon.

Overview in PDF format (ISBN 951-22-7359-4) [5749 KB]
Dissertation is also available in print (ISBN 951-22-7358-6)

Abstract

Modern computing systems are distributed, large, and heterogeneous. Computers, other information processing devices, and humans are tightly interconnected, so it is preferable to treat these entities as agents rather than as stand-alone systems. One of the goals of artificial intelligence is to understand interactions between entities, whether artificial or natural, and to suggest how to make good decisions while taking other decision makers into account. In this thesis, these interactions between intelligent and rational agents are modeled with Markov games, and the emphasis is on adaptation and learning in multiagent systems.

Markov games are a general mathematical tool for modeling interactions between multiple agents. The model is very general: for example, common board games are special instances of Markov games. It is also particularly interesting because it lies at the intersection of two distinct research disciplines, machine learning and game theory. On the one hand, Markov games extend Markov decision processes, a well-known tool for modeling single-agent problems, to multiagent domains. On the other hand, Markov games can be seen as a dynamic extension of strategic-form games, which are standard models in traditional game theory. From the computer science perspective, Markov games provide a flexible and efficient way to describe different social interactions between intelligent agents.
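For orientation, the two views above can be read off the standard textbook definition of a Markov game. The notation below is a common convention and an assumption here; it is not necessarily the notation used in the thesis.

```latex
% An N-agent Markov (stochastic) game is a tuple
\[
  \Gamma = \bigl( S,\; A^1, \dots, A^N,\; p,\; r^1, \dots, r^N \bigr),
\]
% with state space S, action space A^i of agent i, and
\[
  p : S \times A^1 \times \dots \times A^N \to \Delta(S),
  \qquad
  r^i : S \times A^1 \times \dots \times A^N \to \mathbb{R}.
\]
% Agent i seeks a policy maximizing its expected discounted return
\[
  \mathbb{E}\Bigl[ \sum_{t=0}^{\infty} \gamma^{t}\, r^i(s_t, a^1_t, \dots, a^N_t) \Bigr],
  \qquad 0 \le \gamma < 1 .
\]
% N = 1 gives a Markov decision process;
% |S| = 1 gives a (repeated) strategic-form game.
```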

This thesis studies different aspects of learning in Markov games. From the machine learning perspective, the focus is on a very general learning model, namely reinforcement learning, in which the goal is to maximize the long-term performance of the learning agent. The thesis introduces an asymmetric learning model that is computationally efficient in multiagent systems and enables the construction of different agent hierarchies. In multiagent reinforcement learning systems based on Markov games, the space and computational requirements grow very quickly with the number of learning agents and the size of the problem instance. Therefore, in many real-world applications it is necessary to use function approximators, such as neural networks, to model agents. In this thesis, various numeric learning methods are proposed for multiagent learning problems.
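To make the asymmetric idea concrete, the following is a minimal sketch of a leader-follower (Stackelberg) solution for a single two-agent stage game with pure strategies. It assumes, based on the keywords listed for the thesis, that the asymmetric model corresponds to a Stackelberg equilibrium; the function name `stackelberg_pure` and the payoff matrices are hypothetical illustrations and are not taken from the thesis.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def stackelberg_pure(leader_payoff: Matrix, follower_payoff: Matrix) -> Tuple[int, int]:
    """Return (leader_action, follower_action) of a pure Stackelberg solution.

    Rows index the leader's actions, columns the follower's actions.
    Hypothetical helper for illustration only.
    """
    if not leader_payoff or not follower_payoff:
        raise ValueError("payoff matrices must be non-empty")

    best_pair = (0, 0)
    best_value = float("-inf")
    for a, follower_row in enumerate(follower_payoff):
        # The follower observes the leader's action a and best-responds
        # (ties are broken in favour of the lowest action index).
        b = max(range(len(follower_row)), key=lambda j: follower_row[j])
        # The leader evaluates a by anticipating that best response.
        value = leader_payoff[a][b]
        if value > best_value:
            best_value = value
            best_pair = (a, b)
    return best_pair


# Illustrative payoffs (assumed, not from the thesis).
leader = [[2.0, 4.0],
          [1.0, 3.0]]
follower = [[1.0, 0.0],
            [0.0, 2.0]]

solution = stackelberg_pure(leader, follower)
print(solution)
```

In this example, action 0 is dominant for the leader when both agents move simultaneously, yet committing to action 1 yields the leader a higher payoff once the follower's response is anticipated. Each candidate leader action requires only one best-response computation, which gives some intuition for why a leader-follower ordering can be cheaper to solve than a symmetric equilibrium.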

The proposed methods are tested on small but non-trivial example problems from different research areas, including artificial robot navigation, a simplified soccer game, and automated pricing models for intelligent agents. The thesis also contains an extensive literature survey on multiagent reinforcement learning and various methods based on Markov games. Additionally, game-theoretic methods and methods originating from computer science for multiagent learning and decision making are compared.

This thesis consists of an overview and the following five publications:

Könönen, V. J., 2004. Asymmetric multiagent reinforcement learning. Web Intelligence and Agent Systems: An International Journal (WIAS) 2, number 2, pages 105-121. © 2004 IOS Press. By permission.
Könönen, V. J., 2004. Gradient Descent for symmetric and asymmetric multiagent reinforcement learning. Helsinki University of Technology, publications in Computer and Information Science, Technical Report A78. © 2004 by author.
Könönen, V. J., 2004. Hybrid model for multiagent reinforcement learning. Proceedings of the International Joint Conference on Neural Networks (IJCNN-2004). Budapest, Hungary, 25-29 July 2004, pages 1793-1798. © 2004 IEEE. By permission.
Könönen, V. J., 2004. Policy gradient method for team Markov games. Proceedings of the Fifth International Conference on Intelligent Data Engineering and Automated Learning (IDEAL-2004). Exeter, UK, 25-27 August 2004. Heidelberg, Springer-Verlag. Lecture Notes in Computer Science 3177, pages 733-739. © 2004 Springer-Verlag. By permission.
Könönen, V. J. and Oja, E., 2004. Asymmetric multiagent reinforcement learning in pricing applications. Proceedings of the International Joint Conference on Neural Networks (IJCNN-2004). Budapest, Hungary, 25-29 July 2004, pages 1097-1102. © 2004 IEEE. By permission.

Keywords: Markov games, reinforcement learning, Nash equilibrium, Stackelberg equilibrium, value function approximation, policy gradient

This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.

Last update 2011-05-26
