A Theoretical Analysis of Cooperative Behavior in Multi-Agent Q-learning
February 2006
Research Paper
| Related Files |
|---|
| (ERS 2006 006 LIS.pdf, 0.3MB) |
A number of experimental studies have investigated whether cooperative behavior may emerge in multi-agent Q-learning. In some studies cooperative behavior did emerge, in others it did not. This report provides a theoretical analysis of this issue. The analysis focuses on multi-agent Q-learning in iterated prisoner’s dilemmas. It is shown that under certain assumptions cooperative behavior may emerge when multi-agent Q-learning is applied in an iterated prisoner’s dilemma. An important consequence of the analysis is that multi-agent Q-learning may result in non-Nash behavior. It is found experimentally that the theoretical results derived in this report are quite robust to violations of the underlying assumptions.
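To make the setting concrete, here is a minimal sketch of two independent Q-learners playing an iterated prisoner's dilemma with a Boltzmann (softmax) strategy. It is an illustration only and not the report's own model: the payoff values (5, 3, 1, 0), the choice of the previous joint action as the state, and the parameters `alpha`, `gamma`, and `temperature` are all assumptions made for this example.

```python
import math
import random

# Payoff to an agent: PAYOFF[own_action][other_action].
# Actions: 0 = cooperate, 1 = defect.
# The values T=5, R=3, P=1, S=0 are a conventional illustration (assumed).
PAYOFF = [[3.0, 0.0],
          [5.0, 1.0]]


def boltzmann_probs(q_values, temperature):
    """Return Boltzmann (softmax) action probabilities for a list of Q-values."""
    top = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(probs, rng):
    """Draw action 0 with probability probs[0], otherwise action 1."""
    return 0 if rng.random() < probs[0] else 1


def run(rounds=5000, alpha=0.1, gamma=0.9, temperature=1.0, seed=0):
    """Simulate two Q-learners; return their Q-tables and the fraction of
    rounds in which both agents cooperated."""
    rng = random.Random(seed)
    # q[agent][state][action]; the state is the previous joint action
    # (assumed), encoded as 2 * action_of_agent_0 + action_of_agent_1.
    q = [[[0.0, 0.0] for _ in range(4)] for _ in range(2)]
    state = 0  # arbitrary initial state: pretend both cooperated before
    mutual_cooperation = 0
    for _ in range(rounds):
        actions = [
            sample_action(boltzmann_probs(q[i][state], temperature), rng)
            for i in range(2)
        ]
        next_state = 2 * actions[0] + actions[1]
        for i in range(2):
            reward = PAYOFF[actions[i]][actions[1 - i]]
            # Standard one-step Q-learning update, applied independently
            # by each agent to its own table.
            target = reward + gamma * max(q[i][next_state])
            q[i][state][actions[i]] += alpha * (target - q[i][state][actions[i]])
        if actions == [0, 0]:
            mutual_cooperation += 1
        state = next_state
    return q, mutual_cooperation / rounds


if __name__ == "__main__":
    _, coop_rate = run()
    print(f"fraction of mutually cooperative rounds: {coop_rate:.3f}")
```

Varying `temperature`, `gamma`, and the seed in a sketch like this gives a feel for why experimental studies can disagree on whether cooperation emerges; the outcome of any single run should not be read as a result of the report.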
- Cooperation
- Multi-Agent Q-Learning
- Multi-Agent Reinforcement Learning
- Nash Equilibrium
- Prisoner’s Dilemma
- O32 : Management of Technological Innovation and R&D
- C51 : Model Construction and Estimation
- M : Business Administration and Business Economics; Marketing; Accounting
- L15 : Information and Product Quality; Standardization and Compatibility
- agent
- strategy
- q-learning
- behavior
- value
- multi-agent q-learning
- iterated prisoner
- prisoner
- assumption
- action
- multi-agent
- dilemma
- boltzmann strategy
- state
- analysis
- result
- payoff
- defect
- experiment
- probability