Reinforcement Learning for Non-stationary Discrete Time Linear-Quadratic Mean-Field Games in Multiple Populations


May 27, 2022 1:00 PM — 2:00 PM remotely via Zoom


CDM 222 and online


Muhammad Aneeq uz Zaman, University of Illinois at Urbana-Champaign, USA


Scalability of reinforcement learning algorithms to multi-agent systems is a significant bottleneck to their practical use. In this talk, we approach multi-agent reinforcement learning from a mean-field game perspective, where the number of agents tends to infinity. Our analysis focuses on the structured setting of systems with linear dynamics and quadratic costs, known as linear-quadratic mean-field games, evolving over a discrete-time infinite horizon. Agents are assumed to be partitioned into finitely many populations connected by a network of known structure. The functional forms of the agents’ costs and dynamics are assumed to be the same within each population, but to differ between populations. We first characterize the equilibrium of the mean-field game, which in turn prescribes an approximate Nash equilibrium for the finite-population game. Our main focus is on the design of a learning algorithm, based on zero-order stochastic optimization, for computing mean-field equilibria. The algorithm exploits the affine structure of both the equilibrium controller and the equilibrium mean-field trajectory by decomposing the learning task into two stages: first learning the linear terms, and then learning the affine terms. We present a convergence proof and a finite-sample bound quantifying the estimation error as a function of the number of samples.
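For readers unfamiliar with zero-order optimization, the following is a minimal sketch of the general idea, not the algorithm presented in the talk. It assumes a single agent with scalar, noise-free linear dynamics and no mean-field term, and learns only a linear feedback gain from cost evaluations. All function names, parameters, and numeric values (`lqr_cost`, `zero_order_gradient`, `learn_gain`, `a=1.2`, and so on) are hypothetical choices made for illustration.

```python
import random


def lqr_cost(k, a=1.2, b=1.0, q=1.0, r=1.0, x0=1.0, horizon=50):
    """Finite-horizon quadratic cost of the policy u = -k * x on x' = a*x + b*u."""
    x, total = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        total += q * x * x + r * u * u
        x = a * x + b * u
    return total


def zero_order_gradient(cost, k, radius, rng):
    """Two-point estimate of d(cost)/dk that uses only cost evaluations."""
    direction = rng.choice((-1.0, 1.0))  # random perturbation direction
    diff = cost(k + radius * direction) - cost(k - radius * direction)
    return diff / (2.0 * radius) * direction


def learn_gain(k0=1.0, step=0.01, radius=0.01, iterations=500, seed=0):
    """Gradient descent on the gain k, driven by zero-order gradient estimates."""
    rng = random.Random(seed)
    k = k0  # k0 = 1.0 is stabilizing here: the closed loop is a - b*k0 = 0.2
    for _ in range(iterations):
        k -= step * zero_order_gradient(lqr_cost, k, radius, rng)
    return k


if __name__ == "__main__":
    k = learn_gain()
    print(f"learned gain: {k:.4f}, cost: {lqr_cost(k):.4f}")
```

In one dimension the random sign cancels, so the estimate reduces to a central finite difference. With a matrix-valued gain, the perturbation direction would typically be sampled from a sphere, and the cost would be estimated from noisy rollouts, which is where the finite-sample analysis mentioned in the abstract becomes relevant.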


Muhammad Aneeq uz Zaman is a PhD student in the Mechanical Engineering department at the University of Illinois at Urbana-Champaign. He works under the supervision of Tamer Başar in the area of Multi-Agent Reinforcement Learning (MARL). Within MARL, he primarily works on Mean-Field Games (MFGs), a recent advance in the game theory literature. MFGs model the interaction between a generic agent and the average behavior of the other agents (the mean field). Computing certain structural properties of the mean field makes the MARL problem much easier to solve. His published work includes MARL over networks and multi-graphs, among others. His research interests include real-world MARL problems in industry, such as advertisement pricing, congestion control, and contagion over networks.
