{"id":862,"date":"2022-05-25T12:43:59","date_gmt":"2022-05-25T12:43:59","guid":{"rendered":"https:\/\/colloquium.cdm.depaul.edu\/?p=862"},"modified":"2022-05-25T12:43:59","modified_gmt":"2022-05-25T12:43:59","slug":"reinforcement-learning-for-non-stationary-discrete-time-linear-quadratic-mean-field-games-in-multiple-populations","status":"publish","type":"post","link":"https:\/\/colloquium.cdm.depaul.edu\/index.php\/2022\/05\/25\/reinforcement-learning-for-non-stationary-discrete-time-linear-quadratic-mean-field-games-in-multiple-populations\/","title":{"rendered":"Reinforcement Learning for Non-stationary Discrete Time Linear-Quadratic Mean-Field Games in Multiple Populations"},"content":{"rendered":"\n<p><strong>When:<\/strong><\/p>\n\n\n\n<p>May 27, 2022 1:00 PM &#8212; 2:00 PM remotely via Zoom<\/p>\n\n\n\n<p><strong>Where:<\/strong><\/p>\n\n\n\n<p>CDM 222 and online<\/p>\n\n\n\n<p><strong>Speaker:<\/strong><\/p>\n\n\n\n<p>Muhammad Aneeq uz Zaman, University of Illinois at Urbana-Champaign, USA<\/p>\n\n\n\n<p><strong>Abstract:<\/strong><\/p>\n\n\n\n<p>Scalability of reinforcement learning algorithms to multi-agent systems is a significant bottleneck to their practical use. In this talk, we approach multi-agent reinforcement learning from a mean-field game perspective, where the number of agents tend to infinity. Our analysis focuses on the structured setting of systems with linear dynamics and quadratic costs, named linear-quadratic mean-field games, evolving over a discrete-time infinite horizon where agents are assumed to be partitioned into finitely-many populations connected by a network of known structure. The functional forms of the agents&#8217; costs and dynamics are assumed to be the same within populations, but differ between populations. We first characterize the equilibrium of the mean-field game which further prescribes an approximate-Nash equilibrium for the finite population game. 
Our main focus is on the design of a learning algorithm, based on zero-order stochastic optimization, for computing mean-field equilibria. The algorithm exploits the affine structure of both the equilibrium controller and the equilibrium mean-field trajectory by decomposing the learning task into first learning the linear terms and then learning the affine terms. We present a convergence proof and a finite-sample bound quantifying the estimation error as a function of the number of samples.<\/p>\n\n\n\n<p><strong>Bio:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/colloquium.cdm.depaul.edu\/wp-content\/uploads\/2022\/05\/IMG_20211125_183406__01.jpg\" alt=\"\" class=\"wp-image-863\" width=\"233\" height=\"298\" srcset=\"https:\/\/colloquium.cdm.depaul.edu\/wp-content\/uploads\/2022\/05\/IMG_20211125_183406__01.jpg 584w, https:\/\/colloquium.cdm.depaul.edu\/wp-content\/uploads\/2022\/05\/IMG_20211125_183406__01-234x300.jpg 234w\" sizes=\"auto, (max-width: 233px) 85vw, 233px\" \/><\/figure>\n\n\n\n<p>Muhammad Aneeq uz Zaman is a PhD student in the Mechanical Engineering department at the University of Illinois Urbana-Champaign. He is working under the supervision of Tamer Ba\u015far in the area of Multi-Agent Reinforcement Learning (MARL). Within MARL, he primarily works on Mean-Field Games (MFGs), a recent advance in the game theory literature. MFGs consider the interaction between a generic agent and the average behavior of the other agents (the mean field). Computing certain structural properties of the mean field makes the MARL problem much easier to solve. His published work includes MARL over networks and multi-graphs, among other topics. 
His research interests include real-world MARL problems in industry, such as advertisement pricing, congestion control, and contagion over networks.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When: May 27, 2022 1:00 PM &#8212; 2:00 PM remotely via Zoom Where: CDM 222 and online Speaker: Muhammad Aneeq uz Zaman, University of Illinois at Urbana-Champaign, USA Abstract: Scalability of reinforcement learning algorithms to multi-agent systems is a significant bottleneck to their practical use. In this talk, we approach multi-agent reinforcement learning from a &hellip; <a href=\"https:\/\/colloquium.cdm.depaul.edu\/index.php\/2022\/05\/25\/reinforcement-learning-for-non-stationary-discrete-time-linear-quadratic-mean-field-games-in-multiple-populations\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Reinforcement Learning for Non-stationary Discrete Time Linear-Quadratic Mean-Field Games in Multiple Populations&#8221;<\/span><\/a><\/p>\n","protected":false},"author":10,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-862","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/colloquium.cdm.depaul.edu\/index.php\/wp-json\/wp\/v2\/posts\/862","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/colloquium.cdm.depaul.edu\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/colloquium.cdm.depaul.edu\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/colloquium.cdm.depaul.edu\/index.php\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/colloquium.cdm.depaul.edu\/index.php\/wp-json\/wp\/v2\/comments?post=862"}],"version-history":[{"count":1,"href":"https:\/\/colloquium.cdm.depaul.edu\/index.php\/wp-json\/wp\/v2\/posts\/862\/revisions"}],"pr
edecessor-version":[{"id":864,"href":"https:\/\/colloquium.cdm.depaul.edu\/index.php\/wp-json\/wp\/v2\/posts\/862\/revisions\/864"}],"wp:attachment":[{"href":"https:\/\/colloquium.cdm.depaul.edu\/index.php\/wp-json\/wp\/v2\/media?parent=862"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/colloquium.cdm.depaul.edu\/index.php\/wp-json\/wp\/v2\/categories?post=862"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/colloquium.cdm.depaul.edu\/index.php\/wp-json\/wp\/v2\/tags?post=862"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}