Reinforcement Learning for Non-stationary Discrete Time Linear-Quadratic Mean-Field Games in Multiple Populations

When:

May 27, 2022 1:00 PM — 2:00 PM remotely via Zoom

Where:

CDM 222 and online

Speaker:

Muhammad Aneeq uz Zaman, University of Illinois at Urbana-Champaign, USA

Abstract:

Scalability of reinforcement learning algorithms to multi-agent systems is a significant bottleneck to their practical use. In this talk, we approach multi-agent reinforcement learning from a mean-field game perspective, where the number of agents tends to infinity. Our analysis focuses on the structured setting of systems with linear dynamics and quadratic costs, named linear-quadratic mean-field games, evolving over a discrete-time infinite horizon, where agents are assumed to be partitioned into finitely many populations connected by a network of known structure. The functional forms of the agents’ costs and dynamics are assumed to be the same within populations, but differ between populations. We first characterize the equilibrium of the mean-field game, which further prescribes an approximate Nash equilibrium for the finite-population game. Our main focus is on the design of a learning algorithm, based on zero-order stochastic optimization, for computing mean-field equilibria. The algorithm exploits the affine structure of both the equilibrium controller and equilibrium mean-field trajectory by decomposing the learning task into first learning the linear terms, and then learning the affine terms. We present a convergence proof and a finite-sample bound quantifying the estimation error as a function of the number of samples.

Bio:

Muhammad Aneeq uz Zaman is a PhD student in the Mechanical Engineering department at the University of Illinois, Urbana-Champaign. He is working under the supervision of Tamer Başar in the area of Multi-Agent Reinforcement Learning (MARL). In MARL he primarily works on Mean-Field Games (MFGs), which are a recent advance in the game theory literature. MFGs consider the interaction between a generic agent and the average behavior of the other agents (the mean field). Computing certain structural properties of the mean field makes the MARL problem much easier to solve. His published work includes MARL over networks and multi-graphs, among others. His research interests include real-world MARL problems in industry, such as advertisement pricing, congestion control, and contagion over networks.

Learning Better Ways to Measure and Move: Joint Optimization of an Agent’s Physical Design and Computational Reasoning

When:

May 20, 2022 1:00 PM — 2:00 PM remotely via Zoom

Where:

CDM 222 and online

Speaker:

Matthew R. Walter, Toyota Technological Institute at Chicago, USA

Abstract:

The recent surge of progress in machine learning foreshadows the advent of sophisticated intelligent devices and agents capable of rich interactions with the physical world. Many of these advances focus on building better computational methods for inference and control—computational reasoning methods trained to discover and exploit the statistical structure and relationships in their problem domain. However, the design of physical interfaces through which a machine senses and acts in its environment is as critical to its success as the efficacy of its computational reasoning. Perception problems become easier when sensors provide measurements that are more informative towards the quantities to be inferred. Control policies become more effective when an agent’s physical design permits greater robustness and dexterity in its actions. Thus, the problems of physical design and computational reasoning are coupled, and the answer to what combination is optimal naturally depends on the environment the machine operates in and the task before it.

I will present learning-based methods that perform automated, data-driven optimization over sensor measurement strategies and physical configurations jointly with computational inference and control. I will first describe a framework that reasons over the configuration of sensor networks in conjunction with the corresponding algorithm that infers spatial phenomena from noisy sensor readings. Key to the framework is encoding sensor network design as a differentiable neural layer that interfaces with a neural network for inference, allowing for joint optimization using standard techniques for training neural networks. Next, I will present a method that draws on the success of data-driven approaches to continuous control to jointly optimize the physical structure of legged robots and the control policy that enables them to locomote. The method maintains a distribution over designs and uses reinforcement learning to optimize a shared control policy to maximize the expected reward over the design distribution. I will then describe recent work that extends this approach to the coupled design and control of physically realizable soft robots. If time permits, I will conclude with a discussion of ongoing work that seeks to improve test-time generalization of the learned policies.

Bio:

Matthew R. Walter is an assistant professor at the Toyota Technological Institute at Chicago. His interests revolve around the realization of intelligent, perceptually aware robots that are able to act robustly and effectively in unstructured environments, particularly with and alongside people. His research focuses on machine learning-based solutions that allow robots to learn to understand and interact with the people, places, and objects in their surroundings. Matthew has investigated these areas in the context of various robotic platforms, including autonomous underwater vehicles, self-driving cars, voice-commandable wheelchairs, mobile manipulators, and autonomous cars for (rubber) ducks. Matthew obtained his Ph.D. from the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, where his thesis focused on improving the efficiency of inference for simultaneous localization and mapping.

Muscle synergy analysis during exoskeleton assisted walking

When:

May 13, 2022 1:00 PM — 2:00 PM remote via Zoom

Where:

CDM 222 and online

Speaker:

Taimoor Afzal, Worcester Polytechnic Institute, USA

Abstract:

According to a recent UN report, about 1 in 6 people suffer from neurological disorders. That’s nearly 1 billion of the world’s population. Persons with neurological disorders lose the ability to control their arms and legs in a meaningful way. In some cases, the disorders lead to complete paralysis. This makes human movement control an extremely important area of research. It is hypothesized that the central nervous system activates a group of muscles to contribute to a particular movement thus reducing the dimensionality of muscle control. This concept is known as muscle synergies. However, this control is disrupted after an injury to the brain or spinal cord. The study of muscle synergies has great implications when studying movement control during disease states, such as stroke, multiple sclerosis, and spinal cord injury. In this talk, I will highlight the use of a mathematical model to extract muscle synergies from the electrical activity recorded from different muscles during walking. We will then examine how this model can be used to study movement patterns during walking after an injury. The application of this model during walking with and without an exoskeleton (robots that assist people to walk) will be demonstrated.

Bio:

Dr. Taimoor Afzal is an Assistant Teaching Professor in the Department of Biomedical Engineering at Worcester Polytechnic Institute. His research interests lie in human movement control, machine learning, and neural engineering. During his PhD, his research focused on developing a machine learning algorithm based on the notion of muscle synergies for classification of different walking modes. Later in his career, he worked at the University of Texas Health Science Center at Houston, where he examined the feasibility of exoskeletons for assisted walking in patients with neurological disorders. Most recently, he was working as a postdoctoral researcher at Northwestern University, where he examined the mechanisms of muscle weakness in stroke and the bilateral effects of stroke on motoneuron excitability.

Towards Scalable and Specialized Application Error Analysis

When:

May 6, 2022 1:00 PM — 2:00 PM remote via Zoom

Where:

CDM 222 and online

Speaker:

Abdulrahman Mahmoud, Harvard University, USA

Abstract:

We find ourselves at an exciting crossroads in processor design, where the slowing down of Moore’s Law has led to the rise of specialized architectures and accelerators. At the same time, however, the tiny transistors available to us are increasingly susceptible to errors in the field, due to various phenomena such as high-energy particle strikes. Traditional reliability solutions aimed at identifying and mitigating such errors can be unnecessarily expensive. Is it possible to have the best of both worlds, where we can attain high error coverage while maintaining low overhead?

In this talk, I will address this challenge in the context of deep neural networks (DNNs), due to their prevalence in many safety-critical tasks such as self-driving cars. By understanding the effect of errors on the outcome of DNNs, we can leverage domain-specific insights to develop low-overhead reliability techniques and avoid the heavy hammer of traditional methods. I will describe two selective protection techniques for DNNs that operate at different granularities, and also show how the combination can be better than the sum of its parts. I will conclude with a discussion of future research avenues that extend the concept of hardware error resiliency to general perturbations and computing anomalies.

Bio:

Abdulrahman is a postdoctoral researcher in computer science at Harvard University, working with Dr. David Brooks and Dr. Gu-Yeon Wei. His research interests are broadly in the areas of computer architecture, machine learning, reliability, and approximate computing. His work focuses on addressing the role hardware errors play in an application’s error tolerance, by designing tools and techniques to help understand how hardware errors propagate and affect software. Abdulrahman completed his PhD at UIUC under the guidance of Dr. Sarita Adve in the RSim Research Group. During his graduate studies, he was very fortunate to be the recipient of the Mavis Future Faculty Fellowship, to be invited to the 7th Heidelberg Laureate Forum, and to receive multiple awards for teaching and mentoring undergraduate students. Prior to joining UIUC, Abdulrahman completed his BSE at Princeton University, where he was the recipient of the John Ogden Bigelow Jr. Prize in Electrical Engineering.

IoT and the Curse of Massive Wireless Connectivity: A Systems Outlook

When:

April 29, 2022 1:00 PM — 2:00 PM remote via Zoom

Where:

CDM 222

Speaker: Junaid Farooq, University of Michigan-Dearborn, USA

Abstract:

The Internet of Things (IoT) relies heavily on wireless communication-enabled devices that can discover and interact with other wireless devices in their vicinity. The communication flexibility coupled with software vulnerabilities in devices, due to low cost and short time-to-market, exposes them to a high risk of malware infiltration. An attacker might stealthily gain control over a large number of network devices using device-to-device (D2D) communication in order to launch a coordinated cyber-physical attack resulting in disruption of critical infrastructure facilities, or for malicious purposes such as collecting ransom. In this talk, I will describe an analytical approach to study the D2D propagation of malware in wireless IoT networks. Leveraging tools from dynamic population processes and point process theory, the malware infiltration and coordination process can be studied for a network topology. The analysis of mean-field equilibrium in the population is used for constructing and solving a network defense problem to prevent botnet formation by patching devices while causing minimum overhead to network operation. The proposed methodology serves as a basis for assisting the planning, design, and defense of such networks from a defender’s standpoint.

Bio:

Junaid Farooq is an Assistant Professor of ECE at the University of Michigan-Dearborn. His research interests are broadly in the modeling, analysis, and optimization of wireless communication systems, cyber-physical systems, and the Internet of Things (IoT). He received his Ph.D. in electrical engineering from NYU Tandon School of Engineering in Brooklyn. Prior to that, he obtained the M.S. and B.S. degrees in electrical engineering from the King Abdullah University of Science and Technology (KAUST), Saudi Arabia, and the National University of Sciences and Technology (NUST), Pakistan, respectively. He has also worked as a researcher at the Qatar Mobility Innovations Center (QMIC) in Doha, Qatar. During his time at NYU, he was awarded the Athanasios Papoulis Award and the Dante Youla Award for excellence in teaching and research, respectively. He was also the recipient of the NYU university-wide Outstanding Dissertation Award in Technology and Applied Science in 2021.

How to use Stochastic Differential Equations for Deep Network based data analysis?

When:

April 22, 2022 1:00 PM — 2:00 PM in-person

Where:

CDM 222

Speaker: Sathya Ravi, University of Illinois at Chicago, USA

Abstract:

Starting with the work of Einstein and Smoluchowski on Brownian motion, the mathematical theory of Stochastic Differential Equations (SDEs) was built by Ito and Stratonovich to describe the fluctuating motion of microscopic particles such as atoms and gas molecules. Today, as data-centric decisions permeate our everyday lives, it may not be a stretch to imagine that mathematical results in the SDE literature might be useful for computational purposes. In this talk, I will discuss my recent work on ways of integrating SDEs into deep network training for generalization, personalization, and computational benefits. We will focus on two instances of utilizing SDEs in the context of training feedforward architectures: (i) efficient feature augmentation; (ii) low-memory personalization. In the first part of the talk, I will introduce the concept of generators of an SDE, and then describe in detail how generators can be used as a generic add-on to any layer in the neural network to adjust the predictions based on other samples in the minibatch. I will discuss various applications in small-sample settings such as few-shot learning, point cloud processing, and variational segmentation. Interestingly, all this can be accomplished without solving a single SDE. In the second part of the talk, I will switch focus to generative tasks such as image generation. Using the Information Bottleneck principle, I will explain a novel correspondence that we have identified between noise level (time) and the layers of the encoders of a deep generative model. We show that this correspondence can then be used to expand the range of a generator for downstream tasks, using intermediate iterates of the SDE, which can be obtained efficiently. I will discuss experiments on an image generation task, and the advantages of using range-expanded generators on a downstream denoising task.

Bio:

Dr. Ravi holds a Bachelor’s degree in Computer Engineering from NIT, Trichy, India, and an M.S. in Industrial and Systems Engineering, an MA in Mathematics, and a PhD in Computer Science, all from the University of Wisconsin, Madison. Dr. Ravi is interested in Numerical Optimization of Deep Learning systems and in using Deep Learning to solve vision problems efficiently. For more information about Dr. Ravi’s research work please visit his research lab.

Formal Synthesis of Safety Controllers for Unknown Stochastic Control Systems

April 8, 2022 1:00 PM – 2:00 PM via Zoom

Speaker: Rameez Wajid, Computer Science Department, University of Colorado, Boulder.

Abstract:

Formal synthesis of controllers for stochastic control systems with unknown models is a challenging problem. We focus on safety controller synthesis for nonlinear stochastic control systems. The approach consists of a learning step followed by a controller synthesis scheme using control barrier functions. In the learning phase, we employ Gaussian processes (GPs) to learn models of unknown stochastic control systems in the presence of both process and measurement noise. In the controller synthesis phase, we compute control barrier functions together with their corresponding controllers based on the learned GP and quantify lower bounds on the probabilities of safety satisfaction for the original unknown systems equipped with the synthesized controllers. Finally, the effectiveness of the proposed approach is illustrated on a room temperature control example and a vehicle lane-keeping example.

Bio:

Rameez Wajid is a PhD student in the Computer Science Department at the University of Colorado Boulder. He is an Avionics Engineer by training and also has an MS degree in Control Systems. He has worked in both industry and academia. During his industry stint, he helped develop low-cost Flight Simulation and In-flight Situational Awareness Systems. He also taught undergraduate avionics engineering classes for over three years at the National University of Sciences and Technology (NUST). His current research interests lie at the intersection of control theory, formal methods, and machine learning.

ILLIXR: An Open Testbed to Enable Extended Reality Systems Research

April 1, 2022 1:00 PM – 2:00 PM via Zoom

Speaker: Muhammad Huzaifa, University of Illinois at Urbana-Champaign, USA

Abstract:

Extended reality (XR), including augmented, virtual, and mixed reality (AR/VR/MR), has the potential to transform our lives, but there is an orders-of-magnitude performance-power-quality gap between what is achievable today and our ideal XR systems. To enable research and development in this area, we have built ILLIXR (Illinois Extended Reality testbed), the first open source XR system and testbed for XR systems research and development. Using ILLIXR, we have provided the first published results on performance/power/quality of an end-to-end XR system. These results show that systems of the next decade require, and ILLIXR enables, application-driven, end-to-end quality-of-experience driven, and hardware-software-application co-designed systems research. We have also launched the ILLIXR consortium, an industry-backed consortium to democratize XR systems research, development, and benchmarking by creating a reference XR testbed based on ILLIXR, a benchmarking methodology, and a multidisciplinary XR systems research community. In this talk, we will describe ILLIXR, results from ILLIXR, the many research projects that ILLIXR is enabling, and the ILLIXR consortium.

Bio:

Muhammad Huzaifa is a PhD candidate in Computer Science at the University of Illinois at Urbana-Champaign, where he works with Professor Sarita Adve on systems and architectures for Extended Reality. Together with his advisor, he leads the ILLIXR project and co-chairs the ILLIXR consortium. He is a recipient of an IEEE Micro Top Picks, an IISWC Best Paper Award, the Sohaib and Sara Abbasi Computer Science Fellowship, a Feng Chen Memorial Award, and an Outstanding Teaching Assistant award. He has a Bachelor of Engineering in Computer Engineering from McGill University.