Researchers: Johan Grönqvist, Christian Rosdahl, Olle Kjellqvist, Frida Heskebeck, Carolina Bergeling, Bo Bernhardsson, Anders Rantzer
There are many important applications where classical physics-based models need to be combined with machine learning tools. A good example is autonomous driving, where the automotive industry has extensive experience of control technology such as ABS braking, cruise control and ESP systems for vehicle stabilization. This technology now needs to be combined with machine learning methods that analyze traffic situations and human behavior. To do this in a safe and robust manner, it is essential to understand how learning algorithms for discrete sequential decision-making can interact with continuous physics-based dynamics. Similar needs arise in other sectors. In the energy sector, well-established control solutions for power networks and generators are increasingly being combined with learning algorithms for consumer behavior and decision-making, to minimize costs and optimize efficiency. In medicine, standard practice for disease therapies is combined with expert systems and sequential decision-making for medical diagnosis.
In our collaboration project with Alexandre Proutiere at KTH, the aim is to bridge the gap between machine learning and control engineering. These research fields have traditionally evolved more or less separately, but in recent years their intersections have been growing, both in applications and in theoretical challenges. This project is concerned with sequential decision-making in systems whose dynamics are initially unknown, i.e., with adaptive control or reinforcement learning. Statistical models are of fundamental importance in both areas, but while learning theory has focused on sample complexity and regret, the corresponding control literature has focused on stability robustness and asymptotic performance. An important focus of our project is the tradeoff between exploration and exploitation, sometimes known as "dual control". The optimal tradeoff strategy can be formulated as the solution to a dynamic programming problem. We study properties of this solution as well as computational schemes for obtaining it. Optimal strategies are compared with common heuristics, both in control and in reinforcement learning.
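To make the dynamic programming formulation concrete, here is a minimal sketch on a toy problem that is not taken from the project itself. It assumes a decision-maker who, at each of a fixed number of steps, chooses between a "known" action with a fixed expected reward and an "unknown" action that pays 1 with an unknown probability, modeled with a uniform Beta(1, 1) prior. The function names `optimal_value` and `greedy_value`, and the toy problem, are hypothetical choices made for illustration. The first function solves the Bellman recursion over the posterior counts; the second evaluates a certainty-equivalence heuristic that always picks the action with the highest current estimate and therefore never explores deliberately.

```python
from functools import lru_cache


def optimal_value(horizon: int, known_reward: float) -> float:
    """Expected total reward of the Bayes-optimal policy (toy dual-control problem)."""

    @lru_cache(maxsize=None)
    def value(t: int, a: int, b: int) -> float:
        # (a, b) are the Beta posterior parameters of the unknown action.
        if t == horizon:
            return 0.0
        p = a / (a + b)  # posterior mean success probability
        # Try the unknown action: collect reward and update the posterior.
        explore = p * (1.0 + value(t + 1, a + 1, b)) + (1.0 - p) * value(t + 1, a, b + 1)
        # Take the known action: collect reward, learn nothing.
        exploit = known_reward + value(t + 1, a, b)
        return max(explore, exploit)

    return value(0, 1, 1)  # uniform Beta(1, 1) prior


def greedy_value(horizon: int, known_reward: float) -> float:
    """Expected total reward of the certainty-equivalence (myopic) heuristic."""

    @lru_cache(maxsize=None)
    def value(t: int, a: int, b: int) -> float:
        if t == horizon:
            return 0.0
        p = a / (a + b)
        if p >= known_reward:
            # Unknown action looks at least as good right now, so use it.
            return p * (1.0 + value(t + 1, a + 1, b)) + (1.0 - p) * value(t + 1, a, b + 1)
        return known_reward + value(t + 1, a, b)

    return value(0, 1, 1)


if __name__ == "__main__":
    horizon, known_reward = 20, 0.6
    print("optimal:", optimal_value(horizon, known_reward))
    print("greedy: ", greedy_value(horizon, known_reward))
```

In this toy setting the heuristic never tries the unknown action, because its prior mean of 0.5 is below 0.6, whereas the dynamic programming solution accepts a lower immediate reward in order to gain information that pays off over the remaining steps.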
Funding: European Research Council and Wallenberg Autonomous Systems and Software Program