Markov Decision Processes as Distribution Transformers: Certified Policy Verification and Synthesis

Speaker:
S. Akshay (IIT Bombay)
Organiser:
Shibashis Guha
Date:
Tuesday, 3 Sep 2024, 16:00 to 17:00
Venue:
A-201 (STCS Seminar Room)
Abstract

Markov decision processes (MDPs) can be viewed as transformers of probability distributions, giving rise to a sequence of distributions over the states of the MDP. This view is useful in many applications, e.g., modeling robot swarms or chemical reaction networks, where the behavior is naturally described by probability distributions over states. Somewhat surprisingly, in this setting, basic reachability and safety problems turn out to be computationally intractable. The issue is further complicated by the question of how much memory a policy is allowed: even for simple examples, policies for safety objectives over distributions can require infinite memory and randomization.
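To make the transformer view concrete (standard MDP notation, added here for illustration; the symbols are not from the abstract): under a memoryless randomized policy \pi, the induced map on distributions is

    \mu_{t+1}(s') = \sum_{s \in S} \mu_t(s) \sum_{a \in Act} \pi(a \mid s) \, P(s, a, s'),

and a distributional safety objective then asks, for example, that \mu_t(B) \le \lambda for a designated set B of bad states and all t \ge 0. (Memorylessness is a simplifying assumption of this illustration; as noted above, safety over distributions can in general require infinite-memory, randomized policies.)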

In light of this, we ask: what can one do to tackle these problems, in theory and in practice? After surveying some theoretical insights, we adopt an over-approximation route to approach these questions. Inspired by the success of invariant synthesis in program verification, we develop a framework for inductive template-based synthesis of certificates and policies for safety and reach-avoidance objectives in MDPs. We demonstrate the effectiveness of our approach and discuss its limitations and future perspectives.
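To illustrate the shape of such certificates (a minimal sketch, not the authors' tool; the MDP, policy, invariant template and thresholds below are all hypothetical), the following Python snippet checks that a candidate affine invariant over distributions is inductive for a fixed memoryless policy, using one linear program per template row:

    # Checking a candidate affine inductive safety certificate for a
    # fixed memoryless policy. MDP, policy and thresholds are made up.
    import numpy as np
    from scipy.optimize import linprog

    # Two-state MDP; P[act][s, s'] is the probability of moving s -> s'.
    P = {
        "a": np.array([[0.9, 0.1],
                       [0.9, 0.1]]),
        "b": np.array([[0.6, 0.4],
                       [0.8, 0.2]]),
    }

    # Memoryless randomized policy: pi[act][s] = probability of act in s.
    pi = {"a": np.array([0.5, 1.0]), "b": np.array([0.5, 0.0])}

    # Induced distribution transformer: mu_{t+1} = mu_t @ P_pi.
    P_pi = sum(np.diag(pi[act]) @ P[act] for act in P)

    mu0 = np.array([1.0, 0.0])  # initial distribution
    # Safety objective: at all times, at most 0.4 mass on state 1.
    # Candidate invariant H = { mu in the simplex : A @ mu <= b }.
    A = np.array([[0.0, 1.0]])
    b = np.array([0.4])

    def max_over_H(c):
        # Maximize c . mu over H via an LP (simplex + template rows).
        res = linprog(-c, A_ub=A, b_ub=b,
                      A_eq=np.ones((1, 2)), b_eq=[1.0],
                      bounds=[(0, 1)] * 2)
        return -res.fun

    # Initiation: the initial distribution lies in H.
    assert np.all(A @ mu0 <= b + 1e-9)
    # Consecution: H is closed under the transformer; for each row a_i
    # of A, max over mu in H of a_i . (mu @ P_pi) stays below b_i.
    assert all(max_over_H(P_pi @ A[i]) <= b[i] + 1e-9
               for i in range(len(b)))
    # Safety: H excludes unsafe distributions (by construction of A, b).
    print("candidate affine certificate verified for this policy")

The synthesis problem addressed in the papers is harder: it searches for the template coefficients (and, for reach-avoidance, the policy) themselves rather than checking a fixed candidate. This sketch only illustrates the initiation, consecution and safety conditions that an inductive certificate must satisfy.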

[Based on joint work with Krishnendu Chatterjee, Tobias Meggendorfer and Djordje Zikelic at CAV'23 and IJCAI'24]

Short Bio:

S. Akshay is a Professor in the Department of Computer Science and Engineering at IIT Bombay. His research interests span formal methods and AI, with a focus on automata theory, quantitative (timed/probabilistic) verification and automated synthesis. He has given multiple tutorials on his work at venues including Highlights 2022, AAAI 2022 and IJCAI 2022, and has helped organize workshops on Automata, Concurrency and Timed Systems in India and abroad.