Window Mean Payoff in Turn-Based Stochastic Games

Speaker:
Organiser:
Shibashis Guha
Date:
Thursday, 26 Feb 2026, 17:45 to 18:45
Venue:
A-201 (STCS Seminar Room)
Category:
Abstract

We look at turn-based stochastic games, which are two-player zero-sum games played on directed graphs in which each edge has a payoff associated with it and each vertex is either controlled by Player 1 or by Player 2, or is a probabilistic vertex. The game begins by placing a token on an initial vertex. Then, whenever the token is on a player-controlled vertex, that player chooses an out-edge and whenever the token is on a probabilistic vertex, an out-edge is chosen according to the underlying distribution, and in both cases, the token moves along the chosen edge. The play goes on in this manner forever, resulting in an infinite path in the graph, which in turn gives an infinite sequence of payoffs for Player 1. Fixing strategies of both players gives a distribution over plays in the graph. A well-studied objective in stochastic games is the mean-payoff objective, which requires that the average payoff per turn in the limit of the play be non-negative. In this thesis, we study a finitary version of the mean-payoff objective called the window mean-payoff objective. The window mean-payoff objective strengthens the mean-payoff objective by requiring that the average payoff become non-negative in every sliding window (of bounded length) of the play. In particular, we see algorithms for the following decision problems:
- Satisfaction: Does Player 1 have a strategy to ensure, with probability at least p, that the window mean-payoff value of the play is non-negative?
- Expectation: Does Player 1 have a strategy to ensure that the window mean-payoff value of the play is non-negative in expectation?
- Optimizing expectation with guarantees: Does Player 1 have a strategy to simultaneously ensure that the window mean-payoff value of the play is non-negative surely (or almost-surely or limit-surely) and is greater than a given threshold in expectation?
- Sure-almost-sure satisfaction (in Markov Decision Processes (MDPs)): Does Player 1 have a strategy to simultaneously ensure that the window mean-payoff value of the play is non-negative surely and is greater than a given threshold almost-surely? (MDPs are a special case of stochastic games with no Player 2 controlled vertices.)
Moreover, whenever a winning strategy exists for a player, we show how to construct the strategy and also give bounds on the amount of memory required to define the strategy.