BEGIN:VCALENDAR
PRODID:-//eluceo/ical//2.0/EN
VERSION:2.0
CALSCALE:GREGORIAN
BEGIN:VEVENT
UID:www.tcs.tifr.res.in/event/1461
DTSTAMP:20250715T083635Z
SUMMARY:Regret minimization in stochastic multi-armed bandits
DESCRIPTION:Speaker: Agniv Bandyopadhyay (TIFR)\n\nAbstract: \nIn the stoch
 astic K-armed bandit framework\, we are given K unknown distributions or a
 rms. At a given time\, we can select one arm and can observe one sample fr
 om that arm. Our goal is to maximize the reward over a finite time horizon
 \, which is also equivalent to minimizing the regret. Regret minimization 
 is an important aspect of many applications where we have to make sequenti
 al decisions under uncertainty and optimize for some objective\, for examp
 le\, clinical trials\, recommendation systems\, selecting the best portf
 olio in a financial market\, etc. We will analyze some basic regret minimi
 zing algorithms\, such as explore-then-commit\, successive reject\, upper
  confidence bound\, etc. We will also derive an information-theoretic lowe
 r bound on regret.\n\nThe talk will be based on the following referenc
 es:\n1. Lattimore\, Tor\, and Csaba Szepesvári. Bandit algorithms. Camb
 ridge University Press\, 2020\, Chapters 6-7.\n2. Kaufmann\, Emilie. Contr
 ibutions to the Optimal Solution of Several Bandit Problems. Diss. Univers
 ité de Lille\, 2020\, Chapter 1.\n
URL:https://www.tcs.tifr.res.in/web/events/1461
DTSTART;TZID=Asia/Kolkata:20240809T160000
DTEND;TZID=Asia/Kolkata:20240809T170000
LOCATION:A-201 (STCS Seminar Room)
END:VEVENT
END:VCALENDAR
