Risk-Aware Multi-armed Bandits

Tutorial at SPCOM, 2022

Tutorial Description

This tutorial introduces and surveys research results on risk-aware bandits, and outlines some promising avenues for future research within the risk-aware bandits framework. We consider both regret-minimization and best-arm-identification formulations, in which the traditional expected-value performance measure is replaced with a risk measure. Well-known examples of the risk measures considered include variance (or higher moments), quantiles or value-at-risk (VaR), conditional value-at-risk (CVaR), utility-based shortfall risk (UBSR), and measures based on cumulative prospect theory (CPT).
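
To make two of these risk measures concrete, here is a minimal sketch of how VaR and CVaR might be estimated from i.i.d. samples of a single arm. It is an illustration under stated assumptions, not the estimators used in the tutorial or the survey: samples are treated as losses (larger is worse), VaR is taken as an order statistic of the sample, and CVaR uses the standard Rockafellar-Uryasev form evaluated at the empirical VaR. The function names `empirical_var` and `empirical_cvar` are hypothetical.

```python
import math


def empirical_var(losses, alpha):
    """Empirical VaR at level alpha: the ceil(alpha * n)-th smallest loss.

    Assumption: samples are losses, so larger values are worse.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(losses)
    if not ordered:
        raise ValueError("need at least one sample")
    rank = math.ceil(alpha * len(ordered))  # 1-based rank of the order statistic
    return ordered[rank - 1]


def empirical_cvar(losses, alpha):
    """Empirical CVaR at level alpha (Rockafellar-Uryasev form).

    Computes VaR + (1 / (n * (1 - alpha))) * sum of excesses over VaR.
    """
    samples = list(losses)
    var = empirical_var(samples, alpha)
    excess = sum(max(x - var, 0.0) for x in samples)
    return var + excess / (len(samples) * (1.0 - alpha))


if __name__ == "__main__":
    # Hypothetical loss samples from one arm, for illustration only.
    samples = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    print("VaR :", empirical_var(samples, 0.5))
    print("CVaR:", empirical_cvar(samples, 0.5))
```

In a risk-aware bandit, an estimate of this kind would take the place of the sample mean when arms are compared. The confidence bounds needed for regret minimization or best-arm identification depend on the risk measure and on tail assumptions, and are not shown here.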

Tutorial Outline

  • Tutorial overview

  • Review of multi-armed bandits

  • Risk measures

  • Risk estimation

  • Risk-aware bandits for regret minimization

  • Risk-aware bandits for best-arm identification

Slides

Click here

Survey article

Vincent Y. F. Tan, Prashanth L.A., and Krishna Jagannathan, A Survey of Risk-Aware Multi-Armed Bandits, International Joint Conference on Artificial Intelligence (IJCAI) (Survey Track), 2022. [arxiv]

Presenters

Krishna Jagannathan and Prashanth L.A.

Target Audience

This tutorial should be of interest to graduate students, researchers and practitioners alike, as it presents both the theory and the practical implementation of risk-aware bandit algorithms.