Finite horizon MDPs

Stochastic shortest path problems

Discounted MDPs

RL foundations

Full state algorithms

Markov chains (review)

RL with function approximation

Policy gradient algorithms

Note: The bandits notes are highly incomplete.

Probability review

Singular value decomposition, principal component analysis

Bayes classifier, maximum likelihood estimation

Linear models

PAC learning

Crash course on optimization

Logistic regression

Support vector machines, kernel methods

Online learning

Mixture models

Poisson processes

DTMCs (transient behavior)

DTMCs (limiting behavior)

CTMCs