Finite horizon MDPs
Stochastic shortest path problems
Discounted MDPs
RL foundations
Full state algorithms
Markov chains (review)
RL with function approximation
Policy gradient algorithms
Note: The bandits notes are highly incomplete.
Introduction to zeroth-order optimization
Machine learning applications
Smoothness, convexity and strong convexity
Martingales
Introduction to stochastic approximation
Asymptotic convergence of stochastic approximation
Variants of zeroth-order gradient estimates
Bounds for gradient descent (noiseless case)
Non-asymptotics for stochastic gradient
Reinforcement learning
Probability review
Singular value decomposition, principal component analysis
Bayes classifier, maximum likelihood estimation
Linear models
PAC learning
Crash course on optimization
Logistic regression
Support vector machines, kernel methods
Online learning
Mixture models
Poisson processes
DTMCs (transient behavior)
DTMCs (limiting behavior)
CTMCs