Finite horizon MDPs
Stochastic shortest path problems
Discounted MDPs
RL foundations
Full state algorithms
Markov chains (review)
RL with function approximation
Policy gradient algorithms
Note: The bandits notes are highly incomplete.
Probability review
Singular value decomposition, principal component analysis
Bayes classifier, maximum likelihood estimation
Linear models
PAC learning
Crash course on optimization
Logistic regression
Support vector machines, kernel methods
Online learning
Mixture models
Poisson processes
DTMCs (transient behavior)
DTMCs (limiting behavior)
CTMCs