EE-618 / 3 credits
Teacher: Cevher Volkan
Remark: Offered every 2 years; next time: Spring 2022
This course covers the theory and methods of decision making under uncertainty with partial feedback.
1. Introduction to the reinforcement learning (RL) paradigm
2. Overview of classical developments I: Markov Decision Processes (MDP, POMDP) and Dynamic Programming (Value Iteration, Policy Iteration)
3. Overview of classical developments II: Monte-Carlo methods, TD-Learning, Q-Learning, SARSA (model-based and model-free RL)
4. Stochastic Bandits and Thompson (posterior) Sampling
5. Bandit-based RL algorithms (UCRL, UCAgg, UCCRL, REGAL): exploration and exploitation
6. Policy Search (Policy gradient algorithms, variance reduction, TRPO algorithm)
7. Imitation Learning (Inverse Reinforcement Learning, Apprenticeship Learning)
Keywords: Reinforcement learning, policy search.
Optimization, probability theory, mathematics of data.
In the programs
- Number of places: 20
- Exam form: Project report (session free)
- Subject examined: Theory and Methods for Reinforcement Learning
- Lecture: 28 hours
- Practical work: 14 hours