EE-568 / 6 credits

Instructor: Cevher Volkan

Language: English


Summary

This course describes theory and methods for Reinforcement Learning (RL), which revolves around decision making under uncertainty. The course covers classic algorithms in RL as well as recent algorithms under the lens of contemporary optimization.

Content

Keywords

Reinforcement Learning (RL)
Markov Decision Process (MDP)
Dynamic Programming
Linear Programming
Policy Gradients
Deep Reinforcement Learning (Deep RL)
Imitation Learning
Markov Games
Robust Reinforcement Learning
RL Algorithms (e.g., Q-Learning, SARSA, TRPO, PPO)
Offline Reinforcement Learning
Behavior Cloning
Inverse Reinforcement Learning
Equilibria
Robustness

 

Learning Prerequisites

Required courses

Previous coursework in optimization, calculus, linear algebra, and probability is required. Familiarity with Python and basic knowledge of the PyTorch deep learning framework are also needed.

Recommended courses

EE-556 Mathematics of Data: From Theory to Computation

Important concepts to start the course

Familiarity with optimization algorithms, linear programming and convex duality.

Learning Outcomes

By the end of the course, the student must be able to:

  • Define the key features of RL that distinguish it from standard machine learning.
  • Assess / Evaluate the strengths, limitations, and theoretical properties of RL algorithms.
  • Recognize the common boundary that connects optimization and RL.
  • Formulate and solve sequential decision-making problems by applying relevant RL tools.

Teaching methods

Lectures are complemented by Jupyter notebook exercises and a hands-on group project.

Assessment methods

Students are required to complete Jupyter notebook homework assignments. They will also work in groups on a course project and present a poster on it at the end of the semester.

 

In the study plans

  • Semester: Spring
  • Exam form: During the semester (summer session)
  • Subject examined: Reinforcement learning
  • Lecture: 2 hour(s) per week x 14 weeks
  • Exercises: 2 hour(s) per week x 14 weeks
  • Project: 2 hour(s) per week x 14 weeks
  • Type: optional

Reference week

Thursday, 13h - 15h: Lecture GCC330

Thursday, 15h - 17h: Exercises, practical work GCC330
