CS-303 / 6 credits

Teacher: Grossglauser Matthias

Language: English


Summary

This course provides a mathematical treatment of online decision-making. It covers bandits (multi-armed, contextual, structured), Markov Decision Processes (MDPs), and related topics. Key concepts include exploration-exploitation, UCB, Thompson sampling, and tools to derive regret bounds.

Content

  • Decision-making problems in active and reinforcement learning
  • Online learning and prediction
  • Multi-armed bandit model
  • Upper-confidence bound (UCB) and Thompson sampling
  • Contextual bandits
  • Reinforcement learning model
  • Markov Decision Processes (MDPs)
  • General decision-making and regret bounds
  • Large state space and function approximation
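As an illustration of the UCB topic listed above, here is a minimal sketch of the classical UCB1 index policy on a simulated Bernoulli bandit. It is not taken from the course material: the function name `ucb1`, its parameters `arm_means`, `horizon`, and `seed`, the Bernoulli reward model, and the exploration bonus sqrt(2 ln t / n) are all assumptions chosen for the example, and the course may use a different variant.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Simulate UCB1 on a Bernoulli bandit (hypothetical helper).

    Returns (pull counts per arm, cumulative pseudo-regret).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k    # number of times each arm has been pulled
    sums = [0.0] * k    # total reward collected from each arm
    best_mean = max(arm_means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            # Initialization: pull each arm once so every index is defined.
            arm = t - 1
        else:
            # Index = empirical mean + exploration bonus sqrt(2 ln t / n_i).
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # Bernoulli reward drawn with the (simulated) true mean of the arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        # Pseudo-regret: gap between the best true mean and the chosen one.
        regret += best_mean - arm_means[arm]

    return counts, regret


counts, regret = ucb1([0.2, 0.5, 0.8], horizon=5000)
print("pulls per arm:", counts)
print("cumulative pseudo-regret:", round(regret, 1))
```

The pseudo-regret is computed from the true means, which are known here only because the environment is simulated. Plotting it against the horizon is one way to compare the empirical growth with a logarithmic regret bound.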

Learning Prerequisites

Required courses

CS-233 Introduction to machine learning

Recommended courses

COM-102 Advanced information, computation, communication II

COM-300 Modèles stochastiques pour les communications or equivalent

Important concepts to start the course

  • Probability and random processes
  • Data structures and algorithms
  • Information theory

Learning Outcomes

By the end of the course, the student must be able to:

  • Create mathematical models of real-world sequential decision-making scenarios
  • Develop algorithms and reason about their efficiency and performance

Teaching methods

  • Ex-cathedra lectures
  • Homework series
  • Labs to develop, simulate, and analyze algorithms

Assessment methods

  • Lab reports
  • Final exam

Supervision

Office hours: No
Assistants: Yes

Resources

Moodle Link

In the programs

  • Semester: Fall
  • Exam form: Written (winter session)
  • Subject examined: Principles of online decision-making
  • Lectures: 2 hours per week x 14 weeks
  • Exercises: 1 hour per week x 14 weeks
  • Lab: 1 hour per week x 14 weeks
  • Type: optional

Reference week

Tuesday, 15h - 17h: Lecture CM1

Wednesday, 16h - 17h: Exercise, TP ELA1

Wednesday, 17h - 18h: Project, labs, other ELA1
