Principles of online decision-making
Summary
This course provides a mathematical treatment of online decision-making. It covers bandits (multi-armed, contextual, structured), Markov Decision Processes (MDPs), and related topics. Key concepts include the exploration-exploitation trade-off, upper confidence bound (UCB) algorithms, Thompson sampling, and tools for deriving regret bounds.
Content
- Decision-making problems in active and reinforcement learning
- Online learning and prediction
- Multi-armed bandit model
- Upper confidence bound (UCB) and Thompson sampling
- Contextual bandits
- Reinforcement learning model
- Markov Decision Processes (MDPs)
- General decision-making and regret bounds
- Large state space and function approximation
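
To give a flavor of the bandit topics listed above, here is a minimal sketch of the standard UCB1 index policy on a Bernoulli multi-armed bandit. It is an illustration only, not course material: the function name `ucb1`, the parameters `arm_means`, `horizon`, and `seed`, and the choice of Bernoulli rewards are all assumptions made for this example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Simulate UCB1 on a Bernoulli bandit (hypothetical helper for illustration).

    arm_means: true success probability of each arm (unknown to the learner).
    horizon:   number of rounds to play.
    Returns the pull count per arm and the cumulative pseudo-regret.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms    # number of pulls per arm
    sums = [0.0] * n_arms    # total observed reward per arm
    best_mean = max(arm_means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialization: pull every arm once.
            arm = t - 1
        else:
            # Exploitation term (empirical mean) plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        # Pseudo-regret: gap between the best arm's mean and the chosen arm's mean.
        regret += best_mean - arm_means[arm]

    return counts, regret


if __name__ == "__main__":
    counts, regret = ucb1([0.2, 0.8], horizon=2000)
    print("pulls per arm:", counts)
    print("cumulative pseudo-regret:", round(regret, 1))
```

The exploration bonus shrinks as an arm is pulled more often, so the suboptimal arm is sampled less and less frequently. Deriving the resulting regret bound is the kind of analysis the course develops.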
Learning Prerequisites
Required courses
CS-233 Introduction to machine learning
Recommended courses
COM-102 Advanced information, computation, communication II
COM-300 Modèles stochastiques pour les communications or equivalent
Important concepts to start the course
- Probability and random processes
- Data structures and algorithms
- Information theory
Learning Outcomes
By the end of the course, the student must be able to:
- Create mathematical models of real-world sequential decision-making scenarios
- Develop algorithms and reason about their efficiency and performance
Teaching methods
- Ex-cathedra lectures
- Homework series
- Labs to develop, simulate, and analyze algorithms
Assessment methods
- Lab reports
- Final exam
Supervision
- Office hours: No
- Assistants: Yes
In the programs
- Semester: Fall
- Exam form: Written (winter session)
- Subject examined: Principles of online decision-making
- Lectures: 2 hours per week x 14 weeks
- Exercises: 1 hour per week x 14 weeks
- Lab: 1 hour per week x 14 weeks
- Type: optional