MATH-412 / 5 credits

Lecturer(s): Obozinski Guillaume Romain, Zemel Yoav

Language: English


Summary

A course on statistical machine learning for supervised and unsupervised learning.

Content

  • Introduction: supervised and unsupervised learning, loss functions, train and test errors, bias-variance tradeoff, model complexity and overfitting, linear regression, k-nearest neighbors.
  • Regression: linear regression, model selection, ridge and Lasso.
  • Local averaging methods: k-nearest neighbors, Nadaraya-Watson estimators.
  • Simple validation, cross-validation and leave-one-out.
  • Classification: linear discriminant analysis, logistic regression.
  • Nonparametric regression: smoothing splines, reproducing kernel Hilbert spaces.
  • Support vector machines (possibly) and kernelized linear regression.
  • Tree-based methods: classification and regression trees, bagging, random forests.
  • Boosting: AdaBoost, boosted trees.
  • Deep learning: introduction to neural networks (architecture, backpropagation, CNNs).
  • Unsupervised learning: principal component analysis, k-means, Gaussian mixtures and the EM algorithm.

Learning Prerequisites

Required courses

Analysis, Linear Algebra, Probability and Statistics, Linear Models

Important concepts to start the course

This course introduces machine learning from a statistical and mathematical perspective. Before taking this course, the student must have a very good knowledge of basic probability and statistics (statistical modeling and inference, linear regression). Concepts used in the course include abstract conditional expectations, multivariate linear regression, convexity, strong convexity, vector spaces and Hilbert spaces.

Learning Outcomes

By the end of the course, the student must be able to:

  • Formulate appropriate models for empirical data
  • Estimate the parameters of a statistical model
  • Interpret the fit of a model to data
  • Justify the choice of a model/technique to analyze empirical data
  • Implement statistical learning algorithms
  • Explain the mathematical/statistical mechanisms of the most common machine learning algorithms
  • Assess / Evaluate underfitting / overfitting of ML algorithms

Transversal skills

  • Write a scientific or technical report.
  • Make an oral presentation.
  • Demonstrate the capacity for critical thinking.
  • Take feedback (critique) and respond in an appropriate manner.
  • Demonstrate a capacity for creativity.

Teaching methods

Ex cathedra lectures, exercises and computer practicals in the classroom and at home, and a project in a group of 3 students.

Assessment methods

Written final exam (70%) + project (30%): implementation, or application to real data, of a model/algorithm taken from a classical research paper describing an important method from the literature.

Supervision

Office hours No
Assistants Yes
Forum Yes

Resources

Virtual desktop infrastructure (VDI)

No

Bibliography

  • James, G., Witten, D., Hastie, T. and Tibshirani, R. (2013) An Introduction to Statistical Learning, with Applications in R. Springer.
  • Hastie, T., Tibshirani, R. and Friedman, J. (2009) The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second edition. Springer.
  • Efron, B. and Hastie, T. (2016) Computer Age Statistical Inference: Algorithms, Evidence and Data Science. Cambridge University Press.
  • Bishop, C. M. (2006) Pattern Recognition and Machine Learning. Springer.
  • Kuhn, M. and Johnson, K. (2013) Applied Predictive Modeling. Springer.
  • Shalev-Shwartz, S. and Ben-David, S. (2014) Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press.

Library resources

Moodle Link

In the study plans

  • Semester: Fall
  • Exam form: Written (winter session)
  • Subject examined: Statistical machine learning
  • Lecture: 2 hour(s) per week x 14 weeks
  • Exercises: 2 hour(s) per week x 14 weeks
  • Type: optional

Reference week

Tuesday, 13h - 15h: Lecture MAA330

Tuesday, 15h - 17h: Exercises, practicals MAA330
