MATH-676 / 2 credits

Lecturer: Invited lecturers (see below)

Language: English

Remark: to be given in Spring 2027


Frequency

Only this year

Summary

The course develops some of the mathematical foundations of generative modeling: divergences and metrics from information theory, and optimal transport. In order to build stable algorithms that learn distributions from finite data, proximal regularizations and mean-field game theory are used.

Content

The course develops some of the mathematical foundations needed to understand, build, and improve generative modeling algorithms such as generative adversarial networks, diffusion models, normalizing flows, and Schrödinger bridges. One theme of the course is to emphasize the common mathematical thread behind these different models.

 

We first study the properties and the variational representations of three main mathematical players: information divergences (e.g. the Kullback-Leibler divergence), optimal transport costs (Wasserstein metrics), and integral probability metrics (e.g. Kantorovich-Rubinstein metrics, the maximum mean discrepancy (kernel distance), and neural integral probability metrics). These representations lead to (a) optimization problems in function spaces (Gibbs variational principle), (b) the construction of probability flows (Wasserstein gradient flows, the Fokker-Planck equation, the JKO scheme), and (c) optimization over probability flows, as in the Benamou-Brenier theorem and Schrödinger bridges.
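For example, the Gibbs (Donsker-Varadhan) representation of the Kullback-Leibler divergence and the Benamou-Brenier dynamical formulation of the Wasserstein-2 distance, sketched below for illustration, are two formulas of this type:

```latex
% Gibbs / Donsker-Varadhan variational representation of the KL divergence:
% an optimization over test functions (discriminators)
\mathrm{KL}(P \,\|\, Q) \;=\; \sup_{f}\; \Big\{ \mathbb{E}_{P}[f] \;-\; \log \mathbb{E}_{Q}\big[e^{f}\big] \Big\}

% Benamou-Brenier dynamical formulation of the Wasserstein-2 distance:
% an optimization over probability flows satisfying the continuity equation
W_2^2(\mu_0,\mu_1) \;=\; \inf_{(\rho_t, v_t)} \int_0^1 \!\!\int |v_t(x)|^2 \, \rho_t(dx)\, dt
\quad \text{s.t.} \quad \partial_t \rho_t + \nabla\!\cdot(\rho_t v_t) = 0,\;\; \rho_0=\mu_0,\;\; \rho_1=\mu_1
```

The first formula turns divergence estimation into an optimization over test functions, while the second recasts transport as an optimization over probability flows.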

 

A theme of the course is that, in order to build numerically stable algorithms to learn distributions from finite data sets, one needs to supplement the cost functionals from information theory and optimal transport with suitable regularizations. This leads to new families of divergences which combine the desirable properties of different objects, for example information divergences and integral probability metrics, or information divergences and optimal transport. One such example is entropic optimal transport and Schrödinger bridges. Another family of examples consists of divergences built using convex analysis methods, in particular proximal regularizations in probability spaces. Another fundamental aspect of generative models is that they can be viewed as control problems, and we will develop in detail the connection between generative flows and mean-field games, which brings in tools from partial differential equations (coupled Hamilton-Jacobi and Fokker-Planck equations).
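For illustration, entropic optimal transport adds a Kullback-Leibler penalty to the Kantorovich problem, and a typical mean-field game couples a Hamilton-Jacobi(-Bellman) equation for the value function with a Fokker-Planck equation for the density; the formulas below are an indicative sketch of these objects:

```latex
% Entropic optimal transport: KL-regularized Kantorovich problem
\mathrm{OT}_{\varepsilon}(\mu,\nu) \;=\; \min_{\pi \in \Pi(\mu,\nu)}
\int c(x,y)\, d\pi(x,y) \;+\; \varepsilon\, \mathrm{KL}\big(\pi \,\|\, \mu \otimes \nu\big)

% A standard mean-field game system: Hamilton-Jacobi-Bellman equation for the
% value function u (backward in time) coupled to a Fokker-Planck equation for
% the density m (forward in time)
-\partial_t u - \nu \Delta u + H(x, \nabla u) = f(x, m), \qquad
\partial_t m - \nu \Delta m - \nabla\!\cdot\big(m\, D_p H(x,\nabla u)\big) = 0
```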

 

As the class progresses we will develop the algorithms behind a variety of learning methods, such as adversarial learning (for generative adversarial networks and probability flows), backward-forward training (for normalizing flows, neural ordinary differential equations, and score-based diffusion models), proximal algorithms (such as the JKO scheme), and the Sinkhorn algorithm for Schrödinger bridges. We will emphasize the common mathematical structure behind them.
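As a concrete illustration, a minimal numerical sketch of the Sinkhorn iterations for discrete entropic optimal transport is given below; the function names and parameter values are chosen for illustration and are not taken from the course materials.

```python
# Illustrative sketch: Sinkhorn iterations for entropy-regularized optimal
# transport between two discrete probability vectors.
import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iter=500):
    """Return the entropic-OT coupling between histograms a and b.

    a, b : 1-D probability vectors; C : cost matrix; eps : regularization.
    """
    K = np.exp(-C / eps)              # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):           # alternate matrix-scaling updates
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]  # coupling pi = diag(u) K diag(v)

# Example: transport between two Gaussian-like histograms on a 1-D grid.
x = np.linspace(0.0, 1.0, 50)
a = np.exp(-(x - 0.3) ** 2 / 0.01); a /= a.sum()
b = np.exp(-(x - 0.7) ** 2 / 0.01); b /= b.sum()
C = (x[:, None] - x[None, :]) ** 2
pi = sinkhorn(a, b, C)
print(pi.sum(), (pi * C).sum())       # total mass (~1) and transport cost
```

As the regularization parameter tends to zero, the coupling concentrates on an optimal transport plan.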

 

Keywords

Optimal Transport, Information Theory, Proximal Regularizations, Diffusion Models, Generative Flows, Generative Adversarial Networks.

Learning Prerequisites

Required courses

A good working knowledge of analysis, probability theory, and some basic PDEs.

Learning Outcomes

By the end of the course, the student must be able to:

  • Use the tools of information theory and optimal transport to understand generative models.

Resources

Moodle Link

In the study plans

  • Exam form: Report (free session)
  • Subject examined: Information Theory & Optimal Transport
  • Lectures: 12 hour(s)
  • Practical work: 12 hour(s)
  • Type: optional

Reference week

Related courses
