Course sheets

Computational perception using multimodal sensors


Lecturer(s):

Odobez Jean-Marc




Every 2 years


Postponed to spring 2020.


The course covers perceptual modalities in computers and models for analyzing people (representation, detection and localization, segmentation, tracking, recognition).


1. Perceptual modalities in computers. Vision, hearing, touch, smell. Basic fusion principles.
2. Models for analyzing people. Introduction to probabilistic graphical models. Basic concepts. Bayesian Networks (BNs). Learning and inference in BNs. Dynamic Bayesian Networks (DBNs). Exact and approximate inference. Examples.
3. Analyzing people. Fundamental tasks.
a. Representation. The problem of representation in computational perception. Global vs. local representations. Visual models for faces, heads, hands, and full bodies (shape/appearance, exemplars, geometric models). Models and features for speech and audio processing.
b. Detection and localization. Basic concepts. Detection as binary classification and as random sampling. Visual localization: skin color modeling, face localization. Audio localization: microphone arrays. Audio-visual fusion for speaker detection.
c. Segmentation. Basic concepts. Visual segmentation: background subtraction. Audio segmentation: source separation, speaker turn segmentation, speaker clustering.
d. Tracking. State space representation. Dynamic modeling. Human motion modeling. Multi-person tracking. Visual, audio and multimodal tracking of people.
e. Recognition. Recognition tasks. Visual recognition: facial expressions, gestures, actions, interaction. Audio recognition: speech, emotion, multi-speaker events. Audio classification. Multimodal recognition: actions.
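
As an illustration of the exact inference in DBNs and the state space tracking listed above, here is a minimal sketch of forward filtering in a discrete-state hidden Markov model, the simplest kind of DBN. It is not taken from the course material: the function name `forward_filter`, the two-state "absent/present" scenario, and all probability values are assumptions chosen for illustration only.

```python
def forward_filter(prior, transition, likelihoods):
    """Exact filtering in a discrete-state DBN (a hidden Markov model).

    prior[i]          : P(x_0 = i)
    transition[i][j]  : P(x_t = j | x_{t-1} = i)
    likelihoods[t][j] : P(z_t | x_t = j) for the observation at step t
    Returns the list of posteriors P(x_t | z_0..z_t).
    """
    n = len(prior)
    belief = list(prior)
    posteriors = []
    for t, lik in enumerate(likelihoods):
        if t > 0:
            # Predict: propagate the belief through the dynamic model.
            belief = [sum(belief[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        # Update: weight by the observation likelihood, then normalize.
        unnorm = [belief[j] * lik[j] for j in range(n)]
        total = sum(unnorm)  # assumed non-zero in this sketch
        belief = [u / total for u in unnorm]
        posteriors.append(belief)
    return posteriors


# Hypothetical example: is a person "absent" (0) or "present" (1) in a room?
prior = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
# Likelihood of each noisy detector reading under each state (made-up values).
likelihoods = [[0.3, 0.7], [0.3, 0.7], [0.8, 0.2]]
posteriors = forward_filter(prior, transition, likelihoods)
```

Each entry of `posteriors` is a distribution over the hidden state given all observations so far. The same predict/update structure carries over to continuous state spaces, where the sums become integrals and approximate inference is often needed instead.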


Artificial perception, human representation, multimodality, audio, video, probabilistic models, graphical models.

Learning Prerequisites

Recommended courses

Undergraduate-level knowledge of linear algebra, statistics, image and signal processing.

Assessment methods





Library resources

In the programs

    • Semester
    • Exam form
    • Credits
    • Subject examined: Computational perception using multimodal sensors
    • Number of places
    • Lecture: 32 hours
    • Practical work: 24 hours
