BIOENG-210 / 4 credits

Instructor: La Manno Gioele

Language: English

## Summary

Processing, analyzing, and interpreting large biological datasets is an essential skill for modern biologists. This course aims to provide the theoretical foundations, analytical techniques, and software tools necessary to effectively manage and derive insights from complex biological data.

## Content

• Biological data types
• Probability distributions in biological data
• Maximum likelihood estimators for univariate and bivariate distributions
• Statistical tests
• Multivariate data analysis
• Multivariate linear regression
• Principal Component Analysis (PCA)
• Clustering
• Priors, Bayes, and maximum a posteriori estimation
• Logistic regression and classification
• Model selection
• Resampling and simulations
• Time series and 1D signal processing
• ND-image processing
• Generative models and MCMC

## Keywords

Biological data, statistical learning, probability distributions, maximum likelihood estimation, multivariate analysis, multivariate normal, PCA, SVD, multivariate regression, classification, Bayesian inference, time series, image processing, resampling methods, MCMC.

## Required courses

Analysis, Linear Algebra, Probability and Statistics

## Learning Outcomes

By the end of the course, the student must be able to:

• Analyze multidimensional biological data
• Apply regression and classification models
• Perform model selection
• Use PCA and interpret it
• Visualize multivariate data
• Explore different types of biological data
• Implement basic routines of ML and MAP estimation
• Choose the most appropriate model for a specific situation
• Plan an analysis end-to-end
• Interpret statistical tests and posterior distributions

## Teaching methods

Lectures and exercises

## Assessment methods

Written examination at the exam session (70%) and graded exercises (30%).

## Supervision

• Office hours: No
• Assistants: Yes
• Forum: Yes


## Bibliography

Main:

"The Elements of Statistical Learning" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

"Methods of Multivariate Analysis" by Alvin C. Rencher

Other resources:

"Computer Age Statistical Inference" by Bradley Efron and Trevor Hastie

"Data-Driven Science and Engineering" by Steven L. Brunton and J. Nathan Kutz

## Notes/Handbook

Course notes in PDF format

## In the study plans

• Semester: Spring
• Exam form: Written (summer session)
• Subject examined: Biological data science I: statistical learning
• Lecture: 2 hours per week x 14 weeks
• Exercises: 2 hours per week x 14 weeks
• Type: mandatory
