MGT-502 / 5 credits

Teacher(s): Thurm Boris, Vlachos Michail

Language: English

Withdrawal: Students may not withdraw from this course after the registration deadline.


Summary

This class provides a hands-on introduction to data science and machine learning topics, exploring areas such as regression, classification, clustering, dimensionality reduction, text analytics, and neural networks. The course consists of lectures and coding sessions using the Python language.

Content

Keywords

Data science, Machine learning, Algorithms, Regression, Classification, Dimensionality reduction, Clustering, Neural networks, Text analytics, Python

Learning Prerequisites

Required courses

Statistics and data science (MGT-499)

Important concepts to start the course

  • Basic probability and statistics (random variables, expectation, mean, conditional and joint distributions, independence, Bayes' rule, central limit theorem)
  • Basic linear algebra (matrix/vector multiplication, systems of linear equations)
  • Multivariate calculus (derivatives with respect to vector and matrix variables)
  • Basic programming skills (Python)

Learning Outcomes

By the end of the course, the student must be able to:

  • Describe the principal types of machine learning algorithms
  • Investigate data, data types, and problems with the data
  • Choose an appropriate Machine Learning method for a given task
  • Implement Machine Learning algorithms in Python
  • Optimize the main trade-offs, such as overfitting and computational cost vs. accuracy
  • Conduct a Data Science project

Transversal skills

  • Plan and carry out activities in a way which makes optimal use of available time and other resources.
  • Demonstrate the capacity for critical thinking.
  • Access and evaluate appropriate sources of information.
  • Use a work methodology appropriate to the task.

Teaching methods

  • Lectures
  • Lab sessions: coding exercises
  • Data Science projects

Expected student activities

The students are expected to:

  • attend lectures and lab sessions;
  • work on the weekly theory and coding exercises;
  • complete assignments (graded);
  • conduct data science projects using the theory learned during lectures and the code developed during lab sessions (graded).

Assessment methods

  • Quizzes: 20%
  • Coding assignments: 40%
  • Project: 40%

Supervision

  • Office hours: No
  • Assistants: Yes
  • Forum: No
  • Others: Slack channel

Resources

Virtual desktop infrastructure (VDI): No

Bibliography

  • [not mandatory] Introduction to Machine Learning, by Etienne Bernard (2022)
  • [not mandatory] Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking, by Foster Provost and Tom Fawcett

Library resources

Notes/Handbook

Slides will be made available on the course Moodle page. Notebooks will be made available in a GitHub repository.

In the programs

  • Semester: Spring
  • Number of places: 40
  • Exam form: During the semester (summer session)
  • Subject examined: Data science and machine learning
  • Lecture: 3 hours per week × 14 weeks
  • Exercises: 2 hours per week × 14 weeks
