Applied data analysis

CS-401

Lecturer(s):

West Robert

Language:

English

Summary

This course teaches the basic techniques and practical skills required to make sense of a variety of data, with the help of the most acclaimed software tools in the data science world: pandas, scikit-learn, Spark, etc.

Content

Thanks to a new breed of software tools that make it easy to process and analyze data at scale, we are now able to extract invaluable insights from the vast amounts of data generated daily. As a result, both the business and scientific worlds are undergoing a revolution fueled by one of the most sought-after job profiles: the data scientist.

This course covers the fundamental steps of the data science pipeline:

Data Acquisition

Data Wrangling

Data Interpretation

Data Visualization

Reporting
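
As a rough illustration of how the five steps above can chain together, here is a minimal sketch using only the Python standard library. It is not course material: the inline dataset, the column names, and the function names (acquire, wrangle, interpret, visualize, report) are all hypothetical, and in practice tools such as pandas would take the place of the hand-written steps.

```python
import csv
import io
import statistics

# Hypothetical inline data standing in for an acquired dataset.
RAW_CSV = """city,temperature_c
Lausanne,21.5
Geneva,
Lausanne,19.0
Geneva,23.0
Zurich,n/a
Zurich,18.5
"""


def acquire(text):
    """Data acquisition: read raw records from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def wrangle(rows):
    """Data wrangling: keep only rows whose temperature parses as a number."""
    clean = []
    for row in rows:
        try:
            clean.append((row["city"], float(row["temperature_c"])))
        except (TypeError, ValueError):
            continue  # missing or malformed value
    return clean


def interpret(records):
    """Data interpretation: mean temperature per city."""
    by_city = {}
    for city, temp in records:
        by_city.setdefault(city, []).append(temp)
    return {city: statistics.mean(temps) for city, temps in by_city.items()}


def visualize(summary, width=20):
    """Data visualization: one plain-text bar per city, scaled to the maximum."""
    top = max(summary.values())
    return [
        f"{city:<10}{'#' * round(width * value / top)} {value:.2f}"
        for city, value in sorted(summary.items())
    ]


def report(lines):
    """Reporting: assemble a short, readable summary."""
    return "Mean temperature by city (deg C)\n" + "\n".join(lines)


summary = interpret(wrangle(acquire(RAW_CSV)))
print(report(visualize(summary)))
```

Keeping each step in its own function makes it possible to inspect the intermediate result of every stage, which mirrors how the pipeline is described as a sequence of distinct steps.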

 

The students will learn the techniques during the ex-cathedra lectures and will be introduced, in the lab sessions, to the software tools required to complete the homework assignments and the in-class quizzes.

In parallel, the students will embark on a semester-long project, working in agile teams of three. The outcome of this team effort will be a project portfolio that will be made public (and available as open source).

At the end of the semester, students will also take a 3-hour final exam in a classroom, on their own computers, in which they will be asked to complete a data analysis pipeline (with both code and extensive comments) on a dataset they have never worked with before.

Keywords

data science, data analysis, data mining, machine learning

 

Learning Prerequisites

Required courses

The student must have passed an introduction to databases course, OR a course in probability & statistics, OR two separate courses that include programming projects.

 

Recommended courses

 

Important concepts to start the course

Algorithms, (object-oriented) programming, basic probability and statistics 

Learning Outcomes

By the end of the course, the student must be able to:

Transversal skills

Teaching methods

 

Expected student activities

Students are expected to: 

 

Assessment methods

 

Supervision

Office hours: Yes
Assistants: Yes
Forum: Yes
Others: http://ada.epfl.ch

Resources

Virtual desktop infrastructure (VDI)

No

Websites

In the programs

Reference week

8-9: SG1
13-14: BCH2201
 