Coursebooks 2017-2018

Applied data analysis

CS-401

Lecturer(s) :

West Robert

Language:

English

Summary

This course teaches the basic techniques and practical skills required to make sense of a variety of data, with the help of some of the most widely used software tools in data science, such as pandas, scikit-learn, and Spark.

Content

Thanks to a new breed of software tools that make it easy to process and analyze data at scale, we can now extract invaluable insights from the vast amount of data generated daily. As a result, both the business world and the scientific world are undergoing a revolution fueled by one of the most sought-after job profiles: the data scientist.

This course covers the fundamental steps of the data science pipeline:

Data Acquisition

Data Wrangling

Data Interpretation

Data Visualization

Reporting

 

Students will learn the techniques during the ex-cathedra lectures and will then become familiar with the software tools by completing the homework assignments (part of which will be done during the lab hours, under the supervision of the teacher and the assistants).

In parallel, students will embark on a semester-long project, working in agile teams of three. The outcomes of these team efforts will be brought together towards the end of the course to build a project portfolio that will be made public (and available as open source).

At the end of the semester, students will also take a 3-hour final exam in a classroom with computers, where they will be asked to complete a data analysis pipeline (with both code and extensive comments) on a dataset they have never worked with before.

Keywords

data science, data analysis, data mining, machine learning

 

Learning Prerequisites

Required courses

The student MUST have passed an introduction to databases course, OR a course in probability & statistics, OR two separate courses that include programming projects.

 

Recommended courses

 

Important concepts to start the course

Algorithms, object-oriented programming, basic probability and statistics

 

Learning Outcomes

By the end of the course, the student must be able to:

Transversal skills

Teaching methods

 

Expected student activities

Students are expected to: 

 

Assessment methods

 

Supervision

Office hours: Yes
Assistants: Yes
Forum: Yes
Others: http://ada.epfl.ch

Resources

Virtual desktop infrastructure (VDI)

No

Websites

In the programs

Reference week

Time slots (Mo-Fr grid, day not specified):
8-9: SG1
13-14: BCH 2201