CS-449 / 6 credits

Teacher: Kermarrec Anne-Marie

Language: English


Summary

The course covers fundamental principles for understanding and building systems for managing and analyzing large amounts of data.

Content

Keywords

Distributed systems, Parallel programming, Large-scale storage systems, Large-scale data management

Learning Prerequisites

Required courses

CS-322: Introduction to database systems

Recommended courses

CS-322: Introduction to database systems

CS-323: Introduction to operating systems

CS-206: Parallelism and concurrency

Important concepts to start the course

  • Algorithms and data structures.
  • Scala and/or Java programming languages will be used throughout the course. Programming experience in one of these languages is strongly recommended.
  • Basic knowledge of computer networking and distributed systems.

 

Learning Outcomes

By the end of the course, the student must be able to:

  • Choose systems parameters, data layouts, and application designs for database systems and applications.
  • Develop data-parallel analytics programs that make use of modern clusters and cloud offerings to scale up to very large workloads.
  • Analyze the trade-offs between various approaches to large-scale data management and analytics, depending on efficiency, scalability, and latency needs.
  • Choose the most appropriate existing systems architecture and technology for a task.

Teaching methods

Lectures, exercises, and practical work

Expected student activities

During the semester, the students are expected to:

  • attend the lectures in order to ask questions and interact with the professor,
  • attend the exercise sessions to solve and discuss exercises,
  • complete practical homework assignments and/or a project during the semester,
  • take the exams during the semester.

Assessment methods

Homework, written examinations, and a project. Continuous assessment.

Supervision

Office hours: Yes
Assistants: Yes
Forum: Yes
Others: Office hours by appointment

Resources

Bibliography

Relevant resources (textbook chapters, articles, and videos) are posted on the course Moodle page.

In the programs

  • Semester: Spring
  • Exam form: During the semester (summer session)
  • Subject examined: Systems for data science
  • Lecture: 2 hours per week x 14 weeks
  • Exercises: 2 hours per week x 14 weeks
  • Project: 2 hours per week x 14 weeks
