Systems for data science
CS-449
Lecturer(s):
Kermarrec Anne-Marie
Language:
English
Summary
The course covers fundamental principles for understanding and building systems for managing and analyzing large amounts of data.
Content
Big data systems design and implementation:
- Distributed systems for data science
- Data management: locality, accesses, partitioning, replication
- Distributed machine learning systems: federated learning, parameter server, decentralized learning
- Massively parallel processing operations
Large-scale storage systems:
- Data structures: file systems, key-value stores, DBMS
- Concurrent access to data
- Consistency models. The CAP theorem. NoSQL and NewSQL systems
- Transactions
Large-scale processing:
- Parallel processing
- Streaming Processing
- Online Processing
- Graph Processing
Keywords
Distributed systems, Parallel programming, Large-scale storage systems, Large-scale data management
Learning Prerequisites
Required courses
CS-322: Introduction to database systems
Recommended courses
CS-323: Introduction to operating systems
CS-206: Parallelism and concurrency
Important concepts to start the course
- Algorithms and data structures: sorting algorithms, balanced trees, graph traversals.
- The Scala programming language will be used throughout the course. Programming experience in this language is strongly recommended.
- Basic knowledge of computer networking and distributed systems.
Learning Outcomes
By the end of the course, the student must be able to:
- Choose system parameters, data layouts, query plans, and application designs for database systems and applications.
- Develop data-parallel analytics programs that make use of modern clusters and cloud offerings to scale up to very large workloads.
- Analyze the trade-offs between various approaches to large-scale data management and analytics, depending on efficiency, scalability, and latency needs.
- Choose the most appropriate existing systems architecture and technology for a task
Teaching methods
Ex cathedra lectures, with in-class exercises, practice with pen and paper or on a computer, and a project
Expected student activities
During the semester, the students are expected to:
- attend the lectures in order to ask questions and interact with the professor,
- attend the exercise sessions to solve and discuss exercises,
- solve practical homework assignments and/or finish a project during the semester,
- take a midterm,
- take a final exam.
Assessment methods
Homework, written examinations, project. Continuous assessment.
Supervision
Office hours | Yes |
Assistants | Yes |
Forum | Yes |
Others | Office hours by appointment |
Resources
Bibliography
Relevant resources (textbook chapters, articles, and videos) are posted on the course Moodle page.
In the programs
- Semester: Spring
- Exam form: During the semester
- Credits: 6
- Subject examined: Systems for data science
- Lecture: 2 hour(s) per week x 14 weeks
- Exercises: 2 hour(s) per week x 14 weeks
- Project: 2 hour(s) per week x 14 weeks
Reference week
Time | Room |
---|---|
10-11 | ELA2 |
13-14 | BS160 |
14-15 | INR219 |