Distributed information systems
Summary
This course introduces the foundations of information retrieval, data mining and knowledge bases, which underpin today's Web-based distributed information systems.
Content
Information Retrieval
- Information Retrieval - Introduction
- Text-Based Information Retrieval (Boolean, vector space, probabilistic)
- Inverted Files
- Distributed Retrieval
- Query Expansion
- Embedding models (LSI, word2vec)
- Link-Based Ranking
Mining Unstructured Data
- Document Classification (k-NN, Naive Bayes, fastText, Transformer models)
- Recommender Systems (collaborative filtering, matrix factorization)
- Mining Social Graphs (modularity clustering, Girvan-Newman)
Knowledge Bases
- Semantic Web
- Keyphrase extraction
- Named entity recognition
- Information extraction
- Taxonomy Induction
- Entity Disambiguation
- Label Propagation
- Link Prediction
Learning Prerequisites
Recommended courses
Introductory courses to databases and machine learning are helpful, but not required.
Programming skills in Python are helpful, but not required.
Learning Outcomes
By the end of the course, the student must be able to:
- Characterize the main tasks performed by information systems, namely data, information and knowledge management
- Apply collaborative information management models, such as crowdsourcing, recommender systems and social networks
- Apply knowledge models, their representation through Web standards and algorithms for storing and processing semi-structured data
- Apply fundamental models and techniques of text retrieval and their use in Web search engines
- Apply the main categories of data mining techniques (local rules, predictive models and descriptive models) and master representative algorithms for each category
Teaching methods
Ex cathedra + programming projects (Python)
Assessment methods
60% Continuous evaluation of projects during the semester, with a bonus system
40% Final written exam (180 min) during the exam session
In the study plans
- Semester: Fall
- Exam form: Written (winter session)
- Subject examined: Distributed information systems
- Lecture: 2 hours per week x 14 weeks
- Exercises: 1 hour per week x 14 weeks
- Project: 1 hour per week x 14 weeks
- Type: optional