Prospectus

Advanced Data Management for Data Analysis

Course
2021-2022

Admission requirements

Assumed/recommended prior knowledge

The course builds on standard (relational) database concepts, techniques, and algorithms as usually taught in bachelor's-level database courses, including the Entity-Relationship model, the relational data model, relational algebra, SQL, transaction management, storage formats, access structures (indexes), and client-server (software) architectures. Familiarity with these, as well as with using database management systems (DBMS) in practice, is highly recommended for this course.
Given that the course aims not only at learning to use existing systems but also at learning to build (parts of) data management and analysis systems, programming experience in system-oriented programming languages like C or C++ is also highly recommended.
Students will use their own desktop or laptop computers for homework assignments as well as for occasional hands-on sessions during classes. Students should be familiar with installing open-source software on their computer(s).

Description

Going beyond standard "textbook" (relational) database concepts and techniques (see "Assumed/recommended prior knowledge" above), the course discusses state-of-the-art advanced data management concepts and techniques to facilitate efficient and scalable analysis of large amounts of data ("Big Data"). These include storage models, data structures, algorithms, hardware-conscious implementation techniques, and overall data management system architectures. Most of these concepts and techniques form the basis for leading analytical data management systems, both commercial and open-source.
The course material is based on recent tutorials and publications at leading international scientific venues (journals and conferences).
Implementing selected parts or components of data management and analysis systems is part of the course, mainly as homework assignments and partly via occasional hands-on sessions during classes.

Course objectives

The course will teach advanced data management concepts and techniques, including storage models, data structures, algorithms, hardware-conscious implementation techniques, and overall data management system architectures, to facilitate efficient and scalable analysis of large amounts of data ("Big Data"). Implementing selected parts or components of data management and analysis systems is part of the course.

Timetable

The most recent timetable can be found at the Computer Science (MSc) student website.

Mode of instruction

  • Lectures

  • Assignments / Implementation projects

  • Literature studies including presentations and discussions of selected literature

  • Reports

Course load

Total hours of study: 168 hrs. (= 6 EC)
Lectures: 26:00 hrs.
Practical work/assignments: 69:00 hrs.
Examination: 3:00 hrs.
Self-study: 70:00 hrs.

Assessment method

  • Written/oral exam

  • Homework assignments

  • (Research) project

The teacher will inform students how the exam inspection and the follow-up discussion of the exams will take place.

Reading list

To be announced.

Registration

  • You have to sign up for courses and exams (including retakes) in uSis. Check this link for information about how to register for courses.

Contact

Lecturer: prof.dr. S. Manegold

Remarks

None.