Prospectus

Advances in Data Mining

Course
2022-2023

Admission requirements

Assumed/Recommended prior knowledge

Elementary knowledge of machine learning algorithms (classification, regression, clustering, forecasting), common data structures (hash functions, hash tables, dictionaries), and statistics. Basic programming skills in Python (including NumPy, Pandas, and plotting packages).

Description

During the course we will cover the most popular algorithms for common data mining tasks: data visualization (PCA, MDS, LLE, t-SNE); classification and regression (Random Forest, XGBoost, SVMs); anomaly detection (LOF, Isolation Forest, generative models, etc.); recommender systems (matrix factorization); and others. Additionally, we will discuss recent developments in automatic tuning of ML algorithms, mining very big data on distributed systems (Hadoop, Spark), and mining data streams. We will also discuss in depth the Kaggle platform as a great source of knowledge and inspiration for future Data Scientists.

Course objectives

After completing the course, the students should:

  • know the most successful algorithms and techniques used in Data Mining;

  • have gained hands-on experience with several algorithms for mining complex data sets;

  • be able to apply the acquired knowledge and skills to new problems.

Timetable

The most recent timetable can be found at the Computer Science (MSc) student website.

Mode of instruction

  • Lectures

  • Computer Lab

  • Practical assignments

  • Self-evaluated homework

Assessment method

The final mark is composed of

  • written exam (40%)

  • practical assignments (60%)

In order to pass the course, marks for both components must be at least 5.5.
The teacher will inform the students how the inspection and follow-up discussion of the exams will take place.
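
As an illustration only, the weighting and pass rule above can be sketched in Python. The function name final_mark is hypothetical, and the sketch assumes marks on a 1-10 scale with no rounding applied; any official rounding rules are not stated here and are not modelled.

```python
from typing import Tuple


def final_mark(exam: float, practical: float) -> Tuple[float, bool]:
    """Return the weighted final mark and whether the course is passed.

    Hypothetical helper: rounding of the final mark is not modelled.
    """
    # Weights from the assessment method: 40% written exam, 60% practical assignments.
    weighted = 0.4 * exam + 0.6 * practical
    # Both components must individually be at least 5.5.
    passed = exam >= 5.5 and practical >= 5.5
    return weighted, passed
```

For example, an exam mark of 5.0 with a practical mark of 9.0 gives a weighted mark above 5.5, yet the course is not passed, because the exam component is below 5.5.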

Reading list

  • Several papers and books published on the internet

  • (optional) The Kaggle Book: Data analysis and machine learning for competitive data science (https://www.packtpub.com/product/the-kaggle-book/9781801817479)

Registration

  • You have to sign up for courses and exams (including retakes) in uSis. Check this link for information about how to register for courses.

Contact

Lecturer: Dr. Wojtek Kowalczyk.

Remarks

None.