Prospectus

Business Analytics

Course
2024-2025

Admission requirements

Only accessible for 3rd year students.

Description

Business Analytics places data mining, visualisation, machine learning and statistics in a business context. To deploy data mining and AI techniques correctly, you must be able to translate a (broadly formulated) question from a customer or co-worker into an experimental set-up, choose the right methods, and process the data into the form those methods require. After performing your experiments, you should be able not only to evaluate the results but also to interpret them, translate them back to the original question (e.g. by visualisation) and communicate your findings to management or colleagues. Data science is also of great social importance: the media simplify many data-driven results and statistical studies, often making mistakes along the way. As a result, a lot of nonsense comes our way, and it is up to you, the data scientists of the future, to recognize, explain and correct it. This course combines lectures and practical sessions in which you take a hands-on approach to solving real-world data science problems in a business context.

Course objectives

A. Knowledge

  • You know different ways to visualize information and data, and which visualisation metaphors to use when.

  • You can explain the following machine learning concepts: supervised learning, unsupervised learning, classification, regression.

  • You can list two advantages and two disadvantages of rule-based methods and of machine learning methods.

  • You know and can explain the following experimental and statistical principles in your own words: bias, overfitting, cross validation, high-dimensional data, sparseness, dimensionality reduction, feature extraction, class imbalance.

  • You can explain the purpose and principles of feature extraction from semi-structured data, text data, image data, graph data and sensor data.

  • You can explain the difference between engineered features and raw features, in content and in dimensionality.

  • You know and can explain different types of missing data and how to handle them.

  • You know and can explain the use and importance of measuring the quality and reliability of human-labeled data.

  • You can give the definitions of the most important evaluation measures: Accuracy, Mean Squared Error, Precision, Recall, F1 and Mean Average Precision.

  • You know and can explain the benefits and challenges of big data.

  • You know and can explain the principles of responsible data science.

  • You know what explainable AI is and can name some methods that help in analysing ML models.
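
As an illustration of the evaluation measures named in the knowledge objectives above, the following is a minimal Python sketch of their standard textbook definitions. It is not course material: the function names are hypothetical, precision, recall and F1 are computed for a single positive class, and average precision assumes that every relevant item appears somewhere in the ranked list.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for one class treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def average_precision(ranked_relevance):
    """Average of precision@k over the ranks k that hold a relevant item.

    ranked_relevance is a list of 0/1 flags in ranked order. Assumption:
    all relevant items occur in the list.
    """
    hits = 0
    total = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of average precision over several ranked lists (queries)."""
    return sum(average_precision(r) for r in rankings) / len(rankings)
```

For example, `precision_recall_f1([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])` counts two true positives, one false positive and one false negative, so precision, recall and F1 are all 2/3.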

B. Skills

  • You can recognize, explain and correct statistical nonsense and erroneous visualizations in the media.

  • After completing the course, you can independently take the steps to set up and execute an experiment within data science, given a (broadly formulated) question:

  1. Task definition: You can create a clear definition of a task based on a general description of a task, consisting of (a) the research question, (b) whether the task is supervised or unsupervised, (c) whether it is a classification, regression or ranking task (or something else), (d) what the data are and (e) what the labels are;
  2. Data collection: If the data required to answer the question is not given, you can define what data you need and how to collect it. If you need explicit labels, you can set up a data annotation task for human raters;
  3. Data exploration: You can collect and visualize statistics about the data. You can calculate and interpret the inter-annotator agreement for annotated data.
  4. Pre-processing and feature extraction: You can write a Python script to read and process the data, extract features and store the feature vectors. You know how to engineer a low-dimensional feature set.
  5. Model learning: You can apply unsupervised and supervised models to your data. You know how to make an informed decision on the type of classifier given the feature set. You can generate output for unseen data.
  6. Evaluation: You can correctly set up your model evaluation with a train / test split and cross validation if necessary. You can evaluate your output against human data. You know which evaluation measures you should use given the type of data and model. You can perform significance testing. You can do a sensible error analysis and feature analysis.
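
Two of the steps above, inter-annotator agreement (step 3) and cross validation (step 6), can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method prescribed by the course: Cohen's kappa is used as one common agreement measure for two annotators, the function names are hypothetical, and the folds are built without shuffling.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one and the same label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


def k_fold_indices(n_items, k):
    """Yield (train, test) index lists for k-fold cross validation.

    Assumption: items are assigned to folds round-robin, without shuffling.
    """
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = sorted(j for fold in folds[:i] + folds[i + 1:] for j in fold)
        yield train, test
```

Each item ends up in the test set of exactly one fold, so a model evaluated this way is never scored on data it was trained on. In practice the data would normally be shuffled (or stratified by class) before the folds are formed.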

Timetable

You will find the timetables for all courses and degree programmes of Leiden University in the tool MyTimetable (login). Any teaching activities that you have successfully registered for in MyStudymap will automatically be displayed in MyTimetable. Any timetables that you add manually will be saved and automatically displayed the next time you sign in.

MyTimetable allows you to integrate your timetable with your calendar apps such as Outlook, Google Calendar, Apple Calendar and other calendar apps on your smartphone. Any timetable changes will be automatically synced with your calendar. If you wish, you can also receive an email notification of the change. You can turn notifications on in ‘Settings’ (after login).

For more information, watch the video or go to the 'help-page' in MyTimetable. Please note: Joint Degree students Leiden/Delft have to merge their two different timetables into one. This video explains how to do this.

Mode of instruction

  • 7 lectures, 2x45 minutes

  • 7 practical sessions, 2x45 minutes

Assessment method

The assessment of the course consists of a multiple choice exam (60% of course grade) and a practical part (40% of course grade). The practical part is subdivided into two assignments. Assignment 1 counts for 15% and assignment 2 for 25% of the total grade. The grade for the written exam should be 5.5 or higher in order to complete the course. The weighted average grade for the practical assignments should also be 5.5 or higher in order to complete the course. If one of the tasks is not submitted, the grade for that task is 0.
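
The arithmetic behind these weights can be sketched as follows. This is an illustration only, not an official grade calculator: the function name is hypothetical, no rounding is applied because the text does not state a rounding rule, and a task that was not submitted is passed as `None` and counted as 0.

```python
def course_result(exam, assignment_1, assignment_2):
    """Return (final_grade, practical_average, passed) from three grades.

    Weights follow the text: exam 60%, assignment 1 15%, assignment 2 25%.
    Pass None for a task that was not submitted; it then counts as 0.
    """
    exam = 0.0 if exam is None else exam
    a1 = 0.0 if assignment_1 is None else assignment_1
    a2 = 0.0 if assignment_2 is None else assignment_2

    # Weighted average of the two assignments within the 40% practical part.
    practical = (15 * a1 + 25 * a2) / 40
    # Weighted course grade over all three components.
    final = (60 * exam + 15 * a1 + 25 * a2) / 100
    # Both minimum requirements from the text must hold.
    passed = exam >= 5.5 and practical >= 5.5
    return final, practical, passed
```

With an exam grade of 7.0 and assignment grades of 6.0 and 8.0, the practical average is (15 × 6.0 + 25 × 8.0) / 40 = 7.25 and the course grade is 7.1. Because both parts must be at least 5.5, a course grade that meets both requirements is itself at least 5.5.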

The teacher will inform the students how the inspection of and follow-up discussion of the exams will take place.

Reading list

You are expected to read 4 research papers during the course. The papers are announced during the lectures and will be published on Brightspace. The other materials are the course slides and the practical session instructions.

Registration

Applications are handled through EduXchange and open on Wednesday 15 May 2024 at 13:00h.

Application period:

TU Delft, Erasmus and LDE students: 15 May 2024 (at 13.00h) - 31 May 2024

Leiden University students: 15 May 2024 (at 13.00h) - 4 July 2024

More information about the application procedure can be found on this website:

Application procedure

This course can only be followed as part of the Artificial Intelligence, Business and Information minor or as part of Computer Science and Economics.

Contact

Niki van Stein

Remarks

NA