# Research methods in AI

Course
2023-2024

Prerequisite: Statistics for Computer Scientists or a similar statistics course

## Description

If you want to demonstrate that your product is better than other people's, that your algorithm can detect which class someone belongs to, or that your magical pill works, you will need data to show it. And to argue that your findings are not just due to chance, you would probably run some kind of test for statistical significance. But in the search for statistically significant results, things can go wrong.

In this course we will discuss how you can analyse and present your results in a scientific way, which means that they can be trusted and verified by other researchers. We will also talk about what can go wrong when you don’t stick to these good research practices.

The course starts with an introduction to the statistical programming language R, which is widely used among statisticians. We will discuss several statistical methods and ways to visualise data effectively.

Next, we will focus on where things can go wrong. So-called “questionable research practices” are an important reason that many scientific studies cannot be reproduced or replicated. We will explain what these practices (e.g., p-hacking) entail, why researchers might use them, and, importantly, how one can avoid these problems. One solution is to embrace open science, which means sharing your data and analysis code so that others can check what you have done. It is even better to fix and register your analysis plan before collecting your data, which is called preregistration. An alternative is a multiverse analysis, in which you list the ways in which you could analyse your data and then run all of them. You will use simulations and analyses in R to apply these concepts.

In the end, this course should help you think critically about scientific publications and make the right choices in your own research projects.
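
To give a feel for the kind of simulation mentioned above, here is a minimal sketch of how one form of p-hacking, optional stopping (testing repeatedly while data come in and stopping as soon as p < 0.05), inflates the type I error rate. This is an illustration only, not course material: it is written in Python rather than R so that it runs with the standard library alone, it uses a z-test with a known standard deviation as a simplification, and all function names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def z_test_p(sample, sigma=1.0):
    """Two-sided p-value for H0: mean == 0, assuming a known sd `sigma`."""
    z = fmean(sample) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(peek, n_sims=2000, n_max=50, step=10, alpha=0.05, seed=1):
    """Share of simulated studies that reach p < alpha although H0 is true.

    peek=False: test once, after all n_max observations.
    peek=True:  test after every `step` observations and count the study
                as significant if any of those looks gives p < alpha.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # The data come from a standard normal, so the null hypothesis is true.
        data = [rng.gauss(0.0, 1.0) for _ in range(n_max)]
        looks = range(step, n_max + 1, step) if peek else [n_max]
        if any(z_test_p(data[:n]) < alpha for n in looks):
            hits += 1
    return hits / n_sims


honest = false_positive_rate(peek=False)
hacked = false_positive_rate(peek=True)
print(f"single test at n=50:      {honest:.3f}")
print(f"peeking every 10 samples: {hacked:.3f}")
```

Because both runs use the same seed, they analyse the same simulated data sets, so the difference between the two rates comes only from the extra looks at the data.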

## Course objectives

By the end of the course, students can:

• Simulate, analyse, and present data in R

• Discuss the effects of questionable research practices on the type I error rate

• Understand the concepts of reproducibility and replicability

• Apply the principles of open science and perform a multiverse analysis

• Think critically about the scientific effects of their own cognitive biases and those of other researchers

## Timetable

You will find the timetables for all courses and degree programmes of Leiden University in the tool MyTimetable. Any teaching activities that you have successfully registered for in MyStudymap will automatically be displayed in MyTimetable. Any timetables that you add manually will be saved and automatically displayed the next time you sign in.

MyTimetable allows you to integrate your timetable with calendar apps such as Outlook, Google Calendar, and Apple Calendar, as well as other calendar apps on your smartphone. Any timetable changes will be automatically synced with your calendar. If you wish, you can also receive an email notification of the change. You can turn notifications on in ‘Settings’ (after login).

For more information, watch the instruction video or go to the help page in MyTimetable. Please note: Joint Degree students Leiden/Delft have to merge their two different timetables into one. A separate video explains how to do this.

## Mode of instruction

Each week will have a two-hour lecture and a mandatory two-hour work group session.

## Assessment method

The course grade is the weighted average of the theory exam grade (40%) and the assignment grade (60%).

The theory grade (40%) results from a closed-book multiple-choice exam covering the theoretical knowledge discussed in the lectures and work group sessions. Students may retake the exam if their theory grade is below 5.5. If the theory grade is still below 5.5 after the resit, the student needs to retake the theory part of the course (i.e., the lectures).

The assignment grade (60%) is the average grade for two group assignments that require performing simulations and analyses in R and interpreting the results. Presence in the work group meetings is mandatory, because students will work on group projects. Students can compensate for up to two missed sessions by doing extra assignments for those weeks. Students only receive an assignment grade if the attendance requirement is met. There is no resit opportunity for the assignment grade. If the assignment grade is below 5.5 or the attendance requirement is not met, students need to retake the practical part of the course (i.e., work groups + assignments).
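
As a worked illustration of the weighting, assume a hypothetical student with a theory grade of 6.0 and grades of 7.0 and 8.0 for the two group assignments. These numbers are invented for the example, and any rounding rule is not specified here.

$$
\text{assignment grade} = \frac{7.0 + 8.0}{2} = 7.5, \qquad
\text{course grade} = 0.4 \times 6.0 + 0.6 \times 7.5 = 6.9
$$

In this example both the theory grade and the assignment grade are at least 5.5, so neither part of the course would need to be retaken.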

Course material includes slides, exercises, and articles that will be made available via the online course platform.

## Registration

From the academic year 2022-2023 onwards, every student has to register for courses with the new enrolment tool MyStudymap. There are two registration periods per year: registration for the fall semester opens in July and registration for the spring semester opens in December. Please see this page for more information.

Please note that it is compulsory to register for every exam and retake. Not being registered for a course means that you are not allowed to participate in the final exam of the course.

An extensive FAQ on MyStudymap is also available.

## Contact

Education coordinator LIACS bachelors

Not applicable.