Audio Processing and Indexing


Admission requirements

Not applicable.


During this seminar the fundamentals of audio processing and indexing will be studied. Applications in the areas of speech recognition and understanding, audio synthesis, and content-based audio and music retrieval will be discussed. State-of-the-art work on speech recognition, speech synthesis, and content-based audio and music retrieval will be studied and presented by the participants.

The seminar starts with several lectures and accompanying assignments in the form of workshops, followed by literature selection, study, and presentations by all students. It ends with final project demos and presentations.

Course objectives

At the end of the seminar, the student:

  • Is able to explain and apply the fundamental methods of audio processing, audio indexing, speech synthesis, and speech recognition and understanding.

  • Is able to apply basic audio processing algorithms to audio data and analyse and evaluate their performance.

  • Is able to understand, analyse, evaluate, explain and discuss selected scientific research and experiments in the field of science and technology of audio processing, audio retrieval and spoken language processing.

  • Is able to acquire, analyse and evaluate the necessary knowledge of state-of-the-art methods in the field of audio indexing and retrieval by studying scientific publications from journals and proceedings.

  • Is able to design, implement, execute and report on a scientific audio processing or indexing experiment.


The most recent timetable can be found at the Computer Science (MSc) student website.

You will find the timetables for all courses and degree programmes of Leiden University in the tool MyTimetable (login). Any teaching activities that you have successfully registered for in MyStudyMap will automatically be displayed in MyTimetable. Any timetables that you add manually will be saved and automatically displayed the next time you sign in.

MyTimetable allows you to integrate your timetable with calendar apps such as Outlook, Google Calendar and Apple Calendar, including those on your smartphone. Any timetable changes will be automatically synced with your calendar. If you wish, you can also receive an email notification of the change. You can turn notifications on in ‘Settings’ (after login).

For more information, watch the video or go to the 'help-page' in MyTimetable. Please note: Joint Degree students Leiden/Delft have to merge their two different timetables into one. This video explains how to do this.

Mode of instruction

  • Lectures

  • Seminar

  • Workshops

  • Presentations

  • Projects

  • Reports

Course load

Hours of study: 168 (= 6 EC)
Lectures: 26
Practical work: 62
Other: 80

Assessment method

Presentation (20% of the final grade), project (40% of the final grade), and 4 workshops (each 10% of the final grade, totaling 40% of the final grade).
The teacher will inform the students how the inspection and follow-up discussion of the work will take place.

Reading list

Lecture slides and further materials will be made available on the website of the course.

List of recommended books:

  • Fundamentals of Speech Recognition by Lawrence Rabiner and Biing-Hwang Juang (Hardcover, 507 pages; Publisher: Pearson Education POD; ISBN: 0130151572; 1st edition, April 12, 1993)

  • Theory and Applications of Digital Speech Processing by Lawrence Rabiner and Ronald Schafer (Publisher: Pearson; ISBN: 0-13-603428-4; 1st edition, 2011).

  • Automatic Speech Recognition: A Deep Learning Approach (Signals and Communication Technology) by Dong Yu and Li Deng, Springer; 2015 edition (November 11, 2014).

  • Deep Learning for NLP and Speech Recognition by Uday Kamath, John Liu, James Whitaker (Springer, 2019).


From the academic year 2022-2023 onward, every student has to register for courses with the new enrollment tool MyStudyMap. There are two registration periods per year: registration for the fall semester opens in July and registration for the spring semester opens in December. Please see this page for more information.

Please note that it is compulsory to both preregister and confirm your participation for every exam and retake. Not being registered for a course means that you are not allowed to participate in the final exam of the course. Confirming your exam participation is possible until ten days before the exam.

Extensive FAQs on MyStudyMap can be found here.


Lecturer: dr. Erwin M. Bakker
Assistant: To be announced.
Website: Audio Processing and Indexing