Admission requirements
Programming experience in C and C++
Description
This seminar covers the fundamentals of audio processing and indexing. Applications in speech recognition, audio synthesis, and content-based audio retrieval will be discussed. Participants will study and present state-of-the-art work on content-based audio retrieval.
The seminar starts with several lectures and accompanying assignments in the form of workshops. These are followed by literature selection, study, and presentations by all students. The seminar ends with final project demos and presentations.
Course objectives
At the end of the seminar, students:
Have a clear understanding of the fundamentals of audio processing and indexing.
Are able to apply the basic audio processing algorithms to sets of audio files and databases.
Have experienced and studied the general setup of a scientific experiment in the field of audio indexing.
Are able to acquire the necessary knowledge of state-of-the-art scientific methods in the field of audio indexing by studying scientific publications from journals and proceedings.
Are able to design, implement, execute and report on a scientific audio processing or indexing experiment.
Timetable
The most recent timetable can be found on the students' website.
Mode of instruction
Lectures
Seminar
Workshops
Presentations
Projects
Reports
Assessment method
Presentations and project (60% of the grade). Class discussions, attendance, and workshops (40% of the grade).
Reading list
Lecture slides and further materials will be made available on the website of the course.
List of recommended books:
Discrete-Time Speech Signal Processing: Principles and Practice by T.F. Quatieri (Publisher: Prentice Hall PTR; ISBN: 013242942X; 2002)
Fundamentals of Speech Recognition by Lawrence Rabiner and Biing-Hwang Juang (Hardcover, 507 pages; Publisher: Pearson Education POD; ISBN: 0130151572; 1st edition, April 12, 1993)
Spoken Language Processing: A Guide to Theory, Algorithm and System Development by Xuedong Huang, Alex Acero, Hsiao-Wuen Hon, and Raj Reddy (Hardcover, 980 pages; Publisher: Prentice Hall PTR; ISBN: 0130226165; 1st edition, April 25, 2001)
Speech Recognition: Theory and C++ Implementation by Claudio Becchetti and Lucio Prina Ricotti (Hardcover, 407 pages; Publisher: John Wiley & Sons; ISBN: 0471977306; 1st edition, April 1999)
Registration
You have to sign up for classes and examinations (including resits) in uSis. Check this link for more information and activity codes.
Contact information
Lecturer: Dr. Erwin Bakker
Website: Audio Processing and Indexing