Speech Recognition
Course#: CSCI-GA.3033-015
Instructor: Eugene Weinstein
Grader/TA: Phil Gross
Mailing List
Site/slides credit: Mehryar Mohri

Description

This course gives a computer science presentation of automatic speech recognition, the problem of turning human speech into written transcripts. Over the course of the semester, this class will cover many of the essential algorithms for creating large-scale speech recognition systems. The coverage will include the theoretical and practical aspects of the algorithms and techniques now used in the vast majority of speech recognition systems in both industry and academia. Besides covering classical speech recognition algorithms developed over the past several decades, the course material will also treat a sampling of recent developments in this dynamic field.

Many of the learning and search algorithms and techniques currently used in natural language processing, computational biology, and other areas of application of machine learning were originally designed for tackling speech recognition problems. Speech recognition continues to feed computer science with challenging problems, in particular because of the size of the learning and search problems it generates.

The objective of the course is thus not just to familiarize students with particular algorithms used in speech recognition, but to use them as a basis for exploring general text, speech, and machine learning algorithms relevant to a variety of other areas in computer science. The course will allow students to work with real speech recognition systems by making use of several software libraries implementing the algorithms covered in the lectures.

This course is also open to undergraduate students.


Lectures

The following lecture plan roughly covers the topics planned for the course. This list is subject to revision as the semester progresses.


Reading and Software Material

There is no single textbook covering the material presented in this course. The following are some recommended books or papers. An extensive list of recommended papers for further reading is provided in the lecture slides.

Books

Papers

Software

Locations and Times

Room 512 Warren Weaver Hall,
251 Mercer Street.
Thursdays 5:10 PM - 7:00 PM.
Instructor office hours: Thursdays 7:00 PM - 8:00 PM, WWH Room 328.
Final Project Presentations: Thursday, December 19th, WWH Room 1314.


Prerequisites

Familiarity with the basics of linear algebra, probability, and analysis of algorithms is expected. No specific knowledge of signal processing or other engineering material is assumed. An interest or background in machine learning is helpful.

A working familiarity with Linux or other shell-based development environments will be necessary to complete the homework assignments. The assignments and the project will involve working with multiple software libraries with varying degrees of user-friendliness. Students will be expected to work independently through software installation issues and similar challenges in the course of completing the assignments.


Coursework

There will be 3-4 assignments and a final project.

The standard high level of integrity is expected from all students, as with all CS courses.


Homework assignments


Previous years