CSCI 780/3813: Spoken Language Processing

Fall 2011
Monday Wednesday 3:05pm - 4:20pm
NSB B141
Instructor: Andrew Rosenberg (andrew_at_cs.qc.cuny.edu)
Office Hours: Wednesday 12-1pm NSB A330

Overview

This class will teach the foundations of spoken language processing. We will build from the fundamentals of modern speech recognition and speech synthesis. This work will be motivated by spoken dialog systems and conversational agents as examples of machine speech perception and production.
By the end of the semester, students will have a basic knowledge of the core problems and the state of the art in spoken language processing, and be prepared to participate in a semester-long spoken language processing seminar course.

Class Policy

Come to Class. It will be difficult to do well in the class without regular attendance. There is no additional penalty for missing class.

Cell phones must be on silent, and are not to be checked or used during class - if you are expecting an urgent call, tell the instructor at the start of class.

Laptops are fine for taking notes. No internet, no chat, no games.

Cell phone and laptop policy: one warning, then 5 points off the next homework or exam for each violation. The same policy applies to the instructor: one warning, then everyone gets 5 points on the next homework or exam.

Grading Policy

Midterm: 25%

Project in 5 parts: 75% (35% final project, plus 10% per milestone assignment)
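
As an illustration of how these weights combine, here is a minimal Python sketch of the Final Numeric Grade. It assumes four milestone assignments, since 75% minus the 35% final project leaves 40% at 10% each; that count is inferred, not stated above. The function name and arguments are hypothetical, and the scaled adjustment to a letter grade is not modeled because the scale is determined later.

```python
# Weights in whole percentage points, taken from the Grading Policy.
MIDTERM_WEIGHT = 25
FINAL_PROJECT_WEIGHT = 35
MILESTONE_WEIGHT = 10
NUM_MILESTONES = 4  # assumption: (75 - 35) / 10 milestone assignments


def final_numeric_grade(midterm, final_project, milestones):
    """Combine scores (each out of 100) into a weighted grade out of 100."""
    if len(milestones) != NUM_MILESTONES:
        raise ValueError(f"expected {NUM_MILESTONES} milestone scores")
    weighted_sum = (
        MIDTERM_WEIGHT * midterm
        + FINAL_PROJECT_WEIGHT * final_project
        + MILESTONE_WEIGHT * sum(milestones)
    )
    return weighted_sum / 100


if __name__ == "__main__":
    # Example: 80 on the midterm, 90 on the final project, full marks on milestones.
    print(final_numeric_grade(80, 90, [100, 100, 100, 100]))
```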

The Final Letter Grade will be based on a scaled adjustment of the Final Numeric Grade. When the scale has been determined, the class will be informed either in class or over email, and it will be posted to the course webpage (here).

Assignment Policy

Do not cheat. You may discuss assignments with your classmates, but write or program your assignment alone. Do not ask for or offer to share code, or written assignments. If you discuss an assignment with a classmate, or on an online forum, include the name of the classmate or URL of the forum on your assignment or in the documentation of your code. The first instance of cheating results in an automatic zero (F) for the course. The Computer Science Department will be notified in writing of all instances of cheating and a report will be submitted to the Office of Academic Integrity.

Assignments will be posted to the website (here) after class on Tuesdays.

All assignments will be scored out of 100 points.

Assignments will be due just after the start of class, 3:10pm. Written assignments should be emailed or hard-copies should be delivered in class.

Deliver assignments at the start of class or email with a timestamp before 3:10pm to avoid a late penalty. If an extension is needed, let me know as soon as possible. I will do my best to be reasonable to you and fair to the rest of the class. An assignment delivered by a student who arrives more than 5 minutes late for class will be considered Late. There is a 5 point Late Penalty for each 12 hours the assignment is late:
Due Date (DD) 3:10pm - DD+1 3:10am: -5 points
DD+1 3:10am - DD+1 3:10pm: -10 points
DD+1 3:10pm - DD+2 3:10am: -15 points
DD+2 3:10am - DD+2 3:10pm: -20 points
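
The late penalty schedule can be sketched in Python as follows. This is an illustration, not an official grading tool: the function name is hypothetical, each 12-hour window is assumed to include its start time and exclude its end time, and submissions past the last listed window return None rather than a penalty.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=12)   # length of one penalty window
POINTS_PER_WINDOW = 5          # points deducted per window
MAX_WINDOWS = 4                # the schedule lists four windows (up to DD+2 3:10pm)


def late_penalty(due, submitted):
    """Return points deducted, or None if past the last listed window."""
    if submitted < due:
        return 0  # timestamp before the deadline: no penalty
    # Assumption: a submission exactly on a boundary falls in the later window.
    windows = (submitted - due) // WINDOW + 1
    if windows > MAX_WINDOWS:
        return None  # outside the published schedule
    return POINTS_PER_WINDOW * windows


if __name__ == "__main__":
    due = datetime(2011, 9, 26, 15, 10)  # example due date at 3:10pm
    print(late_penalty(due, datetime(2011, 9, 27, 10, 0)))
```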

After 6:30, 2 days after an assignment was due, no assignments will be accepted.

After each assignment and the midterm are graded, anonymous mean, median, maximum and minimum scores will be distributed to the class.

Coding Assignments

Assignments will be written in C++, Java, or Python and must compile using g++ on venus.cs.qc.cuny.edu with no unsubmitted external libraries.

In general, grading will be 15% Compilation, 15% Execution, 35% Correctness (15% passing tests, 20% implementation details), 35% Documentation and Style. This may be adjusted for some assignments. Always read the assignment for the grading breakdown.

Testing will be performed automatically. Sample tests will be delivered with each assignment. If code does not operate using the published and distributed testing format, the assignment will be considered Incorrect and zero "Correctness" points will be awarded.
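
To make the general breakdown concrete, here is a rough Python sketch that turns per-component results into a score out of 100. The component names, the fraction-earned inputs, and the uses_test_format flag are all hypothetical; the weights are the general ones above, and individual assignments may use different ones.

```python
# General weights (points out of 100); specific assignments may differ.
WEIGHTS = {
    "compilation": 15,
    "execution": 15,
    "passing_tests": 15,           # Correctness, part 1
    "implementation_details": 20,  # Correctness, part 2
    "documentation_style": 35,
}
CORRECTNESS = {"passing_tests", "implementation_details"}


def coding_score(fractions, uses_test_format=True):
    """fractions maps each component name to the fraction earned, 0.0 to 1.0."""
    score = 0.0
    for name, weight in WEIGHTS.items():
        fraction = fractions[name]
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
        # Code that ignores the distributed testing format earns
        # zero Correctness points.
        if not uses_test_format and name in CORRECTNESS:
            fraction = 0.0
        score += weight * fraction
    return score


if __name__ == "__main__":
    full_marks = {name: 1.0 for name in WEIGHTS}
    print(coding_score(full_marks, uses_test_format=False))
```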

Detailed requirements will accompany each assignment. The instructions and requirements on a particular assignment always take precedence over the general guidelines on the course website.

Submission of coding assignments should be performed over email. Don't forget to attach your files. Submitting multiple times is fine. The latest assignment submitted on time will be graded. If you submit an assignment late, after submitting an assignment on time, you must let me know, via email, that you would like the late submission graded for the assignment.

Written Assignments

Written Assignments can be delivered electronically by email, or hard copies can be delivered by hand either in class, or dropped at my office NSB A330.

Electronic copies must be in one of the following formats: .pdf, Microsoft Word .doc, Google Docs, OpenOffice.

Handwritten copies are acceptable, but be very careful that the work is clear. If I can't read that an answer is correct, it is wrong.

Points for each question will be described in each assignment.

Exam Policy

The Midterm will be held during class on November 2nd. If you will not be able to make this date, let me know as early as possible, and I will do my best to schedule another time for you to take the Midterm.

Text Book

Speech and Language Processing, Authors: Jurafsky and Martin, Publisher: Prentice Hall, Edition: 2, ISBN: 978-0131873216

This should be available through the bookstore, but may be found through other outlets at a discount.

Schedule

Date Material Assignments
August 29 No Class.
August 31 No Class.
September 5 No Class - Labor Day
September 7 Course Overview
September 12 From Sounds to Language Homework 1 Assigned
Text file with nonsense IPA symbols
September 14 Spoken Dialog Systems
September 19 Acoustics of Speech
September 21 Speech Recognition Overview
September 26 Fast Fourier Transform Homework 1 Due. Homework 2 Assigned.
September 28 No Class - Rosh Hashanah
October 3 Speech Representations - MFCC
October 5 Introduction to Statistical Modeling and Machine Learning
October 10 No Class - Columbus Day
October 12 Acoustic modeling - Gaussian Mixture Model
October 17 Sequential modeling - Hidden Markov Model Homework 2 Due.
October 19 Pronunciation Modeling Homework 3 Assigned
October 24 Language Modeling - N-grams
October 26 Human Speech Perception
October 31 Slack day and Midterm review
November 2 In-class Midterm Exam Homework 3 Due.
November 7 Introduction to Prosody
November 9 "Rich" Transcription Homework 4 Assigned.
November 14 Modeling Prosody
November 16 Modeling Prosody Part II.
November 21 Introduction to Speech Synthesis
Guest Lecture by Raul Fernandez (IBM)
November 23 Predicting Prosody from Text Homework 4 Due.
November 28 Text Normalization Homework 5 Assigned.
November 30 User Interaction
December 5 User Interaction ctd.
December 7 Machine Learning in Spoken Language Processing Homework 5 Due. Final Project Writeup Description
December 12 Spoken Dialog System Working Session
December 21 @ 1:45 - 3:45 Spoken Dialog System Demos. Note different start time. Final Project Due. [No extensions will be granted]