Machine Learning and Pattern Recognition: Schedule

This page contains the schedule, slides from the lectures, lecture notes, reading lists, assignments, and web links.

I urge you to download the DjVu viewer and view the DjVu version of the documents below. They display faster, are of higher quality, and generally have smaller file sizes than the PS and PDF versions.

Full-text search is provided for the entire collection of slides and papers. Click here to search

01/21: Introduction and basic concepts

Subjects treated: Intro, types of learning, nearest neighbor, how biology does it, linear classifier, perceptron learning procedure, linear regression, training/test, capacity, overfitting, regularization, Occam's Razor, MDL.

Slides: [DjVu | PDF | PS]

Required Reading:

  • Hastie/Tibshirani/Friedman: Chapter 2

Optional Reading:

  • Refresher on random variables and probabilities by Andrew Moore: (slides 1-27) [DjVu | PDF]
  • Refresher on joint probabilities, Bayes theorem by Chris Williams: [DjVu | PDF]
  • Refresher on statistics and probabilities by Sam Roweis: [DjVu | PS]
  • If you are interested in the early history of self-organizing systems and cybernetics, have a look at this book available from the Internet Archive's Million Book Project: Self-Organizing Systems, proceedings of a 1959 conference edited by Yovits and Cameron (DjVu viewer required for full text).

01/28: Probability Theory, Bayes Inversion, Bayes Decision Rule

Subjects treated: Refreshers on probability theory. Bayes decision rule, naive Bayes classifier, logistic regression. Refresher on multivariate calculus and optimization.


Required Reading:

  • Hastie/Tibshirani/Friedman: Sections 3.1, 3.2, 3.4.1 to 3.4.3, 4.1, 4.2, 4.4, 4.5

Optional Reading:

  • Duda/Hart/Stork: Sections 5.1 to 5.8
  • Bishop: Chapter 3
  • Hastie/Tibshirani/Friedman: Sections 3.3, 4.3
  • Paper on logistic regression by Michael Jordan: [DjVu | PS]

02/04: MLE, MAP, Energy Functions

Subjects treated: Bayesian Estimation, Maximum Likelihood Estimation, MAP Estimation, Loss Functions and Energy-Based models. Probability, Entropy, Energy, and Free Energy. Introduction to Lush.

Slides: [DjVu | PDF | PS]

Required Reading:

Optional Reading:

  • Tutorial on Lush: here

Homework Assignments: (see next lecture)

02/11: Gradient-Based Learning I: Beyond Linear Classifiers

Subjects treated: Intro to Gradient-Based Learning. Limitations of linear classifiers. Basis function expansion, polynomial classifiers, kernel expansion, RBF Networks, Simple multi-layer neural nets. Optimization and the convergence of gradient-based learning.


Required Reading:

  • Gradient-based Learning Applied to Document Recognition by LeCun, Bottou, Bengio, and Haffner, pages 1-5 (Introduction): [ DjVu | .ps.gz ]
  • Efficient Backprop, by LeCun, Bottou, Orr, and Muller, Sections 1-5: [ DjVu | .ps.gz ]

Homework Assignments: implementing the Perceptron Algorithm, MSE Classifier (linear regression), and Logistic Regression. Details and datasets below:

  • Download this tar.gz archive. It contains the datasets and the homework description.
  • Decompress it with "tar xvfz homework-01.tgz" on Unix/Linux or with Winzip in Windows.
  • The file homework01.txt contains the questions and instructions.
  • Most of the necessary Lush code is provided.
  • Due Date is Wednesday March 3, before the lecture.
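
As a rough illustration of the first algorithm named in this assignment, here is a minimal perceptron learning sketch. It is written in Python rather than Lush and is not the provided homework code; the function names, the toy dataset, and the learning rate are all assumptions made for the example.

```python
# Minimal perceptron sketch (illustrative only; not the course's Lush code).

def train_perceptron(samples, labels, epochs=100, lr=1.0):
    """samples: list of feature lists; labels: +1 or -1 for each sample."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified, or exactly on the boundary
                # Perceptron rule: move the hyperplane toward the sample.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly: stop early
            break
    return w, b


def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Hypothetical toy data: logical AND with labels in {-1, +1} (linearly separable).
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [-1, -1, -1, 1]
w, b = train_perceptron(X, Y)
predictions = [predict(w, b, x) for x in X]
```

On linearly separable data such as this toy set, the loop stops as soon as one full pass makes no mistakes. On non-separable data it simply runs for the given number of epochs.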

02/18: Gradient-Based Learning II: Multilayer Networks and Back-Propagation

Subjects treated: Multi-Module learning machines. Vector modules and switches. Multilayer neural nets. Backpropagation Learning.


02/25: Gradient-Based Learning III: Special Architectures

Subjects treated: Special architectures: RBF nets, mixtures of experts, parameter-space transforms. Implementation and practical issues with multi-module/multi-layer learning machines. Intro to convolutional nets.


Required Reading:
  • Convolutional nets: "Gradient-based Learning Applied to Document Recognition" by LeCun, Bottou, Bengio, and Haffner, pages 5-18 (up to and including section IV-B ): [ DjVu | .ps.gz ]
  • On the Lagrangian formulation of gradient-based learning: "A theoretical framework for back-propagation": [ DjVu | .ps.gz ]
  • Efficient Backprop, by LeCun, Bottou, Orr, and Muller, Sections 6-end: [ DjVu | .ps.gz ]

Optional Reading:

  • Multimodule Approach and Lagrangian formulation: "A framework for the cooperation of learning algorithms" by Bottou and Gallinari: DjVu.

03/03: Convolutional Nets. Cross-Validation, Model Selection, Learning Theory

Subjects treated: Invariant Recognition, Feature Learning, Convolutional Networks and Time-Delay Neural Nets. Model Selection, Cross-Validation, VC-dimension, Structural Risk Minimization, Bagging.


Homework Assignments: implementing Gradient-Based Learning and back-propagation. You must implement gradient-based learning using the object-oriented, module-based approach as described in class. Various architectures, including a multilayer neural net, must be implemented and tested on two datasets.

  • Download this tar.gz archive. It contains the datasets and the homework description.
  • Decompress it with "tar xvfz homework-02.tgz" on Unix/Linux or with Winzip in Windows.
  • The file homework-02.txt contains the questions and instructions.
  • Most of the necessary Lush code is provided.
  • Due Date is Friday April 2 (NEW NEW DATE!).
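
To give a flavor of what a module-based design can look like, here is a small sketch in Python (not Lush, and not the provided homework code). Each module exposes a forward pass and a backward pass, and a container chains them. The class names, the method names fprop, bprop, and update, the network sizes, and the single training sample are all assumptions chosen for illustration.

```python
# Module-based gradient learning sketch (illustrative; names are assumptions).
import math
import random


class Linear:
    """y = W x + b. bprop stores parameter gradients and returns dL/dx."""
    def __init__(self, n_in, n_out, rng):
        self.W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                  for _ in range(n_out)]
        self.b = [0.0] * n_out

    def fprop(self, x):
        self.x = x  # cache the input for the backward pass
        return [sum(w * xi for w, xi in zip(row, x)) + bi
                for row, bi in zip(self.W, self.b)]

    def bprop(self, dy):
        self.dW = [[d * xi for xi in self.x] for d in dy]
        self.db = list(dy)
        return [sum(self.W[j][i] * dy[j] for j in range(len(dy)))
                for i in range(len(self.x))]

    def update(self, lr):
        for j, row in enumerate(self.W):
            for i in range(len(row)):
                row[i] -= lr * self.dW[j][i]
            self.b[j] -= lr * self.db[j]


class Tanh:
    """Element-wise tanh; it has no parameters to update."""
    def fprop(self, x):
        self.y = [math.tanh(v) for v in x]
        return self.y

    def bprop(self, dy):
        return [d * (1.0 - y * y) for d, y in zip(dy, self.y)]

    def update(self, lr):
        pass


class Sequential:
    """Chains modules: fprop runs left to right, bprop right to left."""
    def __init__(self, modules):
        self.modules = modules

    def fprop(self, x):
        for m in self.modules:
            x = m.fprop(x)
        return x

    def bprop(self, dy):
        for m in reversed(self.modules):
            dy = m.bprop(dy)
        return dy

    def update(self, lr):
        for m in self.modules:
            m.update(lr)


def mse(out, target):
    """Returns the loss 0.5 * ||out - target||^2 and its gradient w.r.t. out."""
    loss = 0.5 * sum((o - t) ** 2 for o, t in zip(out, target))
    return loss, [o - t for o, t in zip(out, target)]


rng = random.Random(0)
net = Sequential([Linear(2, 3, rng), Tanh(), Linear(3, 1, rng)])
x, target = [0.5, -1.0], [0.25]  # one hypothetical training sample

loss_before, _ = mse(net.fprop(x), target)
for _ in range(50):  # plain gradient descent on this single sample
    loss, grad = mse(net.fprop(x), target)
    net.bprop(grad)
    net.update(0.1)
loss_after, _ = mse(net.fprop(x), target)
```

Because every module follows the same fprop/bprop/update interface, a new architecture is just a different list passed to Sequential. A finite-difference comparison against the stored gradients is a common way to check a bprop implementation.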

03/10: Unsupervised Learning

Subjects treated: Unsupervised Learning: Principal Component Analysis. Density Estimation: Parzen Windows, Mixtures of Gaussians, Auto-Encoders. Latent variables and the Expectation-Maximization algorithm.



Spring break: NO CLASS.

03/24: Guest Lecture by Prof. Lawrence Saul: Dimensionality Reduction

Subjects treated: Non-Linear Dimensionality Reduction and Embedding. Guest lecture by Prof. Lawrence Saul (University of Pennsylvania):


  • L. Saul's Lecture Slides on non-linear dimensionality reduction (caution: the PS and the PDF are over 25MB, the DjVu is 2MB): [DjVu | PDF | PS]

Required Reading: (please read this before the class)

  • L. K. Saul and S. T. Roweis (2003). Think globally, fit locally: unsupervised learning of low dimensional manifolds. Journal of Machine Learning Research 4:119-155. [PDF].

Optional Reading:

03/31: Efficient Optimization, Latent Variables, Graph Transformer Networks

Subjects treated: Efficient learning: conjugate gradient, Levenberg-Marquardt. Lagrange Multipliers and Constrained Optimization.

More on latent variables and EM.

Modeling distributions over sequences. Learning machines that manipulate graphs. Finite-state transducers. Graph Transformer Networks.

Required Reading:

Homework Assignments: Homework 03: K-Means and Mixture of Gaussians estimation with EM.

  • The subject of this homework is to implement the K-means algorithm and the Expectation-Maximization algorithm for a Mixture of Gaussians model. The algorithms must be tested on image data for a simulated image compression task.
  • Download this tar.gz archive. It contains the datasets and the homework description.
  • Decompress it with "tar xvfz homework-03.tgz" on Unix/Linux or with Winzip in Windows.
  • The file homework-03.txt contains the questions and instructions.
  • DUE DATE: Friday April 16
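
As a hedged illustration of the K-means half of this assignment, the sketch below clusters a handful of gray levels and replaces each one with its nearest center, which is the basic idea behind compression by vector quantization. It is written in Python rather than Lush and is not the homework code. The pixel values, the function names, and the evenly spaced initialization are assumptions for the example. The homework itself uses real image data and also covers EM for a Mixture of Gaussians, which is not shown here.

```python
# K-means sketch on 1-D gray levels (illustrative; not the homework code).

def kmeans_1d(values, k, iters=20):
    # Assumed deterministic initialization: spread centers over the value range.
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        # Assignment step: each value joins the cluster of its nearest center.
        groups = [[] for _ in range(k)]
        for v in values:
            j = min(range(k), key=lambda c: (v - centers[c]) ** 2)
            groups[j].append(v)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [sum(g) / len(g) if g else c
                       for g, c in zip(groups, centers)]
        if new_centers == centers:  # assignments are stable: converged
            break
        centers = new_centers
    return centers


def quantize(values, centers):
    """Replace each value by its nearest center (the 'compressed' signal)."""
    return [min(centers, key=lambda c: (v - c) ** 2) for v in values]


# Hypothetical gray levels forming three obvious groups.
pixels = [12, 15, 10, 200, 210, 205, 100, 98, 103]
centers = kmeans_1d(pixels, k=3)
compressed = quantize(pixels, centers)
```

After quantization only k distinct values remain, so each pixel can be stored as a small index into the list of centers instead of a full gray level.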

04/07: Boosting and Support Vector Machines

This lecture will be given by Prof. Dan Melamed.

Subjects treated: Boosting, and Ensemble Methods. Maximum Margin Classifiers. Support Vector Machines, Kernel Machines.

Homework Assignments: Final Project

  • A list of possible project topics is available here. Make a proposal (send an email message to me and to the TA).
  • This project will count for a lot in the final grade.
  • Collaboration: you can do your final project in groups of two students.
  • Due Date: Friday, May 14. Extensions may be granted for ambitious projects by students who are not graduating this year. If you intend to graduate this year, you must submit your project by the due date.

04/14: Hidden Markov Models

Subjects treated: Probabilistic Automata, Distribution over Sequences, Hidden Markov Models, Inference: Forward-Backward Algorithm, Learning: Expectation-Maximization algorithm.


04/21: Graphical Models, Belief Propagation

Subjects treated: Intro to graphical models, Inference, Belief Propagation, Boltzmann Machines.

Required Reading:

04/28: Learning, Sampling, and Energy-Based Models

Subjects treated: Learning in Graphical Models; Approximate Inference and Sampling, Markov-Chain Monte-Carlo, Hybrid Monte-Carlo; Energy-Based Models, Contrastive Divergence.

Required Reading: