
This page contains the schedule, slides from the lectures, lecture notes, reading lists,
assignments, and web links.
I urge you to download the DjVu viewer
and view the DjVu versions of the documents below. They display faster,
are higher quality, and generally have smaller file sizes than the PS and PDF versions.
Full-text search is provided for the entire
collection of slides and papers. Click here to search
01/21: Introduction and basic concepts 
Subjects treated: Intro, types of learning, nearest neighbor, how biology does it,
linear classifier, perceptron learning procedure, linear regression,
training/test, capacity, overfitting, regularization, Occam's Razor, MDL.
Slides: [DjVu  PDF  PS]
Required Reading:
 Hastie/Tibshirani/Friedman: Chapter 2
Optional Reading:
 Refresher on random variables and probabilities by
Andrew Moore: (slides 127) [DjVu  PDF]
 Refresher on joint probabilities, Bayes theorem by
Chris Williams: [DjVu  PDF]
 Refresher on statistics and probabilities by
Sam Roweis: [DjVu  PS]
 If you are interested in the early history of self-organizing
systems and cybernetics, have a look at this book available from the
Internet Archive's Million Book Project: Self-Organizing
Systems, proceedings of a 1959 conference edited by Yovits and
Cameron (DjVu viewer required for full text).
01/28: Probability Theory, Bayes Inversion, Bayes Decision Rule 
Subjects treated: Refreshers on probability theory.
Bayes decision rule, naive Bayes classifier, logistic regression.
Refresher on multivariate calculus and optimization.
Slides
Required Reading:
 Hastie/Tibshirani/Friedman: Sections 3.1, 3.2, 3.4.1 to 3.4.3, 4.1, 4.2, 4.4, 4.5
Optional Reading:
 Duda/Hart/Stork: Sections 5.1 to 5.8
 Bishop: Chapter 3
 Hastie/Tibshirani/Friedman: Sections 3.3, 4.3
 Paper on logistic regression by
Michael Jordan: [DjVu  PS]
02/04: MLE, MAP, Energy Functions 
Subjects treated: Bayesian Estimation, Maximum Likelihood
Estimation, MAP Estimation, Loss Functions and Energy-Based Models.
Probability, Entropy, Energy, and Free Energy. Introduction to Lush.
Slides: [DjVu  PDF  PS]
Required Reading:
Optional Reading:
Homework Assignments: (see next lecture)
02/11: Gradient-Based Learning I: Beyond Linear Classifiers 
Subjects treated: Intro to Gradient-Based Learning.
Limitations of linear classifiers. Basis function expansion, polynomial classifiers,
kernel expansion, RBF Networks, Simple multilayer neural nets.
Optimization and the convergence of gradient-based learning.
Slides:
Required Reading:
 Gradient-Based Learning Applied to Document Recognition by LeCun,
Bottou, Bengio, and Haffner, pages 1-5 (Introduction):
[ DjVu  .ps.gz ]
 Efficient Backprop, by LeCun, Bottou, Orr, and Muller, Sections 1-5:
[ DjVu  .ps.gz ]
Homework Assignments: implementing the Perceptron
Algorithm, MSE Classifier (linear regression), and Logistic Regression.
Details and datasets below:
 Download this tar.gz archive. It
contains the datasets and the homework description.
 Decompress it with "tar xvfz homework01.tgz" on Unix/Linux or
with WinZip on Windows.
 The file homework01.txt contains the questions and instructions.
 Most of the necessary Lush code is provided.
 Due Date is Wednesday March 3, before the lecture.
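For intuition, the first task above is the classical perceptron learning procedure: predict with the current linear classifier and apply an additive correction whenever a sample is misclassified. The actual assignment uses the provided Lush code; the following Python sketch is purely illustrative, and the function name and toy data are made up.

```python
# Minimal sketch of the perceptron learning procedure (illustrative only;
# the homework itself is done with the provided Lush code).
def perceptron_train(samples, labels, epochs=10):
    """samples: list of feature tuples; labels: +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Compute the activation of the current linear classifier.
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified: perceptron update
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b

# Toy linearly separable data: the class is the sign of the first coordinate.
X = [(2.0, 1.0), (1.5, -0.5), (-1.0, 0.5), (-2.0, -1.0)]
y = [1, 1, -1, -1]
w, b = perceptron_train(X, y)
# All training samples should now be on the correct side of the hyperplane.
print(all((sum(wi * xi for wi, xi in zip(w, x)) + b) * yi > 0
          for x, yi in zip(X, y)))  # prints True
```

Since the data is linearly separable, the perceptron convergence theorem guarantees the loop stops making updates after finitely many corrections.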
02/18: Gradient-Based Learning II: Multilayer Networks and Back-Propagation 
Subjects treated: Multi-module learning machines. Vector
modules and switches. Multilayer neural nets. Backpropagation
Learning.
Slides:
02/25: Gradient-Based Learning III: Special Architectures 
Subjects treated: Special architectures: RBF nets, mixtures
of experts, parameter-space transforms. Implementation and practical
issues with multi-module/multi-layer learning machines. Intro to
convolutional nets.
Slides:
Required Reading:
 Convolutional nets: "Gradient-Based Learning Applied to Document Recognition" by LeCun,
Bottou, Bengio, and Haffner, pages 5-18 (up to and including Section IV-B):
[ DjVu  .ps.gz ]
 On the Lagrangian formulation of gradient-based learning:
"A theoretical framework for back-propagation":
[ DjVu  .ps.gz ]
 Efficient Backprop, by LeCun, Bottou, Orr, and Muller, Section 6 to the end:
[ DjVu  .ps.gz ]
Optional Reading:
 Multi-module approach and Lagrangian formulation: "A Framework for
the Cooperation of Learning Algorithms" by Bottou and Gallinari:
DjVu.
03/03: Convolutional Nets. Cross-Validation, Model Selection, Learning Theory 
Subjects treated: Invariant Recognition, Feature Learning,
Convolutional Networks and Time-Delay Neural Nets.
Model Selection, Cross-Validation, VC Dimension,
Structural Risk Minimization, Bagging.
Slides:
Homework Assignments: implementing gradient-based learning
and backpropagation. You must implement gradient-based learning using
the object-oriented, module-based approach as described in class.
Various architectures, including a multilayer neural net, must be
implemented and tested on two datasets.
 Download this tar.gz archive. It
contains the datasets and the homework description.
 Decompress it with "tar xvfz homework02.tgz" on Unix/Linux or
with WinZip on Windows.
 The file homework02.txt contains
the questions and instructions.
 Most of the necessary Lush code is provided.
 Due Date is Friday April 2 (NEW NEW DATE!).
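For intuition, in the object-oriented, module-based style described in class, each module exposes a forward pass and a backward pass that propagates gradients to its inputs and parameters. The homework itself uses the provided Lush code; this Python sketch is only an illustration, and the class and method names (fprop/bprop, scalar parameters) are simplifying assumptions.

```python
# Illustrative sketch of module-based gradient learning (not the actual
# Lush homework code): each module implements fprop and bprop.
class Linear:
    def __init__(self, w, b):
        self.w, self.b = w, b          # scalar weight and bias for simplicity
    def fprop(self, x):
        self.x = x                     # remember the input for bprop
        return self.w * x + self.b
    def bprop(self, dout):
        # Accumulate parameter gradients; return gradient w.r.t. the input.
        self.dw = dout * self.x
        self.db = dout
        return dout * self.w

class MSELoss:
    def fprop(self, y, t):
        self.y, self.t = y, t
        return 0.5 * (y - t) ** 2
    def bprop(self):
        return self.y - self.t         # dE/dy

# One gradient-descent step on a single (input, target) pair.
lin, loss = Linear(w=2.0, b=0.0), MSELoss()
x, target, lr = 1.0, 3.0, 0.1
e = loss.fprop(lin.fprop(x), target)   # forward pass: e = 0.5*(2-3)^2 = 0.5
lin.bprop(loss.bprop())                # backward pass fills lin.dw, lin.db
lin.w -= lr * lin.dw
lin.b -= lr * lin.db
print(round(lin.w, 2), round(lin.b, 2))  # prints 2.1 0.1
```

Chaining more such modules gives backpropagation: each bprop consumes the gradient from the module above and hands one down to the module below.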
03/10: Unsupervised Learning 
Subjects treated: Unsupervised Learning: Principal Component
Analysis. Density Estimation: Parzen Windows, Mixtures of Gaussians,
Auto-Encoders. Latent variables and the Expectation-Maximization algorithm.
Slides:
Spring break: NO CLASS.
03/24: Guest Lecture by Prof. Lawrence Saul: Dimensionality Reduction 
Subjects treated: Non-Linear Dimensionality Reduction and
Embedding. Guest lecture by Prof. Lawrence Saul
from the University of Pennsylvania.
Slides:
 L. Saul's lecture slides on non-linear dimensionality reduction
(caution: the PS and PDF are over 25MB; the DjVu is 2MB): [DjVu  PDF  PS]
Required Reading: (please read this before the class)
 L. K. Saul and S. T. Roweis (2003). Think globally, fit locally:
unsupervised learning of low dimensional manifolds.
Journal of Machine Learning Research 4:119-155.
[PDF].
Optional Reading:
03/31: Efficient Optimization, Latent Variables, Graph Transformer Networks 
Subjects treated:
Efficient learning: conjugate gradient, Levenberg-Marquardt.
Lagrange Multipliers and Constrained Optimization.
More on latent variables and EM.
Modeling distributions over sequences. Learning machines that
manipulate graphs. Finite-state transducers. Graph Transformer
Networks.
Required Reading:
Homework Assignments: Homework 03: K-Means and Mixture-of-Gaussians estimation with EM.
 The subject of this homework is to implement the K-means algorithm
and the Expectation-Maximization algorithm for a Mixture of Gaussians model.
The algorithms must be tested on image data for a simulated image
compression task.
 Download this tar.gz archive. It
contains the datasets and the homework description.
 Decompress it with "tar xvfz homework03.tgz" on Unix/Linux or
with WinZip on Windows.
 The file homework03.txt contains
the questions and instructions.
 DUE DATE: Friday April 16
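For intuition, K-means alternates two steps: assign each point to its nearest center, then move each center to the mean of its assigned points. The homework applies this to image data with the provided code; the following 1-D Python sketch is illustrative only, and the function name and toy data are made up.

```python
# Minimal 1-D K-means sketch (illustrative only; the homework applies
# the algorithm to image data).
def kmeans(points, centers, iters=10):
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda j: (p - centers[j]) ** 2)
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster
        # (keep an empty cluster's center where it is).
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers

# Two well-separated clumps; the centers converge to the clump means.
pts = [0.0, 0.2, 0.4, 9.8, 10.0, 10.2]
print(sorted(kmeans(pts, [0.1, 5.0])))
```

EM for a Mixture of Gaussians generalizes this: the hard assignment step becomes a soft posterior (responsibility) computation, and the update step re-estimates means, covariances, and mixing weights from those responsibilities.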
04/07: Boosting and Support Vector Machines 
This lecture will be given by Prof. Dan Melamed.
Subjects treated: Boosting, and
Ensemble Methods. Maximum Margin Classifiers.
Support Vector Machines, Kernel Machines.
Homework Assignments: Final Project
 A list of possible project topics is
available here.
Make a proposal (send an email message to me and
to the TA).
 This project will count heavily toward the final grade.
 Collaboration: you can do your final project in groups of two students.
 Due Date: Friday, May 14. Extensions may be granted for
ambitious projects by students who are not graduating this year.
If you intend to graduate this year, you must turn in your
project by the due date.
04/14: Hidden Markov Models 
Subjects treated: Probabilistic Automata, Distributions
over Sequences, Hidden Markov Models, Inference: Forward-Backward
Algorithm, Learning: Expectation-Maximization Algorithm.
Reading:
04/21: Graphical Models, Belief Propagation 
Subjects treated: Intro to graphical models,
Inference, Belief Propagation, Boltzmann Machines.
Required Reading:
04/28: Learning, Sampling, and Energy-Based Models 
Subjects treated: Learning in Graphical Models;
Approximate Inference and Sampling, Markov-Chain Monte-Carlo,
Hybrid Monte-Carlo; Energy-Based Models, Contrastive Divergence.
Required Reading:

