Foundations of Machine Learning -- CSCI-GA.2566-001

Spring 2012
Foundations of Machine Learning
Course#: CSCI-GA.2566-001
Instructor: Mehryar Mohri
Graders/TAs: Umar Syed and Afshin Rostami
Mailing List

Course Description

This course introduces the fundamental concepts and methods of machine learning, including the description and analysis of several modern algorithms, their theoretical basis, and the illustration of their applications. Many of the algorithms described have been successfully used in text and speech processing, bioinformatics, and other areas in real-world products and services. The main topics covered are:

Probability and general bounds
PAC model
VC-dimension
Perceptron, Winnow
Support vector machines (SVMs)
Kernel methods
Decision trees
Boosting
Regression problems and algorithms
Ranking problems and algorithms
Halving algorithm, weighted majority algorithm, mistake bounds
Learning automata and transducers
Reinforcement learning, Markov decision processes (MDPs)

It is strongly recommended to those who can to also attend the Machine Learning Seminar.

Location and Time

Room 102 Warren Weaver Hall,
251 Mercer Street.
Mondays 5:00 PM - 6:50 PM.

Prerequisite

Familiarity with basics in linear algebra, probability, and analysis of algorithms.

Projects and Assignments

There will be 3 to 4 assignments and a project. The final grade is essentially the average of the assignment and project grades. The standard high level of integrity is expected from all students, as with all CS courses.

Lectures

Lecture 01: Introduction to machine learning, probability review.
Lecture 02: PAC model, sample complexity for finite hypothesis sets, concentration bounds.
Lecture 03: VC dimension, Rademacher complexity, learning bounds for infinite hypothesis sets.
Lecture 04: Support vector machines, margin bounds.
Lecture 05: Kernel methods.
Lecture 06: Boosting.
Lecture 07: On-line learning.
Lecture 08: Regression.
Lecture 09: Multi-class classification.
Lecture 10: Ranking.
Lecture 11: Reinforcement learning.
Lecture 12: Learning languages.

Textbooks

The following textbook covers all the material presented in this course (and a lot more):

Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning. MIT Press, 2012 (to appear).

Parts of that material might be made available to the students at the time of the lectures.

Here is also a list of other books recommended for further reading:

Martin Anthony and Peter Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 1999.
Nicolo Cesa-Bianchi and Gabor Lugosi. Prediction, learning, and games. Cambridge University Press, 2006.
Luc Devroye, Laszlo Gyorfi, Gabor Lugosi. A Probabilistic Theory of Pattern Recognition. Springer, 1996.
Michael J. Kearns and Umesh V. Vazirani. An Introduction to Computational Learning Theory. MIT Press, 1994.
Bernhard Schoelkopf and Alex Smola. Learning with Kernels. MIT Press, Cambridge, MA, 2002.
Vladimir N. Vapnik. Estimation of Dependences Based on Empirical Data. Springer NY (new edition), 2006.
Vladimir N. Vapnik. Statistical Learning Theory. Wiley, 1998.

Technical Papers

An extensive list of recommended papers for further reading is provided in the lecture slides.

Homework

Previous years