Machine Learning and Pattern Recognition: Assignments

Skeleton code for the assignments is provided in the Lush language.

You may use other languages to implement your assignments, but you will likely spend more time re-implementing infrastructure code than you would getting familiar with Lush (except perhaps for the first assignment).

For the adventurous among us, there is an open source C++ library called eblearn that implements all the functionality of the Lush numerical and machine learning libraries. It's still a bit rough, but usable.

Lush is available for Linux and Mac OS here: http://lush.sf.net.

If you only have a Windows machine, you have several solutions:

  • Use one of the CIMS server machines; Lush is already installed on them. You can access them remotely using ssh (e.g. "ssh -X access.cims.nyu.edu"), or you can go to one of the public computer labs at Warren Weaver Hall.
  • Install Ubuntu on your Windows machine without repartitioning your hard drive using Wubi. Wubi is a Windows app that installs Ubuntu in a file on your Windows partition (the file is a virtual hard drive from Linux's point of view). Wubi is not an emulator, so you have to boot your machine in either Linux or Windows. You can't run them at the same time.
  • Download and install Ubuntu. The Ubuntu installer offers you the option of shrinking your Windows partition non-destructively so as to make space for Linux. The installation is very simple and takes about 20 minutes.
  • Install Ubuntu through VMware so you can run Windows and Ubuntu at the same time.
In any case, as a graduate student in Computer Science, you have to be exposed to Linux/Unix sooner or later.

Once you have Linux or Mac OS, get Lush and install it. It is recommended to install the CVS version, which has the latest updates and bug fixes. You can also install the released version from the Lush web site. Do not install the Lush package that comes with Ubuntu; it is badly out of date.

Before you can compile Lush, you must install a number of other packages, namely: gcc, g++, libx11-dev, binutils-dev, indent, libreadline5, libreadline5-dev, libgsl0, libgsl0-dev.
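
On Ubuntu, these dependencies could be installed in one step with apt-get. The following is only a sketch under assumptions: the package names are those of the Ubuntu releases current when this page was written and may have been renamed or dropped since, and binutils-dev is assumed to be the package meant for the binutils development files. The command is printed rather than executed so that it can be checked first.

```shell
# Build dependencies for Lush as listed above.
# Assumption: package names match your Ubuntu release; adjust if apt-get
# reports that one of them cannot be found.
PACKAGES="gcc g++ libx11-dev binutils-dev indent libreadline5 libreadline5-dev libgsl0 libgsl0-dev"

# Print the install command for review. Remove the leading "echo" to run it.
# $PACKAGES is left unquoted on purpose so each name becomes its own argument.
echo sudo apt-get install $PACKAGES
```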

Linear Machines and Regularization

Implement the perceptron algorithm, an MSE classifier (linear regression), and logistic regression. Details and datasets below:

  • Download this .tgz archive. It contains the datasets for all the homeworks.
  • Download this .tgz archive. It contains the homework description.
  • "cd" to a directory and decompress the two files in this directory using "tar xvfz thefile.tgz" on Unix/Linux or Mac.
  • This will create two directories: datasets and hw-linear.
  • The file hw-linear/README.txt contains the questions and instructions.
  • Most of the necessary Lush code is provided.
  • Due Date is Tuesday October 13th.
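
The download and unpacking steps above can be sketched as follows. The archive names datasets.tgz and hw-linear.tgz and the working directory mlpr-homework are hypothetical; substitute the names of the files you actually downloaded.

```shell
# Hypothetical names: replace them with the archives you downloaded.
ARCHIVES="datasets.tgz hw-linear.tgz"
WORKDIR="mlpr-homework"

mkdir -p "$WORKDIR"
cd "$WORKDIR" || exit 1

for f in $ARCHIVES; do
    if [ -f "$f" ]; then
        # x: extract, v: list files as they are unpacked,
        # f: read from the named file, z: decompress with gzip
        tar xvfz "$f"
    else
        echo "missing $f: download it into $WORKDIR first" >&2
    fi
done

# Once both archives are unpacked, the datasets and hw-linear
# directories should appear in this listing.
ls
```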

Neural Networks and Backprop


Click on this link to get the homework: hw-backprop.tgz.

Due Date is Tuesday October 27th.

K-Means and Mixtures of Gaussians

Click on this link to get the homework: hw-unsup.tgz.

Due Date is Tuesday November 10th.

Final Projects

Click here for a list of final projects.

Projects can be done individually, in groups of 2, or (with permission) in groups of 3.

If you have an interest in a particular topic, you can propose your own project topic, subject to approval by the instructor. Send a description of your project proposal to the instructor and the TA before November 10th.

Otherwise, a list of possible projects will be proposed during the course of the semester (early November).

All projects are due December 18th.

There will be a project showcase and demo show on Thursday December 17th on the 13th floor of Warren Weaver Hall from 5:00 to 9:00 PM. Extra points will be given to those making a presentation (poster and/or demo) at the project showcase.

You must send a .tar or .tgz file to the TA with your code and a PDF file describing your project and the results you obtained.
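
One possible way to package a submission is sketched below. The function name make_submission and the directory and archive names in the usage note are hypothetical; the only requirements taken from this page are that the archive is a .tar or .tgz file containing your code and a PDF report.

```shell
# make_submission DIR ARCHIVE
# Hypothetical helper: bundles DIR into the gzip-compressed tar file ARCHIVE
# after checking that DIR contains at least one PDF (the project report).
make_submission() {
    project_dir="$1"
    archive="$2"

    if [ ! -d "$project_dir" ]; then
        echo "no such directory: $project_dir" >&2
        return 1
    fi
    if ! ls "$project_dir"/*.pdf >/dev/null 2>&1; then
        echo "no PDF report found in $project_dir" >&2
        return 1
    fi

    # c: create, z: compress with gzip, f: write to the named file
    tar czf "$archive" "$project_dir" || return 1

    # List the archive contents so they can be checked before sending.
    tar tzf "$archive"
}
```

For example, running "make_submission final-project final-project.tgz" from the parent directory of final-project would produce final-project.tgz and print the list of files it contains.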

It is expected that you will implement a learning algorithm yourself, or use an existing one in a new and interesting way.

In other words, merely downloading an SVM package off the web and running it on a standard dataset won't get you a good grade.

On rare occasions, some class projects have been known to turn into conference papers...

List of Proposed Projects

A more extensive list is coming soon.

Among other topics, we are looking for volunteers to implement various standard learning algorithms and demos for the C++ library eblearn.