**Meeting time:** Wednesdays, from 11:30 AM to 1:30 PM

**Venue:** Room 1221, 719 Broadway, New York, NY 10003

## January 23rd, 2008

Talk by Y-Lan Boureau

Learning Deep Architectures for AI, by Yoshua Bengio

## February 6th, 2008

Talk by Piotr Mirowski

Piotr will be talking about the following article, and about the project he did based on it for his Computational Neuroscience class last semester. The paper combines sigmoid transfer functions, weighted sums, an additional second-order ODE, and stochastic machine learning (genetic algorithms) to model EEG brain activity during epileptic seizures.

Interictal to Ictal Transition in Human Temporal Lobe Epilepsy: Insights From a Computational Model of Intracerebral EEG, by Fabrice Wendling et al.

Journal of Clinical Neurophysiology, 22(5), 2005

Background information about this topic can be found in the following references:

Some Insights into the Computational Models of (Patho)Physiological Brain Activity, by Piotr Suffczynski, Fabrice Wendling, et al.

Proceedings of the IEEE, 94(4), 2006

Sections I, II, IVb, IVc, V, VII, and VIII provide a good review of the problems of brain modeling.

Neural dynamics underlying brain thalamic oscillations investigated with computational models, by Piotr Suffczynski

Ph.D. thesis, University of Warsaw, 2000

The extremely short, almost one-paragraph-long sections 1.2, 1.3, 2.1, 2.2, and 4.1 provide a nice introduction to EEG recording.
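For intuition about this model class, a neural-mass unit of the kind used in such EEG models couples a sigmoid transfer function with a second-order synaptic ODE. The toy sketch below is our own simplification with illustrative parameters, not the paper's actual model:

```python
import numpy as np

def sigmoid_rate(v, e0=2.5, v0=6.0, r=0.56):
    # Sigmoid transfer function: membrane potential (mV) -> firing rate (Hz)
    return 2.0 * e0 / (1.0 + np.exp(r * (v0 - v)))

def simulate_unit(T=0.5, dt=1e-4, A=3.25, a=100.0, p=120.0):
    # Second-order synaptic ODE driven by a weighted sum of inputs:
    #   y'' = A*a*(p + S(y)) - 2*a*y' - a^2*y
    # integrated with forward Euler; p is a constant external input rate.
    n = int(T / dt)
    y, z = 0.0, 0.0                     # potential and its derivative
    trace = np.empty(n)
    for i in range(n):
        dz = A * a * (p + sigmoid_rate(y)) - 2.0 * a * z - a * a * y
        y, z = y + dt * z, z + dt * dz
        trace[i] = y
    return trace
```

The full model in the paper couples several such populations (pyramidal cells and interneurons); seizure-like dynamics arise from changing the coupling gains.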

## February 13th, 2008

TBA

## February 20th, 2008

Talk by Marc'Aurelio Ranzato

Marc'Aurelio will be talking about Compressed Sensing. The related paper is

From sparse solutions of systems of equations to sparse modeling of signals and images, by Bruckstein, Donoho

Talk by Koray Kavukcuoglu

Koray will be talking about K-SVD. The related papers are

The K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation, by M. Aharon, M. Elad, and A.M. Bruckstein

IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311-4322, November 2006.

Learning Multiscale Sparse Representations for Image and Video Restoration, by J. Mairal, G. Sapiro, and M. Elad

To appear in SIAM Multiscale Modeling and Simulation.
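For intuition, one K-SVD iteration alternates sparse coding against the current dictionary with an SVD-based update of each atom. The sketch below simplifies the coding stage to 1-sparse codes (the paper uses OMP with a sparsity target) and assumes unit-norm atoms:

```python
import numpy as np

def ksvd_step(Y, D):
    """One K-SVD iteration with 1-sparse codes, for unit-norm atoms.
    Y: (d, N) signals, D: (d, K) dictionary. A simplified sketch;
    the paper pairs the dictionary update with OMP-based coding."""
    # Sparse coding: assign each signal to its single best-matching atom.
    corr = D.T @ Y                         # (K, N) correlations
    idx = np.abs(corr).argmax(axis=0)      # chosen atom per signal
    cols = np.arange(Y.shape[1])
    X = np.zeros_like(corr)
    X[idx, cols] = corr[idx, cols]
    # Dictionary update: per atom, rank-1 SVD of the restricted residual.
    for k in range(D.shape[1]):
        users = np.where(idx == k)[0]
        if users.size == 0:
            continue
        E = Y[:, users] - D @ X[:, users] + np.outer(D[:, k], X[k, users])
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        D[:, k] = U[:, 0]                  # updated (unit-norm) atom
        X[k, users] = s[0] * Vt[0]         # matching coefficients
    return D, X
```

Each iteration is guaranteed not to increase the reconstruction error, since both the coding and the per-atom SVD update are optimal given the other quantities fixed.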

## February 27th, 2008

Talk by Mathew Koichi Grimes

Matt will talk about the following paper

Exponential Family Harmoniums with Applications to Information Retrieval, by Welling, Rozen-Zvi, and Hinton

## March 5th, 2008

Talk by John Langford

Yahoo Research

**Title:** Learning without the Loss

**Abstract:** In many natural situations, you can probe the loss (or reward) of one action, but you do not know the loss of other actions. This problem is simpler and more tractable than reinforcement learning, but still substantially harder than supervised learning because it has an inherent exploration component. I will discuss two algorithms for this setting.

(1) Epoch-greedy, which is a very simple method for trading off between exploration and exploitation.

(2) Offset Tree, which is a method for reducing this problem to binary classification.
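As a rough sketch of the first algorithm: epoch-greedy interleaves one uniform exploration step with a growing number of greedy exploitation steps per epoch, so the exploration fraction shrinks over time. The toy learner below (tabular reward estimates over small discrete contexts, our own simplification) illustrates only the schedule, not the paper's analysis:

```python
import random

def epoch_greedy(get_context, pull, actions, epochs):
    """Epoch-greedy sketch: in epoch l, take one uniform exploration
    step, then l greedy steps using estimates fit on exploration data."""
    sums, counts = {}, {}     # per-(context, action) reward statistics
    rewards = []
    for l in range(1, epochs + 1):
        # One exploration step: uniform random action, reward recorded.
        x = get_context()
        a = random.choice(actions)
        r = pull(x, a)
        sums[(x, a)] = sums.get((x, a), 0.0) + r
        counts[(x, a)] = counts.get((x, a), 0) + 1
        # l exploitation steps: act greedily w.r.t. the estimates.
        for _ in range(l):
            x = get_context()
            best = max(actions, key=lambda a: sums.get((x, a), 0.0)
                       / max(counts.get((x, a), 0), 1))
            rewards.append(pull(x, best))
    return rewards
```

The point of the schedule is that only O(T^(2/3))-ish of the first T steps are exploratory, while the policy used in exploitation keeps improving.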

## March 12th, 2008

Talk by Jihun Hamm

Ph.D. Candidate, GRASP Laboratory, University of Pennsylvania

**Title:** Generative and discriminant learning approaches toward invariant object recognition

**Abstract:** In this talk I will present generative and discriminative learning approaches to invariant face/object recognition problems. Images of objects and human faces show multiple sources of variation that make recognition challenging. Among these sources, pose and illumination changes are of particular interest, since how these factors affect the appearance is relatively well understood.

In the first part of the talk, we discuss a generative model of face images. In this model, image variations due to illumination change are accounted for by a low-dimensional linear subspace, whereas variations due to pose change are approximated by a geometric transformation of images in the subspace. Priors for the transformation can be derived without knowledge of 3D models of the faces. This model can be efficiently learned via MAP estimation and multiscale registration techniques. Furthermore, we show that the priors can also be used in a discriminant setting in the form of a regularizer. We demonstrate how to combine multiple invariances into Linear Discriminant Analysis and nonparametric Discriminant Analysis, as well as the kernelized versions of those.

In the second part of the talk, we take a novel view of problems involving linear subspaces. By treating subspaces as basic elements of data, we can make learning algorithms adapt naturally to problems with linearly invariant structures. We propose a unifying view of subspace-based learning methods by formulating the problems on the Grassmann manifold, the set of fixed-dimensional subspaces of a Euclidean space. We show the feasibility of the approach by using Grassmann kernel functions such as the Projection kernel and the Binet-Cauchy kernel. Experiments with real image databases show that the proposed method performs well compared with state-of-the-art algorithms.

If time allows, I will additionally address a dimensionality reduction method with invariance knowledge.
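As an illustration of the Grassmann-kernel idea: the Projection kernel has a very compact form. For orthonormal bases Y1, Y2 of two subspaces, k(Y1, Y2) = ||Y1ᵀ Y2||²_F, which is invariant to the choice of basis within each subspace. A minimal numpy sketch (function names are ours):

```python
import numpy as np

def orthonormal_basis(A):
    # Orthonormal basis of the column span of A, via thin QR.
    Q, _ = np.linalg.qr(A)
    return Q

def projection_kernel(Y1, Y2):
    """Projection kernel on the Grassmann manifold:
    k(span Y1, span Y2) = ||Y1^T Y2||_F^2, for orthonormal bases."""
    return float(np.sum((Y1.T @ Y2) ** 2))
```

The kernel equals the subspace dimension when the two subspaces coincide, and rotating either basis by an orthogonal matrix leaves it unchanged, which is what makes it well defined on subspaces rather than on matrices.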

## April 16th, 2008

Talk by Michael Littman

Department of Computer Science, Rutgers University

**Title**: Efficient Model Learning for Reinforcement Learning

**Abstract**:

This talk addresses the problem of learning efficiently to make sequential decisions with a particular focus on generalizing experience without forfeiting formal learning-time guarantees. I'll summarize the theoretically motivated algorithms my group has been developing that exhibit practical advantages over existing learning algorithms. I'll also toss in some video footage of robots learning to move around by exploring efficiently.

## April 23rd, 2008

Talk by David Blei

Department of Computer Science, Princeton University

**Title**: Supervised Topic Models (joint work with Jon McAuliffe)

**Abstract**:

A surge of recent research in machine learning and statistics has developed new techniques for finding patterns of words in document collections using hierarchical probabilistic models. These models are called "topic models" because the discovered word patterns often reflect the underlying topics that permeate the documents; however, topic models also naturally apply to data such as images and biological sequences. The first part of this talk will describe the basic algorithmic and modeling issues in topic modeling.

In the second part of the talk, I will introduce supervised latent Dirichlet allocation (sLDA), a topic model of labelled documents that accommodates a variety of response types. I will derive a maximum-likelihood procedure for parameter estimation, which relies on variational approximations to handle intractable posterior expectations. Prediction problems motivate this research: I will present results on predicting movie ratings from the text of reviews, and predicting web-page popularity from summaries of their contents. I will report comparisons of sLDA to modern regularized regression, as well as to an unsupervised LDA analysis followed by a separate regression.
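As a rough illustration of the generative side of sLDA with a Gaussian response: each word's topic is drawn from per-document Dirichlet proportions, words come from per-topic multinomials, and the response depends linearly on the document's empirical topic frequencies. The sketch below is our own simplification with made-up parameter names:

```python
import numpy as np

def generate_slda_doc(alpha, beta, eta, sigma, n_words, rng):
    """Generative process of sLDA (Gaussian response), sketched.
    alpha: (K,) Dirichlet prior; beta: (K, V) topic-word distributions;
    eta: (K,) regression weights; sigma: response noise std."""
    K, V = beta.shape
    theta = rng.dirichlet(alpha)                  # topic proportions
    z = rng.choice(K, size=n_words, p=theta)      # topic per word
    words = np.array([rng.choice(V, p=beta[k]) for k in z])
    zbar = np.bincount(z, minlength=K) / n_words  # empirical topic freqs
    y = rng.normal(eta @ zbar, sigma)             # e.g. a movie rating
    return words, y
```

The key modeling choice is that the response depends on the realized topic assignments (zbar), not on theta, which is what makes the variational inference in the paper tractable.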

## May 1st, 2008

Talk by Tony Jebara

Department of Computer Science, Columbia University

**Title**: Visualization & Matching for Graphs and Data

**Abstract**:

Given a graph between N high-dimensional nodes, can we faithfully visualize it in just a few dimensions? We present an algorithm that improves the state of the art in dimensionality reduction by extending the Maximum Variance Unfolding method. Visualizations are shown for social networks, species trees, image datasets and human activity.

If the connectivity between N nodes is unknown, can we link them to build a graph? The space to explore is daunting with 2^(N^2) choices but two interesting subfamilies are tractable: matchings and b-matchings. We place distributions over these families and recover the optimal graph or perform Bayesian inference over graphs efficiently using belief propagation algorithms. Higher order distributions over matchings can also be handled efficiently via fast Fourier algorithms. Applications are shown in tracking, network reconstruction, classification, and clustering.
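For intuition about the matching family: a (perfect) matching pairs each node on one side with exactly one node on the other, and the maximum-weight matching can be written as a search over permutations. The toy brute-force maximizer below (ours, not the talk's algorithm) makes the objective concrete; the talk's point is that belief propagation recovers the same optimum efficiently at scale:

```python
import itertools

def max_weight_matching(W):
    """Brute-force maximum-weight perfect matching on an n x n weight
    matrix W -- exponential in n, usable only for tiny examples."""
    n = len(W)
    best_perm, best_w = None, float("-inf")
    for perm in itertools.permutations(range(n)):
        w = sum(W[i][j] for i, j in enumerate(perm))
        if w > best_w:
            best_perm, best_w = perm, w
    return best_perm, best_w
```

b-matchings generalize this by allowing each node up to b partners, which is the richer subfamily the abstract refers to.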

**Bio**:

Tony Jebara is Associate Professor of Computer Science at Columbia University and director of the Columbia Machine Learning Laboratory. His research intersects computer science and statistics to develop new frameworks for learning from data with applications in vision, networks, spatio-temporal data, and text. Jebara is also co-founder and head of the advisory board at Mao Networks. He has published over 50 peer-reviewed papers in conferences and journals including NIPS, ICML, UAI, COLT, JMLR, CVPR, ICCV, and AISTATS. He is the author of the book Machine Learning: Discriminative and Generative (Kluwer). Jebara is the recipient of the CAREER Award from the National Science Foundation and has also received honors for his papers from the International Conference on Machine Learning and from the Pattern Recognition Society. He has served as chair and program committee member for many learning conferences. Jebara's research has been featured on television (ABC, BBC, New York One, TechTV, etc.) as well as in the popular press (Wired Online, Scientific American, Newsweek, Science Photo Library, etc.). He obtained his PhD in 2002 from MIT. Jebara's lab is supported in part by the Central Intelligence Agency, the National Science Foundation, the Office of Naval Research, the National Security Agency, and Microsoft.