Computer Science Colloquium

Object and Scene Recognition with Bags of Features and Spatial Pyramids

Svetlana Lazebnik
Beckman Institute, University of Illinois at Urbana-Champaign

Wednesday, April 4th, 11:15 a.m.
Room 1302 Warren Weaver Hall
251 Mercer Street
New York, NY 10012-1185

Colloquium Information:
Richard Cole, (212) 998-3119


Bag-of-features models, which represent images by distributions of salient local features contained in them, are among the most robust and powerful image descriptions currently used for object and scene recognition. In this talk, I will present fundamental techniques for designing effective bag-of-features models and their extensions by constructing discriminative visual codebooks and incorporating spatial relationships between local features.

The most basic operation in building a bag-of-features model is quantizing the local features, so that their distribution can be represented as a histogram of discrete "visual codewords." I will introduce an information-theoretic approach to designing visual codebooks by minimizing the loss of discriminative information incurred when a continuous high-dimensional feature vector is mapped to a discrete codeword index. I will present experiments demonstrating the advantage of these codebooks for image classification, as well as an application of the same information-theoretic framework to image segmentation.
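As a rough illustration of the quantization step only, the sketch below assigns each local feature to its nearest codeword and builds a normalized histogram. It assumes a codebook is already available (for example from k-means) and uses plain nearest-neighbor assignment; it does not implement the information-theoretic codebook design described in the talk. The function names and the toy data are hypothetical.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector (row) to the index of its nearest codeword."""
    # Squared Euclidean distances, shape (num_features, num_codewords).
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def bag_of_features(features, codebook):
    """Normalized histogram of codeword indices for one image."""
    idx = quantize(features, codebook)
    hist = np.bincount(idx, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)

# Toy data (assumed): three 2-D codewords and four 2-D local features.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
features = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.8, 5.1]])
hist = bag_of_features(features, codebook)
```

Real local features would be high-dimensional descriptors rather than 2-D points, and the codebook would typically contain hundreds of codewords.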

In the second part of the talk, I will describe an extension of a bag of features into a spatial pyramid, or a collection of feature histograms computed at different levels of a hierarchical spatial decomposition of an image. The resulting method is simple and efficient, and it achieves state-of-the-art performance on difficult object and scene recognition tasks. It has already been adopted as a baseline for datasets containing hundreds of object categories, and has given rise to a winning recognition system in the international PASCAL Visual Object Classes Challenge.
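The hierarchical decomposition can be sketched as follows, under stated assumptions: feature positions are scaled to the unit square, level l divides the image into a 2^l x 2^l grid, and the per-cell codeword histograms are simply concatenated without any per-level weighting. The function name, parameters, and toy data are hypothetical, and the method presented in the talk may differ in these details.

```python
import numpy as np

def spatial_pyramid(xy, codes, num_codewords, num_levels):
    """Concatenate per-cell codeword histograms over grids of 1x1, 2x2, 4x4, ...

    xy: (n, 2) array of feature positions scaled to [0, 1).
    codes: (n,) integer array of codeword indices.
    """
    parts = []
    for level in range(num_levels):
        cells = 2 ** level
        # Grid cell of each feature at this level (clamped to the last cell).
        cx = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        cy = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        # One flat bin per (cell, codeword) pair.
        flat = (cy * cells + cx) * num_codewords + codes
        parts.append(np.bincount(flat, minlength=cells * cells * num_codewords))
    return np.concatenate(parts)

# Toy data (assumed): four features, two codewords, two pyramid levels.
xy = np.array([[0.1, 0.1], [0.9, 0.1], [0.2, 0.8], [0.7, 0.6]])
codes = np.array([0, 1, 1, 0])
pyramid = spatial_pyramid(xy, codes, num_codewords=2, num_levels=2)
```

With two codewords and two levels, the resulting vector has 2 + 4 * 2 = 10 entries: the first two are the ordinary bag-of-features counts for the whole image, and the rest are counts within each quadrant.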

Refreshments will be served
