Code/Datasets

Models of Object Recognition:
Over the years, CBCL has written a number of different software packages implementing hierarchical models of object recognition (dubbed "HMAX" models) which are inspired by the ventral visual pathway. The most comprehensive -- but now quite dated -- description of this class of models is "A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex" [Serre, Kouh, Cadieu, Knoblich, Kreiman, and Poggio (2005)]. The table below has links to downloadable software.

CBCL SOFTWARE:
Description Circa Paper

GURLS - Grand Unified Regularized Least Squares

We present GURLS, a least squares, modular, easy-to-extend software library for efficient supervised learning. GURLS is targeted to machine learning practitioners, as well as non-specialists. It offers a number of state-of-the-art training strategies for medium and large-scale learning, and routines for efficient model selection. The library is particularly well suited for multi-output problems (multi-category/multi-label). GURLS is currently available in two independent implementations: Matlab and C++. It takes advantage of the favorable properties of the regularized least squares algorithm to exploit advanced tools in linear algebra. Routines to handle computations with very large matrices by means of memory-mapped storage and distributed task execution are available. The package is distributed under the BSD license and is available for download at https://github.com/CBCL/GURLS.

2012 Tacchetti, Mallapragada, Santoro, and Rosasco
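
As a rough illustration of the regularized least squares idea behind GURLS (this is not GURLS code and does not use its API), the sketch below fits a one-vs-all multi-category classifier by solving (X^T X + n*lam*I) W = X^T Y in pure Python. The function names, the toy data, and the value of lam are assumptions made for this example only.

```python
def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Augment A with all right-hand-side columns so they are solved together.
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def rls_train(X, Y, lam):
    """Return W minimizing (1/n) * ||X W - Y||^2 + lam * ||W||^2."""
    n, d, t = len(X), len(X[0]), len(Y[0])
    # Left-hand side: X^T X with n * lam added on the diagonal.
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) + (n * lam if i == j else 0.0)
          for j in range(d)]
         for i in range(d)]
    # Right-hand side: X^T Y, one column per output (class).
    B = [[sum(X[k][i] * Y[k][c] for k in range(n)) for c in range(t)]
         for i in range(d)]
    return solve(A, B)


def rls_predict(W, x):
    """Return the index of the class with the highest linear score."""
    scores = [sum(x[i] * W[i][c] for i in range(len(x))) for c in range(len(W[0]))]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy data (hypothetical): three 2D clusters, with a constant 1.0 as bias feature.
X = [[1.0, 0.1, 1.0], [0.9, -0.1, 1.0],
     [0.1, 1.0, 1.0], [-0.1, 0.9, 1.0],
     [-1.0, -0.9, 1.0], [-0.9, -1.0, 1.0]]
labels = [0, 0, 1, 1, 2, 2]
# One-vs-all coding: +1 for the true class, -1 for every other class.
Y = [[1.0 if c == y else -1.0 for c in range(3)] for y in labels]

W = rls_train(X, Y, lam=1e-3)
predictions = [rls_predict(W, x) for x in X]
print(predictions)
```

All output columns share the same linear system and differ only in the right-hand side, which is one reason the approach is convenient for multi-output problems.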

Subtasks of Unconstrained Face Recognition (SUFR and SUFR-W)

This collection consists of:

1. Subtasks of Unconstrained Face Recognition synthetic datasets (SUFR). Example images of the dataset can be viewed in this presentation: VISAPP.

2. SUFR in the Wild (SUFR-W). A dataset similar to Labeled Faces in the Wild (LFW), but more difficult. It consists of ~13,000 natural images of 400 individuals. It was collected using a protocol similar to that of LFW, but the Zhu and Ramanan (2013) face detector (from this paper) was substituted for Viola-Jones, so the faces appear with considerably more variability in 3D orientation than in LFW.

2014 Leibo JZ, Liao Q, Poggio T

A Large Video Database for Human Motion Recognition

With nearly one billion online videos viewed every day, an emerging new frontier in computer vision research is recognition and search in video. While much effort has been devoted to the collection and annotation of large scalable static image datasets containing thousands of image categories, human action datasets lag far behind.

Here we introduce HMDB, collected from various sources, mostly from movies, with a small proportion from public databases such as the Prelinger archive, YouTube and Google videos. The dataset contains 6849 clips divided into 51 action categories, each containing a minimum of 101 clips.

The action categories can be grouped into five types:
(1) General facial actions: smile, laugh, chew, talk.
(2) Facial actions with object manipulation: smoke, eat, drink.
(3) General body movements: cartwheel, clap hands, climb, climb stairs, dive, fall on the floor, backhand flip, handstand, jump, pull up, push up, run, sit down, sit up, somersault, stand up, turn, walk, wave.
(4) Body movements with object interaction: brush hair, catch, draw sword, dribble, golf, hit something, kick ball, pick, pour, push something, ride bike, ride horse, shoot ball, shoot bow, shoot gun, swing baseball bat, sword exercise, throw.
(5) Body movements for human interaction: fencing, hug, kick someone, kiss, punch, shake hands, sword fight.
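
For readers who want the grouping in machine-readable form, the five types above can be written as a plain dictionary. This is only a sketch: the variable names and the alphabetical label numbering are assumptions, not an official HMDB index. It also checks that the list adds up to the 51 categories stated above.

```python
# The five groups listed above, keyed by a short description of each type.
HMDB_GROUPS = {
    "general facial actions": ["smile", "laugh", "chew", "talk"],
    "facial actions with object manipulation": ["smoke", "eat", "drink"],
    "general body movements": [
        "cartwheel", "clap hands", "climb", "climb stairs", "dive",
        "fall on the floor", "backhand flip", "handstand", "jump", "pull up",
        "push up", "run", "sit down", "sit up", "somersault", "stand up",
        "turn", "walk", "wave",
    ],
    "body movements with object interaction": [
        "brush hair", "catch", "draw sword", "dribble", "golf",
        "hit something", "kick ball", "pick", "pour", "push something",
        "ride bike", "ride horse", "shoot ball", "shoot bow", "shoot gun",
        "swing baseball bat", "sword exercise", "throw",
    ],
    "body movements for human interaction": [
        "fencing", "hug", "kick someone", "kiss", "punch", "shake hands",
        "sword fight",
    ],
}

# Map each action to its group, then number the actions alphabetically.
# The alphabetical numbering is an assumption made for this example.
GROUP_OF = {action: group
            for group, actions in HMDB_GROUPS.items()
            for action in actions}
LABEL_OF = {action: i for i, action in enumerate(sorted(GROUP_OF))}

NUM_CATEGORIES = len(LABEL_OF)
MIN_TOTAL_CLIPS = NUM_CATEGORIES * 101  # every category has at least 101 clips

print(NUM_CATEGORIES)   # 51
print(MIN_TOTAL_CLIPS)  # 5151, consistent with the 6849 clips reported
```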

2011 H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre (ICCV, 2011)

CNS ("Cortical Network Simulator")

A general GPU-based framework for the fast simulation of "cortically-organized" networks, defined as networks consisting of n-dimensional layers of similar cells.

This is a fairly broad class, including more than just "HMAX" models. We have developed specialized CNS packages for HMAX feature hierarchy models (hmax), convolutional networks (cnpkg), and networks of Hodgkin-Huxley spiking cells (hhpkg).

While CNS is designed for use with a GPU, it can run (much more slowly) without one. It does, however, require MATLAB.

2010 Mutch, Knoblich, & Poggio (2010)

hmin: A Minimal HMAX Implementation

This is a simple reference implementation of HMAX, meant for illustration. It is a single-threaded, CPU-based, pure C++ implementation (but still called via MATLAB's "mex" interface).

The package contains C++ classes for layers and filters, and a main program that assembles them to implement one specific model.

2010
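
To make the layer-and-filter structure more concrete, here is a minimal sketch of the two operations that HMAX-style models alternate between: template matching ("S" layers) and local max pooling ("C" layers). It is written in Python for brevity and does not reproduce hmin's C++ classes; the function names, the 2x2 edge filter, and the toy images are assumptions made for illustration.

```python
def s_layer(image, kernel):
    """Template matching: absolute dot product of the kernel with every
    fully overlapping image patch ('valid' positions only)."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [[abs(sum(image[y + i][x + j] * kernel[i][j]
                     for i in range(kh) for j in range(kw)))
             for x in range(w - kw + 1)]
            for y in range(h - kh + 1)]


def c_layer(resp, size, step):
    """Local max pooling over size x size windows, moved in strides of step."""
    h, w = len(resp), len(resp[0])
    return [[max(resp[y + i][x + j] for i in range(size) for j in range(size))
             for x in range(0, w - size + 1, step)]
            for y in range(0, h - size + 1, step)]


def edge_image(edge_col, n=6):
    """n x n image that is 0 left of edge_col and 1 from edge_col onward."""
    return [[0 if x < edge_col else 1 for x in range(n)] for _ in range(n)]


# Hypothetical 2x2 filter that responds to vertical edges.
vertical_edge = [[-1, 1],
                 [-1, 1]]

# The same edge at two positions, one pixel apart.
c1_a = c_layer(s_layer(edge_image(3), vertical_edge), size=2, step=2)
c1_b = c_layer(s_layer(edge_image(4), vertical_edge), size=2, step=2)

print(c1_a)  # [[0, 2], [0, 2]]
print(c1_b)  # [[0, 2], [0, 2]]
```

The S responses of the two images differ, but both edge positions fall inside the same pooling window, so the C outputs are identical. This small tolerance to position is what the max pooling stage contributes.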

2010

H. Jhuang, E. Garrote, X. Yu, V. Khilnani, T. Poggio, A. Steele and T. Serre. (Nat. Comms, 2010)

The system consists of two modules: a feature computation module and a classification module. The feature computation module is based on a biologically inspired dorsal stream model, which is in turn inspired by the ventral stream model (see Models of Object Recognition above). The classifier in the classification module is SVMhmm.

2010

Jhuang, Garrote, Yu, Khilnani, Poggio, Steele, Serre (Nat. Comms. 2010)

CBCL DATASETS:

CBCL SUPPLEMENTARY MATERIAL:

The CBCL data sets are covered by the following copyright.

Copyright 2000
Center for Biological & Computational Learning at MIT and MIT
All rights reserved.
Permission to copy and modify this data, software, and its documentation only for internal research use in your organization is hereby granted, provided that this notice is retained thereon and on all copies. This data and software should not be distributed to anyone outside of your organization without explicit written authorization by the author(s) and MIT. It should not be used for commercial purposes without specific permission from the authors and MIT. MIT also requires written authorization by the author(s) to publish results obtained with the data or software and possibly citation of relevant CBCL reference papers.
We make no representation as to the suitability and operability of this data or software for any purpose. It is provided "as is" without express or implied warranty.