Code/Datasets – Poggio Lab

Models of Object Recognition:
Over the years, CBCL has written a number of different software packages implementing hierarchical models of object recognition (dubbed “HMAX” models) which are inspired by the ventral visual pathway. The most comprehensive — but now quite dated — description of this class of models is “A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex” [Serre, Kouh, Cadieu, Knoblich, Kreiman, and Poggio (2005)]. The table below has links to downloadable software.

CBCL SOFTWARE:

Description	Circa	Paper
GURLS – Grand Unified Regularized Least Squares We present GURLS, a modular, easy-to-extend least squares software library for efficient supervised learning. GURLS is targeted at machine learning practitioners as well as non-specialists. It offers a number of state-of-the-art training strategies for medium- and large-scale learning, and routines for efficient model selection. The library is particularly well suited to multi-output problems (multi-category/multi-label). GURLS is currently available in two independent implementations: Matlab and C++. It takes advantage of the favorable properties of the regularized least squares algorithm to exploit advanced tools in linear algebra. Routines to handle computations with very large matrices by means of memory-mapped storage and distributed task execution are available. The package is distributed under the BSD license and is available for download at https://github.com/CBCL/GURLS.	2012	Tacchetti, Mallapragada, Santoro, and Rosasco
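To make the multi-output point concrete, here is a minimal sketch of linear regularized least squares in plain Python. It is not the GURLS API: the function names `rls_fit` and `rls_predict`, the one-hot label encoding, and the choice of adding `lam` directly to the diagonal are all assumptions made for illustration. The sketch solves (X^T X + lam I) W = X^T Y once, so every output column of Y shares the same elimination, which is one reason least squares handles multi-category problems cheaply.

```python
def rls_fit(X, Y, lam):
    """Solve (X^T X + lam*I) W = X^T Y by Gauss-Jordan elimination.

    X: n x d list of feature rows, Y: n x t list of target rows
    (e.g. one-hot class labels), lam: regularization strength > 0.
    Returns W as a d x t list of lists. Hypothetical helper, not GURLS code.
    """
    n, d, t = len(X), len(X[0]), len(Y[0])
    # Augmented system [X^T X + lam*I | X^T Y]; all t outputs share the left side.
    aug = [
        [sum(X[k][i] * X[k][j] for k in range(n)) + (lam if i == j else 0.0)
         for j in range(d)]
        + [sum(X[k][i] * Y[k][c] for k in range(n)) for c in range(t)]
        for i in range(d)
    ]
    for col in range(d):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, d), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for row in range(d):
            if row != col:
                factor = aug[row][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [row[d:] for row in aug]


def rls_predict(W, x):
    """Return the index of the highest-scoring output for one feature row x."""
    scores = [sum(x[i] * W[i][c] for i in range(len(x))) for c in range(len(W[0]))]
    return max(range(len(scores)), key=scores.__getitem__)
```

For a two-class toy problem, `rls_fit` would be called with one-hot rows such as `[1.0, 0.0]` and `[0.0, 1.0]` in `Y`, and `rls_predict` then returns the class index. A production library would use a Cholesky or eigendecomposition rather than this hand-rolled elimination.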

Description	Circa	Paper
Subtasks of Unconstrained Face Recognition (SUFR and SUFR-W) This collection consists of: 1. Subtasks of Unconstrained Face Recognition synthetic datasets (SUFR). Example images of the dataset can be viewed in this presentation: VISAPP. 2. SUFR-in the Wild (SUFR-W). A dataset similar to Labeled Faces in the Wild (LFW), but more difficult. It consists of ~13,000 natural images of 400 individuals. It was collected using a protocol similar to that of LFW, but the Zhu and Ramanan (2013) face detector (from this paper) was substituted for Viola-Jones, so the faces appear with considerably more variability in 3D orientation than in LFW.	2014	Leibo JZ, Liao Q, Poggio T
A Large Video Database for Human Motion Recognition With nearly one billion online videos viewed every day, an emerging frontier in computer vision research is recognition and search in video. While much effort has been devoted to the collection and annotation of large-scale static image datasets containing thousands of image categories, human action datasets lag far behind. Here we introduce HMDB, collected from various sources, mostly from movies, and a small proportion from public databases such as the Prelinger archive, YouTube and Google videos. The dataset contains 6849 clips divided into 51 action categories, each containing a minimum of 101 clips. The action categories can be grouped into five types: (1) General facial actions: smile, laugh, chew, talk. (2) Facial actions with object manipulation: smoke, eat, drink. (3) General body movements: cartwheel, clap hands, climb, climb stairs, dive, fall on the floor, backhand flip, handstand, jump, pull up, push up, run, sit down, sit up, somersault, stand up, turn, walk, wave. (4) Body movements with object interaction: brush hair, catch, draw sword, dribble, golf, hit something, kick ball, pick, pour, push something, ride bike, ride horse, shoot ball, shoot bow, shoot gun, swing baseball bat, sword exercise, throw. (5) Body movements for human interaction: fencing, hug, kick someone, kiss, punch, shake hands, sword fight.	2011	H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre (ICCV, 2011)
CNS (“Cortical Network Simulator”) A general GPU-based framework for the fast simulation of “cortically-organized” networks, defined as networks consisting of n-dimensional layers of similar cells. This is a fairly broad class, including more than just “HMAX” models. We have developed specialized CNS packages for HMAX feature hierarchy models (hmax), convolutional networks (cnpkg), and networks of Hodgkin-Huxley spiking cells (hhpkg). While CNS is designed for use with a GPU, it can run (much more slowly) without one. It does, however, require MATLAB.	2010	Mutch, Knoblich, & Poggio (2010)
hmin: A Minimal HMAX Implementation This is a simple reference implementation of HMAX, meant for illustration. It is a single-threaded, CPU-based, pure C++ implementation (but still called via MATLAB’s “mex” interface). The package contains C++ classes for layers and filters, and a main program that assembles them to implement one specific model.	2010
System for Mouse Behavior Recognition	2010	H. Jhuang, E. Garrote, X. Yu, V. Khilnani, T. Poggio, A. Steele and T. Serre. (Nat. Comms, 2010)

Description	Circa	Paper
The system consists of two modules: a feature computation module, and a classification module. The feature computation module is based on a biologically-inspired dorsal stream model, which is, in turn, inspired by the ventral stream model (Models of Object Recognition). The classifier in the classification module is SVM^hmm	2010	Jhuang, Garrote, Yu, Khilnani, Poggio, Steele, Serre (Nat. Comms. 2010)

Regularized Least Squares Classification
LEGACY SOFTWARE:
(these packages are no longer actively used within CBCL)

Description	Circa	Paper
“Bypass route only” code A pure MATLAB implementation of the four-layer (S1->C2b) “bypass route” model.	2005	Serre, Wolf & Poggio (CVPR 2005); Serre, Wolf, Bileschi, Riesenhuber & Poggio (PAMI 2007)
Code for experiments replicating human performance in the animal/no-animal rapid scene categorization task. Written using FHLib (below).	2007	Serre, Oliva & Poggio (PNAS 2007); Serre, Kouh, Cadieu, Knoblich, Kreiman & Poggio (AI memo 2005)
FHLib – Multiscale Feature Hierarchy Library A rather general framework that allows the definition of feedforward hierarchies having an arbitrary number of levels. MATLAB interface with a C++ back-end (CPU-based) for efficiency. Arbitrary filter kernels can be defined, but this requires creating new C++ classes and recompiling. This code is fairly well documented.	2007	Mutch & Lowe (IJCV 2008)
Model of object recognition with canonical normalization operations Like FHLib, allows feedforward hierarchies of arbitrary depth, and is also MATLAB with a C++ (CPU-based) back-end. Defining new filters is easier than under FHLib, but all filters must be of the form y = sum(f .* (x .^ p)) / (c + sum(f .* (x .^ q)) .^ r). Different settings of c, p, q, and r correspond to tuning, softmax, etc. Models run more slowly than under FHLib.	2006	Kouh & Poggio (Neural Comp. 2008); Zoccolan et al. (J. Neurosci. 2007); Cadieu et al. (J. Neurophys. 2007)
Regularized Least-Squares MATLAB Toolkit	2002	Rifkin
Original “HMAX” code This model implementation is now obsolete and is no longer distributed.	1999	Riesenhuber and Poggio (Nat. Neurosci. 1999)
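The canonical filter form quoted in the table above, y = sum(f .* (x .^ p)) / (c + sum(f .* (x .^ q)) .^ r), can be sketched in a few lines of plain Python. This is an illustration only, not code from the package: the function name `canonical_response` is hypothetical, inputs are assumed to be non-negative so that fractional exponents stay real, and the parameter settings mentioned afterwards are illustrative choices rather than values taken from this page.

```python
def canonical_response(x, f, c, p, q, r):
    """Evaluate y = sum(f .* x.^p) / (c + sum(f .* x.^q) .^ r).

    x: input activations (assumed non-negative), f: filter weights,
    c, p, q, r: scalar parameters of the canonical form.
    Hypothetical helper written for illustration.
    """
    if len(x) != len(f):
        raise ValueError("x and f must have the same length")
    numerator = sum(fi * xi ** p for fi, xi in zip(f, x))
    # MATLAB precedence: .^ binds tighter than +, so only the sum is raised to r.
    denominator = c + sum(fi * xi ** q for fi, xi in zip(f, x)) ** r
    return numerator / denominator
```

With uniform weights, c = 0, r = 1 and p = q + 1, the ratio behaves like a soft maximum that approaches max(x) as q grows; for example, `canonical_response([0.2, 0.9, 0.5], [1, 1, 1], 0.0, 9, 8, 1)` lies close to 0.9. With p = 1, q = 2 and r = 0.5, the numerator is a weighted sum divided by a norm-like term, which gives a normalized, tuning-like response.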

CBCL DATASETS:

CBCL SUPPLEMENTARY MATERIAL:

Serre et al. A Feedforward Architecture Accounts for Rapid Categorization,
Proceedings of the National Academy of Sciences, 2007 (paper available here)
Fast Read-out of Object Identity from Macaque Inferior Temporal Cortex

The CBCL data sets are covered by the following copyright.

Copyright 2000
Center for Biological & Computational Learning at MIT and MIT
All rights reserved.
Permission to copy and modify this data, software, and its documentation only for internal research use in your organization is hereby granted, provided that this notice is retained thereon and on all copies. This data and software should not be distributed to anyone outside of your organization without explicit written authorization by the author(s) and MIT. It should not be used for commercial purposes without specific permission from the authors and MIT. MIT also requires written authorization by the author(s) to publish results obtained with the data or software and possibly citation of relevant CBCL reference papers.
We make no representation as to the suitability and operability of this data or software for any purpose. It is provided “as is” without express or implied warranty.