Learning Sparse Representations for Vision

Tomaso Poggio
MIT

3:30 p.m., Mon. May 11 in 1221 CS
Reception: 4:30 p.m. in 2310 CS

Learning is becoming the central problem in understanding intelligence and in developing intelligent machines. I will outline some of our recent efforts in the domain of vision to develop machines that learn and to understand brain mechanisms of learning. I will begin with some recent theoretical results on the problem of function approximation and sparse representations that connect regularization theory, Support Vector Machine Regression, Basis Pursuit Denoising, and PCA techniques. I will then motivate the appeal of learning sparse representations from an overcomplete dictionary of basis functions in terms of recent results in two different fields: computer vision and neuroscience. In particular, we have developed a trainable object detection architecture that learns a sparse representation from an overcomplete set of Haar wavelets and uses it to perform difficult detection tasks. In neuroscience, physiological data from IT cortex suggest that individual neurons encode a large vocabulary of elementary shapes before converging on cells tuned to specific views of specific 3D objects.
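
For readers unfamiliar with the sparse-coding idea the abstract alludes to, the sketch below is a minimal illustration (not the architecture described in the talk) of Basis Pursuit Denoising: finding a sparse coefficient vector for a signal over an overcomplete dictionary by minimizing a least-squares term plus an L1 penalty. The random dictionary, toy signal, and the simple iterative soft-thresholding solver are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def basis_pursuit_denoising(D, y, lam=0.1, n_iter=500):
    """Solve min_x 0.5*||D x - y||^2 + lam*||x||_1 by iterative soft-thresholding.

    D : (m, n) overcomplete dictionary (n > m); columns are basis functions.
    y : (m,) signal to encode.
    Returns a sparse coefficient vector x of length n.
    """
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)              # gradient of the quadratic data term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Toy example: encode a signal as a sparse combination of random atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((32, 128))            # overcomplete: 128 atoms in 32 dims
D /= np.linalg.norm(D, axis=0)                # unit-norm columns
x_true = np.zeros(128)
x_true[[5, 40, 99]] = [1.0, -0.7, 0.5]        # the true code uses only 3 atoms
y = D @ x_true + 0.01 * rng.standard_normal(32)
x_hat = basis_pursuit_denoising(D, y, lam=0.05)
print("nonzero coefficients:", np.flatnonzero(np.abs(x_hat) > 1e-3))
```

The L1 penalty drives most coefficients to exactly zero, so the recovered code uses only a few atoms from the overcomplete dictionary, which is the sense in which the representation is sparse.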