Problem Solving by Teams of Heterogeneous Agents

Gregory D. Hager
Dept. of Computer Science
Yale University

2:30 pm Mon. Nov. 13 in 2310 CS&S

One of the common motivations for studying vision is to provide sensory feedback to systems that move about in and manipulate their world. To date, however, most "successful" systems have been limited in scope, brittle to the point of being contrived, and daunting in their hardware complexity. These limitations can usually be traced to two implicit design principles: first, that vision is a means for measuring geometry; and second, that general-purpose, real-time vision computation must be performed on specialized hardware.

In this talk, I will argue that by avoiding both of these assumptions, it is possible to build software systems that provide real-time vision processing and high positioning accuracy using common desktop hardware and inexpensive mechanical components. In both vision and control, the key idea is to design feedback mechanisms that exploit any available geometric and/or photometric invariants of the problem. As an illustration, I will describe an approach to hand-eye coordination that leads to provably convergent positioning without relying on accurate calibration. Given time and interest, I will also present a second example in the domain of visual tracking under changing illumination.
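To make the flavor of such a feedback law concrete, here is a minimal sketch (in Python, and not the specific method presented in the talk) of calibration-free, image-based positioning: the Jacobian relating robot motion to image-feature motion is estimated online with a Broyden-style secant update, so convergence does not hinge on an accurate camera or kinematic model. The callbacks observe_features and send_velocity, the gain lam, and all other parameter values are hypothetical placeholders.

    import numpy as np

    def broyden_servo(observe_features, send_velocity, target,
                      J0, lam=0.1, tol=1.0, dt=0.05, max_iters=500):
        """Illustrative uncalibrated visual servoing loop.

        observe_features() -> current image-feature vector f (length m)
        send_velocity(u)   -> command robot velocity u (length n) for dt seconds
        target             -> desired feature vector f*
        J0                 -> crude initial guess of the m x n image Jacobian
        """
        J = J0.astype(float)
        f = observe_features()
        while np.linalg.norm(target - f) > tol and max_iters > 0:
            err = target - f
            # Resolved-rate step: drive the image error toward zero
            # using the current Jacobian estimate.
            u = lam * np.linalg.pinv(J) @ err
            send_velocity(u)
            f_new = observe_features()
            # Broyden secant update: correct J so it explains the
            # feature motion actually observed for the commanded motion.
            dq = u * dt
            df = f_new - f
            denom = dq @ dq
            if denom > 1e-9:
                J += np.outer(df - J @ dq, dq) / denom
            f = f_new
            max_iters -= 1
        return f

The point of the sketch is that only a crude initial Jacobian guess is assumed; the secant update lets the observed feature motion itself tune the feedback law, rather than relying on a calibrated model.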

To facilitate the construction of vision-based tracking and control systems, we have developed XVision, a software toolkit for visual tracking on common PCs and workstations. In addition to illustrating XVision in the context of the above examples, I will argue that our approach to vision is likely to lead to a variety of novel and interesting (and fun!) vision applications.
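XVision's actual interfaces are not reproduced here, but as a rough illustration of the kind of region-based tracking such a toolkit supports, the following sketch (with hypothetical names and parameters) estimates the frame-to-frame translation of a small template by SSD minimization using a standard Lucas-Kanade step. Normalizing the template and window to zero mean and unit variance, or adding explicit gain and bias terms to the residual, is one common way to obtain some robustness to the illumination changes mentioned above.

    import numpy as np

    def track_translation(prev, curr, box, iters=10):
        """Track a rectangular template by SSD minimization (Lucas-Kanade).

        prev, curr : consecutive grayscale frames as 2-D float arrays
        box        : (row, col, height, width) of the template in prev
        Returns the estimated (dr, dc) translation; assumes the motion
        keeps the tracked window inside the frame.
        """
        r, c, h, w = box
        T = prev[r:r+h, c:c+w]
        # Template gradients, held fixed across iterations
        # (an inverse-compositional-style simplification).
        gy, gx = np.gradient(T)
        A = np.stack([gy.ravel(), gx.ravel()], axis=1)   # (h*w) x 2
        H = A.T @ A                                      # 2 x 2 normal matrix
        dr = dc = 0.0
        for _ in range(iters):
            rr, cc = int(round(r + dr)), int(round(c + dc))
            I = curr[rr:rr+h, cc:cc+w]
            e = (I - T).ravel()
            # Solve the 2x2 least-squares system for the update.
            step = np.linalg.solve(H, A.T @ e)
            dr -= step[0]
            dc -= step[1]
        return dr, dc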