Observing the Universe Can Drown You in Images: Data Mining Solutions at JPL
Usama Fayyad,
Microsoft Research (formerly at JPL)
2:30 pm Wed. May 1 in 1325 CS&S
Modern science instruments can gather data at rates that make traditional
inspection by humans infeasible. Techniques that automate the initial stages
of analysis, reducing the data until it can be analyzed by traditional
methods, are becoming a necessity in many fields. The talk will
describe efforts to develop a new generation of data mining systems where
users specify what to search for simply by providing the system with training
examples and letting it learn automatically what to do. The system then sifts
through the data and catalogs objects of interest for subsequent analysis.
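As a rough illustration of the train-by-example idea (not the method used by
the JPL systems, which this abstract does not specify), the sketch below
learns one centroid per class from a few labeled examples and then catalogs
unlabeled objects of the requested class. The nearest-centroid rule, the
two-number feature vectors, the object ids, and the "star"/"galaxy" labels are
all assumptions made for the example.

```python
from statistics import mean


def learn_centroids(examples):
    """Learn one centroid per label from (features, label) training pairs."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    # Average each feature column separately for every label.
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def catalog(centroids, objects, wanted_label):
    """Sift through (object_id, features) pairs and keep ids of the wanted class."""
    return [
        object_id
        for object_id, features in objects
        if classify(centroids, features) == wanted_label
    ]


# Hypothetical training examples supplied by the user: (features, label).
training = [
    ((0.9, 0.1), "star"),
    ((0.8, 0.2), "star"),
    ((0.2, 0.7), "galaxy"),
    ((0.3, 0.9), "galaxy"),
]

# Hypothetical unlabeled objects measured from the image data.
unlabeled = [
    ("obj-1", (0.85, 0.15)),
    ("obj-2", (0.25, 0.80)),
    ("obj-3", (0.10, 0.95)),
]

centroids = learn_centroids(training)
galaxies = catalog(centroids, unlabeled, "galaxy")
print(galaxies)
```

The user never writes a rule describing a galaxy; the labeled examples alone
determine what the catalog step keeps.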
Two applications at JPL will be used to illustrate the techniques
and their effects. The first targets automating the cataloging of sky objects
in a digitized sky survey consisting of three terabytes of image data and
containing on the order of two billion sky objects. The system (SKICAT) allows
for automated and accurate classification, enabling the automated cataloging
of billions of objects, the majority of which are too faint for visual
recognition by astronomers. The second part of the talk will cover JARtool
(JPL Adaptive Recognition Tool), which targets the detection and cataloging of
about 1 million small volcanoes visible in the Magellan SAR database of over
30,000 images of Venus.
The techniques described are applicable to a wide
range of problems and are largely independent of the fact that the data are
images. Potential applications include medical imaging, automated
inspection and diagnosis in manufacturing, decision support systems, database
marketing, and summarization/visualization of large databases. More
information on the JPL Machine Learning Systems Group is at
http://www-aig.jpl.nasa.gov/mls/.