February 21
David Page, UW-Madison
Department of Biostatistics and Medical Informatics, Department of Computer Science, and Comprehensive Cancer Center

Supervised Learning for Gene Expression Microarray Data: Comparative Experiments with Multiple Myeloma

This talk compares leading supervised learning algorithms on a new gene expression microarray data set for cancer with over 100 samples. The paper provides evidence for several important lessons, including (1) Bayes nets and simple ensembles perform at least as well as other approaches and arguably provide more direct insight, and (2) looking for consistent differences in expression may be more important than looking for large differences. The supervised learning algorithms compared are decision trees, boosted decision trees, support vector machines, unweighted voting between decision stumps, and Bayes nets for classification, which may be viewed as sophisticated weighted voting. The data set is based on Affymetrix "gene chip" (TM) technology and comes from a study of multiple myeloma, a presently incurable blood cancer.

This is joint work with Fenghuang Zhan, James Cussens, Mike Waddell, Jo Hardin, Bart Barlogie, and John Shaughnessy.