Learning Sparse Perceptrons

Mark Craven
Computer Sciences Dept.
University of Wisconsin-Madison

E-mail: craven@cs.wisc.edu

2:30 pm Fri. Dec. 15 in 2310 Computer Sciences and Statistics

An issue that arises whenever machine-learning algorithms are applied in practice is deciding how to represent problem instances to the learning system. For learning algorithms that use feature-value representations, this decision entails selecting the set of features used to represent each instance. Theoretical and empirical results in machine learning have shown that irrelevant features can degrade both the accuracy and the interpretability of induced hypotheses. I will present a new algorithm that is designed to learn in the presence of irrelevant features by being economical in incorporating features into its induced hypotheses. Specifically, our algorithm learns sparse perceptrons, that is, linear discriminant functions over relatively few terms, where each term is either an individual input feature or a conjunction of input features. Our Boosting-based Perceptron (BBP) learning algorithm builds on a recent hypothesis-boosting algorithm and thus provides theoretical guarantees on the learnability of target functions that are sparse perceptrons. Finally, I will present experiments showing that, on several real-world learning tasks, the BBP algorithm induces hypotheses that are both accurate and relatively comprehensible.
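
As a rough illustration of the idea only (not the BBP algorithm itself, whose details are the subject of the talk), the sketch below shows what a sparse perceptron hypothesis might look like: a weighted vote over a handful of terms, each term a single Boolean feature or a small conjunction of features. The term-selection loop is a generic greedy, boosting-style round that repeatedly adds the candidate term with the lowest weighted error and multiplicatively reweights the examples; the function names, the candidate-term generation, and the particular reweighting rule are all assumptions made for illustration.

```python
import itertools
import math
import random


def make_terms(n_features, max_conj=2):
    """Candidate terms: single features and pairwise conjunctions (illustrative choice)."""
    terms = [(i,) for i in range(n_features)]
    if max_conj >= 2:
        terms += list(itertools.combinations(range(n_features), 2))
    return terms


def eval_term(term, x):
    """A term outputs +1 when every feature in the conjunction is 1, else -1."""
    return 1 if all(x[i] == 1 for i in term) else -1


def boost_sparse_perceptron(X, y, n_rounds=5):
    """Greedy, boosting-style construction of a sparse perceptron (generic sketch, not BBP).

    Each round picks the candidate term (or its negation) with the lowest weighted
    error, weights it by 0.5 * log((1 - err) / err), and multiplicatively
    reweights the examples it misclassifies.
    """
    m, n = len(X), len(X[0])
    D = [1.0 / m] * m                       # example weights
    terms = make_terms(n)
    hypothesis = []                         # list of (weight, sign, term)
    for _ in range(n_rounds):
        best = None
        for term in terms:
            for sign in (1, -1):
                err = sum(D[j] for j in range(m)
                          if sign * eval_term(term, X[j]) != y[j])
                if best is None or err < best[0]:
                    best = (err, sign, term)
        err, sign, term = best
        err = min(max(err, 1e-6), 1 - 1e-6)  # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        hypothesis.append((alpha, sign, term))
        # Reweight: increase weight on examples this term gets wrong.
        D = [D[j] * math.exp(-alpha * y[j] * sign * eval_term(term, X[j]))
             for j in range(m)]
        z = sum(D)
        D = [d / z for d in D]
    return hypothesis


def predict(hypothesis, x):
    """Sparse perceptron output: sign of a weighted sum over the few chosen terms."""
    s = sum(alpha * sign * eval_term(term, x) for alpha, sign, term in hypothesis)
    return 1 if s >= 0 else -1


if __name__ == "__main__":
    # Toy target: y = x0 AND x1, with six irrelevant features x2..x7.
    random.seed(0)
    X = [[random.randint(0, 1) for _ in range(8)] for _ in range(200)]
    y = [1 if x[0] == 1 and x[1] == 1 else -1 for x in X]
    h = boost_sparse_perceptron(X, y, n_rounds=3)
    acc = sum(predict(h, x) == t for x, t in zip(X, y)) / len(X)
    print("terms used:", [(term, round(alpha, 2)) for alpha, _, term in h])
    print("training accuracy:", acc)
```

On the toy task above, the few terms selected pick out the relevant conjunction and ignore the irrelevant features, which is the kind of economy in feature use the abstract describes; the actual guarantees and experimental results, of course, refer to the BBP algorithm presented in the talk, not to this sketch.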