Learning Sparse Perceptrons
Mark Craven
Computer Sciences Dept.
University of Wisconsin-Madison
E-mail: craven@cs.wisc.edu
2:30 pm Fri. Dec. 15 in 2310 Computer Sciences and Statistics
An issue that arises whenever machine-learning algorithms are applied in
practice is deciding how to represent problem instances to the learning
system. For learning algorithms that use feature-value representations,
this decision entails selecting the set of features used to represent
each instance. Theoretical and empirical results in machine
learning have shown that irrelevant features can lead to degradation in
both the accuracy and the interpretability of induced hypotheses.
I will present a new algorithm that is designed to learn in the presence of
irrelevant features by being economical in incorporating features into its
induced hypotheses. Specifically, our algorithm learns sparse perceptrons;
that is, linear discriminant functions over relatively few terms.
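In symbols (the notation here is illustrative, not taken from the talk), such a
hypothesis can be written as a thresholded weighted vote over a small set of terms:

    f(x) = \mathrm{sign}\Big( \sum_{i=1}^{s} w_i \, h_i(x) \Big)

where each h_i is a term, each w_i is a real-valued weight, and the number of
terms s is small relative to the number of available features.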
The terms in our sparse perceptrons are individual input features and
conjunctions of input features. Our Boosting-based Perceptron (BBP) learning
algorithm is based on a recent hypothesis-boosting algorithm and thus
provides theoretical guarantees on the learnability of target functions
that are sparse perceptrons. Furthermore, I will present experiments
showing that, on several real-world learning tasks, our BBP algorithm
induces hypotheses that are both accurate and relatively comprehensible.
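As a rough illustration of the general approach (an AdaBoost-style sketch, not
the BBP algorithm presented in the talk; all names and details below are
assumptions), the following Python fragment builds a sparse perceptron by
boosting: each round selects the single term with the lowest weighted error,
assigns it a vote weight, and reweights the training examples.

    # Illustrative sketch only: a boosting loop whose weak hypotheses are
    # single Boolean terms, so the final vote is a sparse perceptron.
    # This is NOT the BBP algorithm from the talk.
    import math

    def best_term(terms, X, y, dist):
        """Pick the term with the smallest weighted error under `dist`.

        `terms` maps a term name to a function x -> {-1, +1};
        `X` is a list of instances, `y` a list of {-1, +1} labels."""
        best, best_err = None, float("inf")
        for name, h in terms.items():
            err = sum(d for x, label, d in zip(X, y, dist) if h(x) != label)
            if err < best_err:
                best, best_err = name, err
        return best, best_err

    def boost_sparse_perceptron(terms, X, y, rounds=10):
        """Return a list of (weight, term-name) pairs: a sparse perceptron."""
        n = len(X)
        dist = [1.0 / n] * n                        # uniform example weights
        hypothesis = []
        for _ in range(rounds):
            name, err = best_term(terms, X, y, dist)
            if err >= 0.5:                          # no term beats random guessing
                break
            err = max(err, 1e-10)
            alpha = 0.5 * math.log((1 - err) / err)  # the term's vote weight
            hypothesis.append((alpha, name))
            # Reweight: up-weight the examples this term got wrong.
            h = terms[name]
            dist = [d * math.exp(-alpha * label * h(x))
                    for x, label, d in zip(X, y, dist)]
            z = sum(dist)
            dist = [d / z for d in dist]
        return hypothesis

    def predict(hypothesis, terms, x):
        vote = sum(alpha * terms[name](x) for alpha, name in hypothesis)
        return 1 if vote >= 0 else -1

In this toy version, the pool of terms would contain one function per input
feature and per small conjunction of features; sparsity comes from running only
a few boosting rounds, so the final vote mentions only the terms actually
selected.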