Extracting Symbolic Representations from Trained Neural Networks

Mark Craven
Computer Sciences Dept.
University of Wisconsin-Madison

1:30 pm Mon. Feb. 12 in 2310 CS&S

Two of the most important criteria for evaluating machine learning systems are the predictive accuracy and the comprehensibility of their learned solutions. Neural networks have been used to develop highly accurate classifiers for numerous real-world problems. A significant limitation of neural networks, however, is that their learned solutions are usually extremely difficult to understand. I will present a novel algorithm, called TREPAN, that extracts comprehensible, symbolic representations from trained neural networks. TREPAN queries a given network to induce a decision tree that describes the concept represented by the network. I have evaluated TREPAN on several interesting, real-world domains. I will present experiments showing that TREPAN produces decision trees that are accurate and comprehensible, and that maintain a high level of fidelity to the networks from which they were extracted. Unlike previous work in this area, my algorithm is general in its applicability and scales well to large networks and to problems with high-dimensional input spaces.
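To make the query-based idea concrete, here is a minimal sketch of the general approach: treat the trained network as an oracle, query it to label sampled inputs, and then induce a tree that mimics those labels, scoring the result by its fidelity to the network. This is not Craven's actual TREPAN algorithm (which grows deeper trees with m-of-n splits and draws extra queries where data is sparse); the oracle function, the depth-one "stump" learner, and the decision rule inside the oracle are all hypothetical stand-ins for illustration.

```python
import random

def network_oracle(x):
    # Stand-in for a trained network's prediction function; the linear
    # decision rule here is a hypothetical placeholder.
    return 1 if x[0] + 2.0 * x[1] > 1.0 else 0

def induce_stump(samples, labels):
    # Find the single axis-aligned split that best matches the oracle's
    # labels (a depth-1 decision tree; TREPAN itself grows full trees).
    best = None
    for f in range(len(samples[0])):
        for t in {x[f] for x in samples}:
            matches = sum((x[f] > t) == bool(y)
                          for x, y in zip(samples, labels))
            if best is None or matches > best[0]:
                best = (matches, f, t)
    return best  # (match count, feature index, threshold)

random.seed(0)
samples = [(random.random(), random.random()) for _ in range(500)]
labels = [network_oracle(x) for x in samples]   # query the "network"

matches, feature, threshold = induce_stump(samples, labels)
fidelity = matches / len(samples)   # agreement between tree and network
print(f"split on x[{feature}] > {threshold:.2f}, fidelity {fidelity:.2f}")
```

Note that fidelity is measured against the network's outputs, not the true labels: the extracted tree is an explanation of the network, so it is judged by how faithfully it reproduces the network's behavior.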