Nonlinear Mixture Models for Time Series Analysis: Discovering Regimes and Avoiding Overfitting

Prof. Andreas Weigend
University of Colorado at Boulder

12:05pm Tuesday May 9 in 8417 Social Science

When trying to forecast the future behavior of real-world systems, two of the key problems are overfitting (particularly serious for noisy processes) and regime switching (the underlying process changes its characteristics). In this talk we show how Gaussian mixture models with nonlinear experts point to solutions to these problems. In connectionist terms, the architecture consists of several experts and a gating net. Each nonlinear expert outputs the conditional mean (as usual) and also has its own adaptive width. The gating net outputs an input-dependent probability for each expert. Learning has a supervised component, which predicts the next value(s), and an unsupervised component, which discovers the (hidden) regimes. We report a number of results:

(1) the gating net discovers the different regimes that underlie the process: its outputs segment the data correctly into the different regimes.

(2) the widths associated with each expert characterize the sub-processes; that is, the variances give the expected squared error for each regime.

(3) there is significantly less overfitting compared to single nets, for two reasons: only subsets of the potential inputs are given to the experts and gating net, and the experts learn to match their variances to the (local) noise levels, thus only learning as much as the data support.

We compare these results from the mixture model to single networks of different sizes, as well as to nets with two outputs, one for the mean, the other one for the confidence interval as a function of the input. Several data sets are used: a computer-generated series, the laser data set from the Santa Fe Competition, the daily electricity demand of France, and the daily exchange rate between German marks and US dollars.
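
For readers who want a concrete picture of the architecture, the following is a minimal sketch of the mixture likelihood, not the implementation used in the talk. It assumes the expert means and gating logits have already been computed by some networks (not shown), that each expert has a single input-independent variance, and that the gating probabilities come from a softmax. The function names and the EM-style variance update are illustrative assumptions rather than details taken from the abstract.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mixture_nll_and_posteriors(y, means, variances, gate_logits):
    """Negative log-likelihood of a Gaussian mixture of experts.

    y           : (N,)   observed targets
    means       : (N, K) conditional mean predicted by each of K experts
    variances   : (K,)   one adaptive width (variance) per expert (assumption)
    gate_logits : (N, K) unnormalized gating outputs (assumption: softmax gate)
    """
    g = softmax(gate_logits)                      # input-dependent expert probabilities
    resid2 = (y[:, None] - means) ** 2            # squared error of each expert
    dens = np.exp(-0.5 * resid2 / variances) / np.sqrt(2.0 * np.pi * variances)
    mix = (g * dens).sum(axis=1)                  # mixture density of each target
    nll = -np.log(mix).sum()
    h = g * dens / mix[:, None]                   # posterior probability of each expert
    return nll, h

def update_variances(y, means, h):
    # Assumed EM-style update: each expert's variance becomes the
    # posterior-weighted mean of its squared errors.
    resid2 = (y[:, None] - means) ** 2
    return (h * resid2).sum(axis=0) / h.sum(axis=0)
```

In this sketch the posteriors `h` play the role of a soft segmentation of the data into regimes, and each updated variance is the expected squared error of one expert over the points it is held responsible for.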