


Runs n-fold cross-validation on data with a given classifier.
Given n separate labeled data sets, trains the classifier on n-1 of them and tests on the
remaining one. Results are averaged over all n such runs, and the overall result is shown
as an average confusion matrix.
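The procedure above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation. The function name nfold_sketch and the ntypes argument are hypothetical, labels are assumed to be integers in 1..ntypes, the classifier is assumed to follow the clf.fun_train / clf.fun_fwd calling conventions given in this help text, and the confusion matrix is assumed to have true labels along rows and predictions along columns.

```matlab
function CM = nfold_sketch( data, IDX, clfinit, clfparams, ntypes )
% Hypothetical sketch of an n-fold cross-validation loop.
% Assumes labels are integers in 1..ntypes (an assumption of this sketch).
n = numel(data);                       % number of data sets (folds)
p = size(data{1},2);                   % dimension of the data
CM = zeros(ntypes);                    % accumulated confusion matrix
for i = 1:n
  trainIdx = [1:i-1, i+1:n];           % every fold except fold i
  Xtrain = cat(1, data{trainIdx});     % stack training samples vertically
  Ytrain = cat(1, IDX{trainIdx});      % stack training labels vertically
  clf = clfinit( p, clfparams{:} );    % initialize a fresh classifier
  clf = clf.fun_train( clf, Xtrain, Ytrain );
  pred = clf.fun_fwd( clf, data{i} );  % predict labels of held-out fold
  % count (true,predicted) pairs: rows are true labels, columns predictions
  CM = CM + accumarray( [IDX{i}(:) pred(:)], 1, [ntypes ntypes] );
end
CM = CM / n;                           % average over the n runs
end
```

Dividing by n at the end matches the "average" wording above. A variant that returns the summed counts would simply drop the last line.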
The classifier is passed in as a parameter. For this to work, the classifier (clf) must
follow these conventions:
1) The following call must initialize the clf ('p' is the dimension of the data):
clf = clfinit( p, clfparams{:} )
2) The created clf must hold handles to two functions, one for training it and one for applying it:
clf.fun_train and clf.fun_fwd
3) For training, the following will be called:
clf = clf.fun_train( clf, X, Y );
4) For testing, the following will be called:
pred = clf.fun_fwd( clf, Xtest );
X is an n x p array, where n is the number of data points and p is their dimension.
Y is an n x 1 array of labels. An example of a classifier is: clfinit = @clf_lda
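As an illustration of these conventions, here is a minimal sketch of a nearest-class-mean classifier. The name clf_nearmean and its helper functions are hypothetical and not part of the toolbox. The sketch only shows the shape of the init / fun_train / fun_fwd interface.

```matlab
function clf = clf_nearmean( p )
% Hypothetical classifier init: called as clf = clfinit( p, clfparams{:} ).
clf.p = p;                              % dimension of the data
clf.labels = [];                        % class labels seen during training
clf.means = zeros(0,p);                 % one row per class mean
clf.fun_train = @clf_nearmean_train;    % handle used for training
clf.fun_fwd = @clf_nearmean_fwd;        % handle used for applying the clf
end

function clf = clf_nearmean_train( clf, X, Y )
% X is n x p, Y is n x 1; stores the mean of each class.
clf.labels = unique(Y);
k = numel(clf.labels);
clf.means = zeros(k, clf.p);
for c = 1:k
  clf.means(c,:) = mean( X(Y==clf.labels(c),:), 1 );
end
end

function pred = clf_nearmean_fwd( clf, Xtest )
% Returns an n x 1 vector with the label of the closest class mean.
n = size(Xtest,1);  k = numel(clf.labels);
D = zeros(n,k);                         % squared distances to each mean
for c = 1:k
  dX = Xtest - repmat( clf.means(c,:), [n 1] );
  D(:,c) = sum( dX.^2, 2 );
end
[~,j] = min( D, [], 2 );
pred = clf.labels(j);
end
```

Under these assumptions it would be passed as clfinit = @clf_nearmean with clfparams = {}.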
If the data is stored in cell array format, it can be concatenated into single arrays:
IDX = cell2mat( permute( IDX, [2 1] ) ); data = cell2mat( permute( data, [2 1] ) );
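A small worked example of this concatenation, using made-up toy values (two data sets with p=2):

```matlab
% Toy input: the first data set holds 2 samples, the second holds 1.
data = { [1 2; 3 4], [5 6] };     % 1x2 cell array of (m x p) arrays
IDX  = { [1; 2], 1 };             % matching 1x2 cell array of (m x 1) labels
% permute turns each 1x2 cell array into 2x1, so cell2mat stacks vertically
data = cell2mat( permute( data, [2 1] ) );   % 3x2 array [1 2; 3 4; 5 6]
IDX  = cell2mat( permute( IDX, [2 1] ) );    % 3x1 array [1; 2; 1]
```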
Conversely, for a small dataset stored as single arrays, placing each sample in its own cell sets up leave-one-out classification:
[n,p]=size(data); IDX=mat2cell(IDX,ones(1,n),1); data=mat2cell(data,ones(1,n),p);
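The same step on made-up toy values, to show the resulting shapes:

```matlab
% Toy input: 3 samples of dimension 2 with their labels.
data = [1 2; 3 4; 5 6];
IDX  = [1; 2; 1];
[n,p] = size(data);
% one cell per sample, so each "fold" contains exactly one data point
IDX  = mat2cell( IDX, ones(1,n), 1 );    % 3x1 cell array of 1x1 labels
data = mat2cell( data, ones(1,n), p );   % 3x1 cell array of 1xp samples
```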
Overall error can be calculated via:
er = 1-sum(diag(CM))/sum(CM(:))
Normalized confusion matrix can be calculated via:
CMn = CM ./ repmat( sum(CM,2), [1 size(CM,2)] );
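For instance, with a made-up 2 x 2 confusion matrix (assuming rows index the true class, which is consistent with the row normalization above):

```matlab
% Toy confusion matrix: 10 samples per true class.
CM = [8 2; 1 9];
% fraction of samples off the diagonal: 1 - 17/20 = 0.15
er = 1 - sum(diag(CM))/sum(CM(:));
% divide each row by its sum so rows add up to 1: [0.8 0.2; 0.1 0.9]
CMn = CM ./ repmat( sum(CM,2), [1 size(CM,2)] );
```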
INPUTS
data - cell array with one entry per data set; each entry is an (m x p) array of m samples of dim p
IDX - cell array with one entry per data set; each entry is an (m x 1) array of m labels
clfinit - classifier initialization function
clfparams - cell array of classifier parameters, passed as clfinit( p, clfparams{:} )
types - [optional] cell array of string labels for types
ignoretypes - [optional] array of types we aren't interested in (e.g. [1 4 5])
fname - [optional] file to save CM to, along with an image
show - [optional] if given, displays results in figure(show)
OUTPUTS
CM - confusion matrix
EXAMPLE
load clf_data; % 2 class data
nfoldxval( data, IDX, @clf_lda,{'linear'}, [],[],[],1 ); % LDA
nfoldxval( data, IDX, @clf_knn,{4},[],[],[],2 ); % k nearest neighbor with k=4
nfoldxval( data, IDX, @clf_svm,{'poly',2},[],[],[],3 ); % polynomial SVM
nfoldxval( data, IDX, @clf_svm,{'rbf',2^-12},[],[],[],4 ); % rbf SVM
nfoldxval( data, IDX, @clf_dectree,{},[],[],[],5 ); % decision tree
% for multi-class data (nclasses, the number of classes, must be defined first)
nfoldxval( data, IDX, @clf_ecoc,{@clf_svm,{'rbf',2^-12},nclasses},[],[],[],6 ); % ECOC
DATESTAMP
11-Oct-2005 2:45pm
See also CLF_LDA, CLF_KNN, CLF_SVM, CLF_ECOC