nfoldxval

PURPOSE

Runs n-fold cross validation on data with a given classifier.

SYNOPSIS

function CM=nfoldxval( data, IDX, clfinit, clfparams, types, ignoretypes, fname, show )

DESCRIPTION

 Runs n-fold cross validation on data with a given classifier.

 Given n separate labeled data sets, trains the classifier on n-1 of them and tests on
 the remaining one.  Repeats this for all n choices of the held-out set, averages the
 results, and shows the overall result as an average confusion matrix.
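
 The procedure can be sketched roughly as below.  This is an illustrative sketch, not the
 actual implementation: the name xval_sketch is hypothetical, labels are assumed to be
 integers 1..k, counts are summed over folds rather than averaged, rows are assumed to be
 true labels and columns predictions, and ignoretypes, fname and show are not handled.

```matlab
function CM = xval_sketch( data, IDX, clfinit, clfparams )
% Hypothetical sketch of n-fold cross validation (not the toolbox code).
% data, IDX : cell arrays with one entry per fold
% Assumes labels are integers 1..k and predictions are n x 1 integer labels.
nfolds = numel( data );
p = size( data{1}, 2 );                    % dimension of the data
k = max( cellfun( @max, IDX(:) ) );        % assumed number of classes
CM = zeros( k, k );
for f = 1:nfolds
  trainids = setdiff( 1:nfolds, f );       % all folds except the held-out one
  Xtrain = cat( 1, data{trainids} );
  Ytrain = cat( 1, IDX{trainids} );
  clf = clfinit( p, clfparams{:} );        % fresh classifier for each fold
  clf = clf.fun_train( clf, Xtrain, Ytrain );
  pred = clf.fun_fwd( clf, data{f} );
  % accumulate counts: row = true label, column = predicted label (assumed)
  CM = CM + accumarray( [IDX{f} pred], 1, [k k] );
end
end
```

 Summing counts rather than averaging does not change the overall error or the
 row-normalized confusion matrix, since both are invariant to scaling CM.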

 The classifier is passed in as a parameter.  For this to work, the classifier (clf) must
 follow these conventions:
   1) The following must initialize the clf ('p' is the dimension of the data):
       clf = clfinit( p, clfparams{:} ) 
   2) The created clf must point to 2 functions for training and applying it:
       clf.fun_train    and   clf.fun_fwd
   3) For training the following will be called:
       clf = clf.fun_train( clf, X, Y );
   4) For testing the following will be called:
       pred = clf.fun_fwd( clf, Xtest );
 X is n x p, where n is the number of data points and p is their dimension; Y is n x 1.
 An example classifier is: clfinit = @clf_lda
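
 As an illustration of these conventions, here is a minimal sketch of a nearest-class-mean
 classifier.  The name clf_nearmean and its helper functions are hypothetical and are not
 part of the toolbox; the sketch assumes it is saved as clf_nearmean.m and that Y holds
 numeric labels.

```matlab
function clf = clf_nearmean( p )
% Hypothetical classifier init: called as clfinit( p, clfparams{:} ) with no params.
clf.p = p;                                 % dimension of the data
clf.fun_train = @nearmean_train;           % convention 2: training function
clf.fun_fwd   = @nearmean_fwd;             % convention 2: apply function
end

function clf = nearmean_train( clf, X, Y )
% Convention 3: X is n x p, Y is n x 1.  Stores one mean per class.
clf.labels = unique( Y );
k = numel( clf.labels );
clf.mu = zeros( k, clf.p );
for c = 1:k
  clf.mu(c,:) = mean( X( Y==clf.labels(c), : ), 1 );
end
end

function pred = nearmean_fwd( clf, Xtest )
% Convention 4: returns an n x 1 vector of predicted labels.
% Squared Euclidean distance from every test point to every class mean.
D = bsxfun( @plus, sum(Xtest.^2,2), sum(clf.mu.^2,2)' ) - 2*Xtest*clf.mu';
[~,j] = min( D, [], 2 );
pred = clf.labels( j );
end
```

 Under these assumptions it could be passed in like any other classifier, with an empty
 parameter list:
   nfoldxval( data, IDX, @clf_nearmean, {}, [],[],[],1 );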

 Given data in cell array format, it may be useful to concatenate it into single arrays:
   IDX = cell2mat( permute( IDX, [2 1] ) );  data = cell2mat( permute( data, [2 1] ) );
 For a simple, small dataset, the following sets up leave-one-out classification:
   [n,p]=size(data); IDX=mat2cell(IDX,ones(1,n),1);  data=mat2cell(data,ones(1,n),p);
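
 The two conversions can be tried on a toy example.  The values below are made up for
 illustration, and the variable names dataAll, IDXAll, dataLoo and IDXLoo are
 hypothetical; the data is assumed to start as a 1 x 2 cell array of folds.

```matlab
% Toy data: two folds with 2 and 3 samples of dimension 2 (made-up values).
data = { [1 2; 3 4], [5 6; 7 8; 9 10] };
IDX  = { [1; 2],     [1; 2; 1] };

% Concatenate the folds into single arrays: 5 x 2 samples and 5 x 1 labels.
dataAll = cell2mat( permute( data, [2 1] ) );
IDXAll  = cell2mat( permute( IDX,  [2 1] ) );

% Split into one sample per cell, so that each fold holds a single data point.
[n,p] = size( dataAll );
IDXLoo  = mat2cell( IDXAll,  ones(1,n), 1 );
dataLoo = mat2cell( dataAll, ones(1,n), p );

disp( size( dataAll ) );   % 5 2
disp( numel( dataLoo ) );  % 5
```

 With one sample per cell, each run trains on n-1 points and tests on the remaining one.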

 Overall error can be calculated via:
   er = 1-sum(diag(CM))/sum(CM(:))
 Normalized confusion matrix can be calculated via:
   CMn = CM ./ repmat( sum(CM,2), [1 size(CM,2)] );
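
 As a worked example of both formulas, consider a made-up 2-class confusion matrix (the
 numbers are illustrative only, not results from any classifier):

```matlab
% Made-up confusion matrix for 20 test samples, 10 per row.
CM = [8 2; 1 9];

% Overall error: fraction of samples off the diagonal, 1 - 17/20.
er = 1 - sum(diag(CM)) / sum(CM(:));

% Normalized confusion matrix: each row is divided by its row sum.
CMn = CM ./ repmat( sum(CM,2), [1 size(CM,2)] );

fprintf( 'error = %.2f\n', er );   % prints: error = 0.15
disp( CMn );                       % rows [0.8 0.2] and [0.1 0.9]
```

 Each row of CMn sums to 1, so its entries can be read as per-row fractions.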

 INPUTS
   data        - cell array of (n x p) arrays each of n samples of dim p
   IDX         - cell array of (n x 1) arrays each of n labels
   clfinit     - classifier initialization function
   clfparams   - classifier parameters 
   types       - [optional] cell array of string labels for types
   ignoretypes - [optional] array of types we aren't interested in {eg: [1 4 5]}.
   fname       - [optional] file to save CM to (an image is saved as well)
   show        - [optional] will display results in figure(show) 

 OUTPUTS
   CM          - confusion matrix

 EXAMPLE
   load clf_data; % 2 class data
   nfoldxval( data, IDX, @clf_lda,{'linear'}, [],[],[],1 );      % LDA
   nfoldxval( data, IDX, @clf_knn,{4},[],[],[],2 );              % 4-nearest neighbors
   nfoldxval( data, IDX, @clf_svm,{'poly',2},[],[],[],3 );       % polynomial SVM
   nfoldxval( data, IDX, @clf_svm,{'rbf',2^-12},[],[],[],4 );    % rbf SVM
   nfoldxval( data, IDX, @clf_dectree,{},[],[],[],5 );           % decision tree
   % for multi-class data
   nfoldxval( data, IDX, @clf_ecoc,{@clf_svm,{'rbf',2^-12},nclasses},[],[],[],6 ); % ECOC

 DATESTAMP
   11-Oct-2005  2:45pm

 See also CLF_LDA, CLF_KNN, CLF_SVM, CLF_ECOC

Generated on Wed 03-May-2006 23:48:50 by m2html © 2003