Knn

From Eigenvector Research Documentation Wiki
Revision as of 14:49, 18 September 2008 by imported>Scott

Purpose

K-nearest neighbor classifier.

Synopsis

pclass = knn(xref,xtest,k,options); %make prediction without model
pclass = knn(xref,xtest,options); %use default k
model = knn(xref,k,options); %create model
pclass = knn(xtest,model,options); %apply model to xtest

Description

Performs kNN classification: the k reference samples closest to an unknown sample, as measured by distance, vote on the class of that unknown. If no majority is found, the unknown is assigned the class of the closest sample (see the nomajority option for other no-majority behaviors).
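The voting rule described above can be sketched in plain MATLAB. This is an illustration only, not the toolbox implementation: the data and variable names are made up, Euclidean distance is assumed (the text says only "distance"), and "no majority" is assumed to mean a tie for the most votes.

```matlab
% Sketch only: plain MATLAB/Octave, not the toolbox's knn code.
% All data and variable names below are hypothetical.
xref  = [0 0; 0 1; 1 0; 5 5; 5 6; 6 5];   % reference samples (one per row)
cref  = [1; 1; 1; 2; 2; 2];               % class of each reference sample
xtest = [0.5 0.5; 5.5 5.5];               % unknown samples (one per row)
k     = 3;                                % number of neighbors that vote

pclass = zeros(size(xtest,1),1);
for i = 1:size(xtest,1)
  % Euclidean distance (assumed) from unknown i to every reference sample
  diffs = xref - repmat(xtest(i,:), size(xref,1), 1);
  d = sqrt(sum(diffs.^2, 2));
  [~, order] = sort(d);                   % nearest first
  votes = cref(order(1:k));               % classes of the k nearest samples

  % Count the votes for each class that appears among the neighbors
  classes = unique(votes);
  counts  = zeros(size(classes));
  for j = 1:numel(classes)
    counts(j) = sum(votes == classes(j));
  end

  winners = classes(counts == max(counts));
  if numel(winners) == 1
    pclass(i) = winners;                  % a single class has the most votes
  else
    pclass(i) = votes(1);                 % assumed tie rule: class of closest sample
  end
end
disp(pclass')
```

With this toy data the first unknown sits among the class 1 samples and the second among the class 2 samples, so the three nearest neighbors agree in both cases and the tie branch is never reached.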

Inputs

  • xref = a DataSet object of reference data.
  • xtest = a DataSet object or Double containing the unknown test data.

Optional Inputs

  • model = an optional standard KNN model structure which can be passed instead of xref (note order of inputs: (xtest,model) ) to apply model to test data.
  • k = an optional number of neighbors to use in the vote for the class of an unknown {default = 3}. If k=1, only the nearest sample will define the class of the unknown.

Outputs

  • pclass = the class predicted for each unknown sample in xtest.
  • model = if no test data (xtest) is supplied, a standard model structure is returned which can be used with test data in the future to perform a prediction.

Options

  • options = structure array with the following fields:
  • display: [ 'off' | {'on'} ] governs level of display to screen.
  • preprocessing: { [ ] } A cell containing a preprocessing structure or keyword (see PREPROCESS). Use {'autoscale'} to perform autoscaling on reference and test data.
  • nomajority: [ 'error' | {'closest'} | class_number ] governs behavior when no majority is found in the votes. 'closest' = return the class of the closest sample. 'error' = give an error message. class_number (i.e. any numerical value) = return this value for no-majority votes (e.g. use 0 to return zero for all no-majority votes).
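The preprocessing and nomajority options above can be illustrated with a short sketch in plain MATLAB. It does not call the toolbox and all data and variable names are hypothetical. Two assumptions are made: autoscaling is taken to mean mean-centering and scaling to unit standard deviation using statistics of the reference data, and "no majority" is taken to mean a tie for the most votes.

```matlab
% Sketch only: plain MATLAB/Octave, not the toolbox's code. Data are made up.

% Part 1: autoscaling (assumed: reference mean and standard deviation are
% computed once and then applied to both reference and test data)
xref  = [1 10; 2 20; 3 30];
xtest = [2 40];
mu = mean(xref, 1);
sd = std(xref, 0, 1);
xref_s  = (xref  - repmat(mu, size(xref,1), 1))  ./ repmat(sd, size(xref,1), 1);
xtest_s = (xtest - repmat(mu, size(xtest,1), 1)) ./ repmat(sd, size(xtest,1), 1);

% Part 2: resolving a tied vote under each kind of nomajority setting.
% "votes" holds the classes of the k nearest samples, nearest first.
votes    = [2 1 1 2];                     % k = 4, two votes each: a tie
settings = {'closest', 0, 'error'};       % the three kinds of nomajority value
results  = cell(1, numel(settings));

for s = 1:numel(settings)
  nomajority = settings{s};
  classes = unique(votes);
  counts  = arrayfun(@(c) sum(votes == c), classes);
  winners = classes(counts == max(counts));
  if numel(winners) == 1
    results{s} = winners;                 % a winner exists; option is not used
  elseif isnumeric(nomajority)
    results{s} = nomajority;              % return the supplied class number
  elseif strcmp(nomajority, 'closest')
    results{s} = votes(1);                % class of the nearest sample
  else
    results{s} = 'error';                 % stand-in: a real call would raise an error
  end
end
```

In this sketch the tied vote resolves to class 2 under 'closest' (the nearest neighbor's class) and to 0 when the numeric value 0 is supplied; the 'error' case is only marked with a string here rather than actually raising an error.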

See Also

analysis, cluster, plsda, simca