Knn

From Eigenvector Research Documentation Wiki
Latest revision as of 15:13, 6 February 2020

Purpose

K-nearest neighbor classifier.

Synopsis

pclass = knn(xref,xtest,k,options); %make prediction without model
pclass = knn(xref,xtest,options); %use default k
model = knn(xref,k,options) %create model
modelp = knn(xtest,model,k,options) %apply model to xtest
modelp = knn(xtest,model,options) %apply model to xtest; predictions (equivalent to pclass) in modelp.classification.mostprobable.
[pclass,closest,votes] = knn(xref,xtest,k,options); %make prediction without model
[pclass,closest,votes] = knn(xref,xtest,options); %use default k
[pclass,closest,votes] = knn(xref,k,options); %self-prediction without model
knn % Launches an Analysis window with KNN as the selected method.

Note that the recommended way to build and apply a K-nearest neighbor model from the command line is to use the Model Object. See the EVRIModel_Objects wiki page on building and applying models using the Model Object.
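
The model-based calls in the synopsis can be combined into a build-then-apply workflow. The following is a minimal sketch, assuming PLS_Toolbox is installed and on the MATLAB path. The data and variable names are hypothetical. Two steps are assumptions not stated on this page: that sample classes are assigned through the DataSet object's class field, and that default options can be requested with the 'options' keyword.

```matlab
% Hypothetical data: 6 reference samples in 2 classes, 2 unknown samples.
refdata = [0 0; 0 1; 1 0; 5 5; 5 6; 6 5];
xref = dataset(refdata);               % DataSet object of reference data
xref.class{1,1} = [1 1 1 2 2 2];       % assumption: sample classes live in class set 1
xtest = [0.5 0.5; 5.5 5.5];            % test data may be a plain double

k = 3;
options = knn('options');              % assumption: 'options' keyword returns defaults
options.display = 'off';

model  = knn(xref,k,options);          % create model
modelp = knn(xtest,model,options);     % apply model to xtest
pclass = modelp.classification.mostprobable;   % predictions (equivalent to pclass)
```

Keeping the model and the prediction as separate steps lets the same model be reused on later test sets without passing the reference data again.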

Description

Performs kNN classification where the "k" closest samples in a reference set vote on the class of an unknown sample based on distance to the reference samples. If no majority is found, the unknown is assigned the class of the closest sample (see input options for other no-majority behaviors).
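
The voting rule described above can be sketched in base MATLAB without the toolbox. This is an illustration only, not the toolbox implementation: it assumes Euclidean distance and treats a tie for the most votes as "no majority", neither of which is confirmed by this page. All data and variable names are hypothetical.

```matlab
% Minimal kNN vote sketch in base MATLAB (not the toolbox implementation).
% Assumptions: Euclidean distance; "no majority" means a tie for the most votes.
xref     = [0 0; 0 1; 1 0; 5 5; 5 6; 6 5];   % hypothetical reference samples (rows)
refClass = [1; 1; 1; 2; 2; 2];               % hypothetical class of each reference sample
xtest    = [0.5 0.5; 5.5 5.5];               % hypothetical unknown samples
k        = 3;

ntest  = size(xtest,1);
pclass = zeros(ntest,1);
for i = 1:ntest
    diffs = xref - repmat(xtest(i,:), size(xref,1), 1);  % row-wise differences
    d     = sqrt(sum(diffs.^2, 2));                      % distance to each reference sample
    [~, order] = sort(d);                                % nearest first
    v       = refClass(order(1:k));                      % classes voted by the k neighbors
    classes = unique(v);
    counts  = arrayfun(@(c) sum(v == c), classes);       % votes per class
    winners = classes(counts == max(counts));
    if numel(winners) == 1
        pclass(i) = winners;      % a single class received the most votes
    else
        pclass(i) = v(1);         % no majority: use the class of the closest sample
    end
end
disp(pclass)
```

The else branch corresponds to the default 'closest' behavior; the other no-majority behaviors listed under Options would replace that branch.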

Inputs

  • xref = a DataSet object of reference data.
  • xtest = a DataSet object or Double containing the unknown test data.

Optional Inputs

  • model = an optional standard KNN model structure which can be passed instead of xref (note order of inputs: (xtest,model) ) to apply model to test data.
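  • k = number of neighbors to use in the vote for the class of an unknown {default = 3}. If k=1, only the nearest sample will define the class of the unknown.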

Outputs

  • pclass = the voted closest class, if a majority of nearest neighbors were of the same class, or the class of the closest sample, if no majority was found (Only returned if xtest is supplied).
  • closest = matrix of samples (rows) by closest neighbor index (columns). Will always have k columns indicating which samples were the closest to the given sample (row).
  • votes = matrix of samples (rows) by class numbers voted for (columns). Will always have k columns indicating which class each nearest neighbor voted for, in the same order as the columns of the closest matrix.
  • model = if no test data (xtest) is supplied, a standard model structure is returned which can be used with test data in the future to perform a prediction. Note that information about the classification of X-block samples is available in the classification field, described at Standard Model.

For more information on class predictions, see Sample Classification Predictions.
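
The relationship between the closest and votes outputs can be illustrated with small hand-made matrices in base MATLAB. The values below are hypothetical and are not output from knn; the sketch assumes that each entry of votes is the class of the reference sample indexed by the matching entry of closest, which is how the descriptions above read.

```matlab
% Hypothetical example: 5 reference samples, 2 test samples, k = 3.
refClass = [1; 1; 2; 2; 3];        % class number of each reference sample
closest  = [1 2 3; 4 3 5];         % neighbor indices, nearest first (one row per test sample)

votes = refClass(closest);         % class voted by each neighbor, same shape as closest
tally = mode(votes, 2);            % most frequent class in each row
disp(votes)
disp(tally)
```

Note that mode returns the smallest value when there is a tie, so it does not reproduce the no-majority behaviors described under Options.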

Options

options = structure array with the following fields:

  • display: [ 'off' | {'on'} ] governs level of display to screen.
  • waitbar: [ 'off' | 'on' | {'auto'} ] governs display of a waitbar when classifying. 'on' always shows a waitbar, 'off' never shows a waitbar, 'auto' shows a waitbar only when the data is particularly large.
  • preprocessing: { [ ] } A cell containing a preprocessing structure or keyword (see PREPROCESS). Use {'autoscale'} to perform autoscaling on reference and test data.
  • classset: [ 1 ] indicates which class set in xref to use.
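  • nomajority: [ 'error' | {'closest'} | class_number ] Behavior when no majority is found in the votes. 'closest' = return class of closest sample. 'error' = give error message. class_number (i.e. any numerical value) = return this value for no-majority votes (e.g. use 0 to return zero for all no-majority votes).
  • strictthreshold: Probability threshold value to use in strict class assignment, see Sample Classification Predictions (Class Pred Strict). Default = 0.5.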
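
The options above can be set as fields of a structure before calling knn. This is a minimal sketch, assuming PLS_Toolbox is installed; retrieving the defaults with the 'options' keyword and assigning classes through the DataSet object's class field are assumptions not stated on this page, and the data is hypothetical.

```matlab
% Hypothetical data: 6 reference samples in 2 classes, 2 unknown samples.
xref = dataset([0 0; 0 1; 1 0; 5 5; 5 6; 6 5]);
xref.class{1,1} = [1 1 1 2 2 2];         % assumption: sample classes live in class set 1
xtest = [0.5 0.5; 5.5 5.5];

options = knn('options');                % assumption: 'options' keyword returns defaults
options.display       = 'off';           % suppress display to screen
options.waitbar       = 'off';           % never show a waitbar
options.preprocessing = {'autoscale'};   % autoscale reference and test data
options.classset      = 1;               % use class set 1 of xref
options.nomajority    = 0;               % return 0 for no-majority votes

[pclass,closest,votes] = knn(xref,xtest,3,options);   % make prediction without model
```

Setting nomajority to a numeric value such as 0 makes undecided samples easy to find afterwards, for example with pclass == 0.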

See Also

analysis, cluster, dbscan, knnscoredistance, modelselector, plsda, simca, svmda, EVRIModel_Objects