SVM Function Settings

From Eigenvector Research Documentation Wiki
Revision as of 00:45, 19 February 2010 by imported>Donal (→‎Kernel Type)
Jump to navigation Jump to search

Support Vector Machines

SVMs are non-linear models which can be used for regression or classification problems. The following settings can be access from the SVM Function Settings panel in the Analysis GUI or from the Options GUI.


svm function settings

SVM Type

Classification (SVMDA)

  • Nu-SVM optimizes a model with an adjustable parameter Nu (0 -> 1] which indicates the upper bound on the number of misclassifications allowed.
  • C-SVC optimizes a model with an adjustable cost function C [0 -> inf] which indicates how strongly misclassifications should be penalized.

Regression (SVM)

(shown above)

  • Epsilon-SVR optimizes a model using the adjustable parameters epsilon (upper tolerance on prediction errors) and C (cost of prediction errors larger than epsilon.)
  • Nu-SVR optimizes a model using the adjustable parameter Nu (0 -> 1] which indicates a lower bound on the number of support vectors to use given as a fraction of total calibration samples.

Kernel Type

Two kernel types are supported, Linear and Gaussian Radial Basis Function (RBF). Linear kernel has no parameters but the RBF has a ‘gamma’ parameter (> 0). This parameter if available through the Options GUI.

Compression

X-block data may be compressed by applying PCA or PLS to the data to reduce the number of X-block variables. SVM is then applied to the resulting scores values. The number of latent variables (principal components) used in the compression model must be supplied. The compression model is included as part of the SVM model and will be automatically applied to new data.

Probability Estimates

For a given sample, estimates the per-class probability of the sample’s class membership. Results can be seen in the Scores Plot. This feature is only available for classification (SVMDA) type models.

Cross-validation

Performs n-fold cross validation. Data are randomly divided into n groups. Each group is excluded in turn and an svm trained on the remaining groups (which are separately preprocessed) and validated against the excluded group. Note that cross-validation results from this algorithm can be notably different from the cross-validation results from other model types and care should be taken in comparing results. It is strongly recommended that a separate validation set be used for comparison to other model types.