Svmda
Revision as of 23:24, 25 January 2010
Purpose
SVMDA: Support Vector Machine (LIBSVM) for classification.
Synopsis
- model = svmda(x,y,options); %identifies model (calibration step)
- pred = svmda(x,model,options); %makes predictions with a new X-block
- pred = svmda(x,y,model,options); %performs a "test" call with a new X-block and known y-values
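The three call forms above could be exercised as in the following minimal sketch. The data are synthetic and purely illustrative, and the call svmda('options') is an assumption (the usual way of obtaining a default options structure in this toolbox) that this page does not confirm. The svmda calls are guarded so that the script still runs when the toolbox is not on the path.

```matlab
% Synthetic two-class data (illustrative only, not taken from this page)
xcal  = [randn(20,5); randn(20,5) + 2];   % 40 calibration samples, 5 variables
ycal  = [ones(20,1); 2*ones(20,1)];       % numeric class labels 1 and 2
xtest = [randn(5,5); randn(5,5) + 2];     % 10 new samples
ytest = [ones(5,1); 2*ones(5,1)];         % known classes for the new samples

if exist('svmda','file')                  % run only when svmda is available
  options = svmda('options');             % ASSUMPTION: returns the default options
  options.display = 'off';                % suppress command-window output
  options.plots   = 'none';               % suppress plotting

  model = svmda(xcal,ycal,options);           % calibration step
  pred  = svmda(xtest,model,options);         % predictions for a new X-block
  ptest = svmda(xtest,ytest,model,options);   % "test" call with known y-values
end
```

With the default options the calibration call performs the cross-validated parameter search described under Algorithm, so it may take noticeably longer than the two application calls.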
Description
SVMDA performs calibration and application of Support Vector Machine (SVM) classification models. (For support vector machine regression problems, see the svm function.) These are non-linear models. A model consists of a number of support vectors (essentially samples selected from the calibration set) and non-linear model coefficients, which together define the non-linear mapping from the variables of the input x-block to the predicted class. The known classes are passed in either as the classes field of the x-block or as a y-block containing numerical classes.
Svmda is implemented using the LIBSVM package, which provides both C-support vector classification (C-SVC) and nu-support vector classification (nu-SVC). Linear and Gaussian radial basis function (RBF) kernel types are supported by this function.
Note: Calling svmda with no inputs starts the graphical user interface (GUI) for this analysis method.
Inputs
- x = X-block (predictor block) class "double" or "dataset",
- y = Y-block (predicted block) class "double" or "dataset",
- model = previously generated model (when applying model to new data).
Outputs
- model = a standard model structure with the following fields (see MODELSTRUCT):
- modeltype: 'SVM',
- datasource: structure array with information about input data,
- date: date of creation,
- time: time of creation,
- info: additional model information,
- pred: 2-element cell array with
- model predictions for each input block (when options.blockdetails='standard', x-block predictions are not saved and that element will be an empty array),
- detail: sub-structure with additional model details and results, including:
- model.detail.svm.model: Matlab version of the libsvm svm_model (Java)
- model.detail.svm.cvscan: results of CV parameter scan
- model.detail.svm.outlier: results of outlier detection (one-class svm)
- pred = a structure, similar to model, for the new data.
Options
options = a structure array with the following fields:
- display: [ 'off' | {'on'} ], governs level of display to command window,
- plots: [ 'none' | {'final'} ], governs level of plotting,
- preprocessing: {[]} preprocessing structures for x block (see PREPROCESS). NOTE that y-block preprocessing is NOT used with SVMs. Any y-preprocessing will be ignored.
- blockdetails: [ {'standard'} | 'all' ], extent of predictions and residuals included in model: 'standard' = only y-block, 'all' = x- and y-blocks.
- algorithm: [ 'libsvm' ] algorithm to use. 'libsvm' is the default and currently the only option.
- kerneltype: [ 'linear' | {'rbf'} ], SVM kernel to use. 'rbf' is default.
- svmtype: [ {'c-svc'} | 'nu-svc' ] Type of SVM to apply. The default is 'c-svc' for classification.
- probabilityestimates: [ 0 | {1} ], whether to train the SVC model for probability estimates, 0 or 1 (default 1).
- cvtimelimit: Time limit (seconds) on each individual cross-validation sub-calculation when searching over supplied SVM parameter ranges for optimal parameters. Only relevant if parameter ranges are used for SVM parameters such as cost, epsilon, gamma or nu. Default is 2 (seconds).
- splits: Number of subsets to divide data into when applying n-fold cross validation. Default is 5.
- gamma: Value(s) to use for LIBSVM kernel gamma parameter. Default is 15 values from 10^-6 to 10, spaced uniformly in log.
- cost: Value(s) to use for LIBSVM 'c' parameter. Default is 11 values from 10^-3 to 100, spaced uniformly in log.
- nu: Value(s) to use for LIBSVM 'n' parameter (nu of nu-SVC and nu-SVR). Default is the set of values [0.2, 0.5, 0.8].
- outliernu: Value to use for nu in LIBSVM's one-class SVM outlier detection. Default is 0.05.
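The default ranges quoted above for gamma, cost and nu can be reproduced with plain MATLAB as a quick check of the grid size. This is only a sketch: whether svmda builds its defaults with logspace is an assumption, as is the count of fits, which supposes a C-SVC model with an RBF kernel (where cost and gamma are the parameters searched) and one fit per cross-validation split.

```matlab
% Default parameter ranges as described in the options list
gamma = logspace(-6,1,15);   % 15 values from 10^-6 to 10, half-decade steps
cost  = logspace(-3,2,11);   % 11 values from 10^-3 to 100, half-decade steps
nu    = [0.2 0.5 0.8];       % default set of nu values
splits = 5;                  % default number of cross-validation subsets

% ASSUMPTION: c-svc with an rbf kernel searches the cost-by-gamma grid
ncombos = numel(cost)*numel(gamma);   % 165 parameter combinations
nfits   = ncombos*splits;             % 825 fits if each split is fitted once

fprintf('%d combinations, %d cross-validation fits\n', ncombos, nfits);
```

Under these assumptions the default search involves several hundred small fits, which gives some context for the cvtimelimit option.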
Algorithm
Svmda uses the LIBSVM implementation with the user-specified values for the LIBSVM parameters (see options above). See http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf for further details of these options.
The default values of the SVMDA parameters cost, nu and gamma are ranges rather than single values. In that case svmda searches over the grid of the applicable parameters, using cross-validation to select the optimal SVM parameter values, and then builds an SVM model using those values. This is the recommended usage. The user can, however, avoid the grid search by passing in single values for these parameters.
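Passing single values to skip the grid search might look like the sketch below. The data and the particular values of gamma and cost are arbitrary illustrations, not recommendations, and svmda('options') is an assumed way of obtaining the default options structure. The svmda calls are guarded so the script also runs without the toolbox installed.

```matlab
% Small synthetic two-class data set (illustrative only)
x = [zeros(10,3); ones(10,3)] + 0.1*randn(20,3);
y = [ones(10,1); 2*ones(10,1)];

havetoolbox = exist('svmda','file') > 0;   % true only if svmda is on the path
if havetoolbox
  options = svmda('options');   % ASSUMPTION: returns the default options
else
  options = struct();           % fallback so the sketch still runs
end

options.svmtype    = 'c-svc';   % default SVM type
options.kerneltype = 'rbf';     % default kernel
options.gamma      = 0.01;      % single value instead of the default range
options.cost       = 1;         % single value instead of the default range

if havetoolbox
  model = svmda(x,y,options);   % builds the model without a grid search
end
```

Fixed values trade the cross-validated selection for speed, so they are mainly useful when suitable parameters are already known from an earlier search.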