Lregda
See Also
analysis, crossval, preprocess, lreg, EVRIModel_Objects
Latest revision as of 12:05, 13 September 2020
Purpose
LREGDA: Logistic Regression for classification. Builds LREGDA classification models and makes predictions from them.
Synopsis
- lregda - Launches an Analysis window with LREGDA as the selected method.
- [model] = lregda(x,options);
- [model] = lregda(x,y,options);
- [pred] = lregda(x,model,options);
- [valid] = lregda(x,y,model,options);
- [options] = lregda('options');
Please note that the recommended way to build and apply an LREGDA model from the command line is to use the Model Object. Please see this wiki page on building and applying models using the Model Object.
Description
Build an LREGDA model from input dataset X, or from input X and Y if the classes are in Y, using the specified algorithm and regularization parameter. Alternatively, if a model is input, LREGDA makes a prediction for an input test X block. The LREGDA model contains quantities (hypothesis coefficients) calculated from the calibration data. When a model structure is passed in to LREGDA, these coefficients do not need to be re-calculated.
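The two calling modes described above can be sketched from the command line as follows. This is a minimal sketch, assuming PLS_Toolbox is on the MATLAB path; the synthetic data, the variable names (xcal, ycal, xtest) and the exist guard are illustrative and are not taken from the toolbox documentation.

```matlab
% Deterministic two-class example data (illustrative only).
xcal = sin((1:40)' * (1:5));              % 40 samples, 5 variables
xcal(21:40,:) = xcal(21:40,:) + 2;        % shift the second class
ycal = [ones(20,1); 2*ones(20,1)];        % class of each calibration sample

xtest = sin((41:50)' * (1:5));            % 10 new samples
xtest(6:10,:) = xtest(6:10,:) + 2;

if exist('lregda','file')                 % only run when PLS_Toolbox is installed
  options = lregda('options');            % default options structure
  options.display = 'off';
  options.plots   = 'none';

  model = lregda(xcal, ycal, options);    % calibrate: classes supplied in y
  pred  = lregda(xtest, model, options);  % apply: stored coefficients are reused
end
```

Passing the model as the second input is what distinguishes the prediction call from the calibration call in the Synopsis.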
LREGDA solves for the logistic regression model parameters using the minFunc software:
- M. Schmidt. minFunc: unconstrained differentiable multivariate optimization in Matlab. http://www.cs.ubc.ca/~schmidtm/Software/minFunc.html, 2005.
Inputs
- x = X-block (predictor block) class "double" or "dataset", containing numeric values,
- y = Y-block (optional) class "double" sample class values,
- model = previously generated model (when applying model to new data).
Outputs
- model = a standard model structure with the following fields (see Standard Model Structure):
- modeltype: 'LREGDA',
- datasource: structure array with information about input data,
- date: date of creation,
- time: time of creation,
- info: additional model information,
- pred: 2 element cell array with model predictions for each input block (when options.blockdetails='standard', x-block predictions are not saved and that cell will be an empty array)
- detail: sub-structure with additional model details and results, including:
- model.detail.lreg: Structure containing 'lreg' matrix of model coefficients.
- pred = a structure, similar to model, for the new data.
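Assuming a calibrated model such as one returned by the calls in the Synopsis, the fields listed above could be read back as in the sketch below. The example data, the variable names and the exist guard are illustrative; only the field names come from the list above.

```matlab
x = sin((1:30)' * (1:4));          % illustrative data: 30 samples, 4 variables
x(16:30,:) = x(16:30,:) + 2;       % shift the second class
y = [ones(15,1); 2*ones(15,1)];    % class of each sample

if exist('lregda','file')          % requires PLS_Toolbox
  options = lregda('options');
  options.display = 'off';
  model = lregda(x, y, options);

  disp(model.modeltype);           % model type string, 'LREGDA' per the field list
  disp(model.date);                % date of creation
  coeffs = model.detail.lreg;      % structure holding the 'lreg' coefficient matrix
  ypred  = model.pred{2};          % predictions for the second (y) block
end
```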
Algorithm
The 'algorithm' option allows selection of Logistic Regression with no regularization ('none'), L2 regularization ('ridge'), L1 regularization ('lasso'), or equally weighted L1 and L2 regularization ('elasticnet').
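To make the four choices concrete, the sketch below evaluates a regularized logistic cost for a fixed coefficient vector. It is a generic textbook formulation and not LREGDA's internal code: the scaling of the penalty by lambda, the 0.5 weights used for 'elasticnet', the two-class 0/1 label coding, the omission of an intercept and the names (sigmoid, nll, penalty) are all assumptions.

```matlab
% Generic regularized logistic regression cost (assumed form, for illustration).
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Negative log-likelihood for binary labels y in {0,1}; X is samples x variables.
nll = @(w, X, y) -sum(y .* log(sigmoid(X*w)) + (1 - y) .* log(1 - sigmoid(X*w)));

% Penalty term associated with each 'algorithm' setting.
penalty = struct( ...
  'none',       @(w, lambda) 0, ...
  'ridge',      @(w, lambda) lambda * sum(w.^2), ...
  'lasso',      @(w, lambda) lambda * sum(abs(w)), ...
  'elasticnet', @(w, lambda) lambda * (0.5*sum(abs(w)) + 0.5*sum(w.^2)));

X = [1 0; 0 1; 1 1; 0 0];          % four samples, two variables
y = [1; 0; 1; 0];                  % binary class labels
w = [2; -1];                       % coefficient vector to evaluate
lambda = 0.1;                      % regularization parameter

algorithms = fieldnames(penalty);
for k = 1:numel(algorithms)
  name = algorithms{k};
  pen  = penalty.(name);           % function handle for this setting
  cost = nll(w, X, y) + pen(w, lambda);
  fprintf('%-10s cost = %.4f\n', name, cost);
end
```

An optimizer such as minFunc would search for the w that minimizes this kind of cost; larger lambda values shrink the coefficients more strongly.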
Cross-validation
Cross-validation can be applied to LREGDA when using either the LREGDA Analysis window or the command line. From the Analysis window specify the cross-validation method in the usual way (clicking on the model icon's red check-mark, or the "Choose Cross-Validation" link in the flowchart).
Options
options = a structure array with the following fields:
- display : [ 'off' |{'on'}] governs level of display to the command window.
- plots: [ {'none'} | 'final' ] governs plotting of results.
- blockdetails : [ {'standard'} | 'all' ] extent of detail included in model. 'standard' keeps only y-block, 'all' keeps both x- and y- blocks.
- waitbar : [ 'off' |{'auto'}| 'on' ] governs use of waitbar during analysis. 'auto' shows waitbar if delay will likely be longer than a reasonable waiting period.
- algorithm : [ {'ridge'} | 'none' | 'lasso' | 'elasticnet'] specify the LREG implementation to use:
- 'none' has no regularization,
- 'ridge' uses L2 regularization,
- 'lasso' uses L1 regularization, and
- 'elasticnet' uses equally weighted L1 and L2 regularization.
- maxIter : [400] Maximum number of iterations allowed in the minFunc optimization solver.
- lambda : [{0.1}] Regularization parameter
- preprocessing: {[] []} preprocessing structures for x and y blocks (see PREPROCESS).
- compression: [{'none'}| 'pca' | 'pls' ] type of data compression to perform on the x-block prior to calculating or applying the LREGDA model. 'pca' uses a simple PCA model to compress the information. 'pls' uses a PLS model. Compression can make the LREGDA model more stable and less prone to overfitting.
- compressncomp: [1] Number of latent variables (or principal components) to include in the compression model.
- compressmd: [{'yes'} | 'no'] Use Mahalanobis distance corrected scores from the compression model.
- cvmethod : [{'con'} | 'vet' | 'loo' | 'rnd'] CV method, OR [] for Kennard-Stone single split.
- cvsplits : [{5}] Number of CV subsets.
- cvi : M element vector with integer elements allowing user defined subsets. (cvi) is a vector with the same number of elements as x has rows i.e., length(cvi) = size(x,1). Each cvi(i) is defined as:
- cvi(i) = -2 the sample is always in the test set,
- cvi(i) = -1 the sample is always in the calibration set,
- cvi(i) = 0 the sample is never used, and
- cvi(i) = 1,2,3... defines each test subset.
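As an illustration of the cvi coding just described, the sketch below builds a user-defined subset vector for a 12-sample data set and collects a few of the options above in one structure. The sample count, the particular assignments, the chosen option values and the exist guard are illustrative assumptions; the option names are those listed above.

```matlab
nsamples = 12;                         % illustrative number of rows in x
cvi = zeros(nsamples, 1);              % start with every sample unused (0)

cvi(1:2)  = -1;                        % always in the calibration set
cvi(3)    = -2;                        % always in the test set
cvi(4)    =  0;                        % never used
cvi(5:12) = repmat((1:4)', 2, 1);      % remaining samples cycle through test subsets 1..4

if exist('lregda','file')              % default options need PLS_Toolbox
  options = lregda('options');
else
  options = struct();                  % placeholder so the sketch runs without the toolbox
end
options.algorithm     = 'elasticnet';  % equally weighted L1 and L2 regularization
options.lambda        = 0.1;           % regularization parameter
options.maxIter       = 400;           % iteration cap for the minFunc solver
options.compression   = 'pca';         % compress the x-block before fitting
options.compressncomp = 3;             % number of principal components kept
options.cvi           = cvi;           % user-defined cross-validation subsets
```

Note that length(cvi) must equal size(x,1) for the data actually passed to lregda, so the vector should be rebuilt whenever the number of samples changes.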
Usage from LREGDA Analysis window
See the lregdademo.m function for command line usage example.