Mlr: Difference between revisions

Revision as of 09:50, 7 April 2022

Purpose

Multiple Linear Regression for multivariate Y.

Synopsis

model = mlr(x,y,options)

pred = mlr(x,model,options)

valid = mlr(x,y,model,options)

mlr % Launches analysis window with MLR as the selected method.

Please note that the recommended way to build and apply an MLR model from the command line is to use the Model Object. Please see this wiki page on building and applying models using the Model Object.

Description

MLR identifies models of the form Xb = y + e.
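As a minimal sketch of what fitting this form means, the ordinary least-squares estimate of b can be computed in base MATLAB with the backslash operator. This is not the toolbox's implementation; the data and the variable names (b_true, e) are made up for illustration.

```matlab
% Illustrative only: ordinary least squares for X*b = y + e in base MATLAB.
% Synthetic, noise-free data, so the generating coefficients are recovered.
X = [1 2; 3 4; 5 7; 8 9];      % 4 samples, 2 predictor variables
b_true = [0.5; -1.5];          % coefficients used to generate y
y = X*b_true;                  % response with e = 0

b = X\y;                       % least-squares solution of X*b = y
e = X*b - y;                   % residual under the convention X*b = y + e
```

With noisy data, e would be nonzero and b would be the vector that minimizes sum(e.^2).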

Inputs

x = X-block: predictor block (2-way array or DataSet Object)

y = Y-block: predicted block (2-way array or DataSet Object)

Outputs

model = standard model structure containing the MLR model.

pred = structure array with predictions

valid = structure array with predictions

Options

options = a structure array with the following fields.

display: [ {'off'} | 'on'] Governs screen display to command line.

plots: [ 'none' | {'final'} ] governs level of plotting.

algorithm: [ {'leastsquares'} | 'ridge' | 'lasso' | 'elasticnet' ] Governs the type of regularization used when calculating the regression vector. 'ridge' uses the L2 penalty, 'lasso' uses the L1 penalty, and 'elasticnet' uses both L1 and L2.

ridge: [ 0 ] Value(s) for the ridge parameter to use in regularizing the inverse for ridge or elasticnet regression [ridge >=0 ].

lasso: [ ] Value(s) for the lasso parameter to use in regularizing the inverse for lasso or elasticnet regression [lasso >=0 ].

preprocessing: { [] [] } preprocessing structure (see PREPROCESS).

blockdetails: [ 'compact' | {'standard'} | 'all' ] level of detail (predictions, raw residuals, and calibration data) included in the model.

'Standard' = the predictions and raw residuals for the X-block as well as the X-block itself are not stored in the model to reduce its size in memory. Specifically, these fields in the model object are left empty: 'model.pred{1}', 'model.detail.res{1}', 'model.detail.data{1}'.
'Compact' = for this function, 'compact' is identical to 'standard'.
'All' = keep predictions, raw residuals for both X- & Y-blocks as well as the X- & Y-blocks themselves.
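The options above might be set from the command line as in the following sketch. It uses only the call forms from the Synopsis and the field names listed in this section; the call mlr('options') for retrieving defaults is an assumption based on the usual PLS_Toolbox convention, and the data and penalty values are arbitrary. The block is guarded so that it does nothing when PLS_Toolbox is not on the path. Note that the Model Object remains the recommended route.

```matlab
% Sketch only: requires PLS_Toolbox on the path. Option field names are
% taken from the list above; mlr('options') is ASSUMED to return defaults.
if exist('mlr','file')
  x = randn(30,4);                          % synthetic X-block
  y = x*[1; -2; 0; 0.5] + 0.1*randn(30,1);  % synthetic Y-block

  options = mlr('options');                 % assumed default-options call
  options.display   = 'off';
  options.plots     = 'none';
  options.algorithm = 'ridge';
  options.ridge     = [0.01 0.1 1 10];      % candidate L2 penalties (arbitrary)

  model = mlr(x,y,options);                 % calibrate
  pred  = mlr(x,model,options);             % apply to (here, the same) data
end
```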

MLR Regularization

Starting in PLS_Toolbox/Solo 9.1, MLR can incorporate regularization when calculating the regression vector. Adding regularization to a linear regression model is known to reduce calibration overfitting. Ridge regularization was available before 9.1, but only to a limited degree; MLR can now perform Lasso and Elasticnet regularization in addition to Ridge. Ridge regression adds the L2 penalty to the cost function being minimized, Lasso adds the L1 penalty, and Elasticnet adds a combination of both L1 and L2.

The ridge and lasso fields in the method options structure can each contain an array of values. When MLR is run with regularization, each value in these ranges is tested in order to find the single best penalty value and its corresponding regression vector. The test is a small random-subset cross-validation on the calibration data, and the penalty that gives the best cross-validation score is taken as the best penalty. The best penalty or penalties can be found under model.detail.mlr.best_params (this field is empty if no regularization is used).

To speed up model building, the Parallel Computing Toolbox (PCT) is used, when available, to fit the models for the different penalty values. PCT is nice to have, but it is not required for MLR regularization.
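The penalty search described above can be pictured with the following base-MATLAB sketch for the ridge case. It is not the toolbox routine: the number of random subsets, the 20% hold-out fraction, the squared-error score, and all variable names are assumptions chosen for illustration.

```matlab
% Illustrative penalty search for ridge regression in base MATLAB.
% Not the toolbox routine: split sizes, scoring, and names are assumptions.
m = 40; n = 5;
X = randn(m,n);
y = X*[2; -1; 0; 0; 0.5] + 0.2*randn(m,1);

penalties = [0 0.01 0.1 1 10];          % candidate L2 penalty values
nsplits   = 5;                          % number of random subsets
score     = zeros(size(penalties));     % accumulated held-out squared error

for s = 1:nsplits
  idx  = randperm(m);
  test = idx(1:round(m/5));             % hold out about 20% of the samples
  cal  = idx(round(m/5)+1:end);
  for k = 1:numel(penalties)
    % closed-form ridge solution on the calibration subset
    b = (X(cal,:)'*X(cal,:) + penalties(k)*eye(n)) \ (X(cal,:)'*y(cal));
    r = y(test) - X(test,:)*b;
    score(k) = score(k) + sum(r.^2);
  end
end

[~, kbest]   = min(score);
best_penalty = penalties(kbest);        % penalty with lowest held-out error
```

Because the inner loop over penalties has independent iterations, it is the kind of loop that parallel execution can speed up.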

In MLR, our hypothesis function is

$h_{\theta }(x)=\theta ^{T}x$

where $\theta$ is the regression vector we wish to calculate.

The loss function for MLR is

$L(h_{\theta }(x_{i}),y)={\frac {1}{2}}(h_{\theta }(x_{i})-y_{i})^{2}.$

The cost function $J_{\theta }$ is the mean of the loss $L(h_{\theta }(x_{i}),y)$ over the $m$ samples. It is the quantity minimized when fitting the model, and is given by the following equation

$J_{\theta }={\frac {1}{m}}\sum _{i=1}^{m}L(h_{\theta }(x_{i}),y)$

With regularization, $J_{\theta }$ differs by the addition of penalty terms. The table below shows the cost function for each algorithm:

options.algorithm	$J_{\theta }$ (Cost Function)
'leastsquares'	${\frac {1}{m}}\sum _{i=1}^{m}L(h_{\theta }(x_{i}),y)$
'ridge'	${\frac {1}{m}}\sum _{i=1}^{m}L(h_{\theta }(x_{i}),y)+{\frac {\lambda _{a}}{2m}}\sum _{j=1}^{n}\theta _{j}^{2}$
'lasso'	${\frac {1}{m}}\sum _{i=1}^{m}L(h_{\theta }(x_{i}),y)+{\frac {\lambda _{b}}{2m}}\sum _{j=1}^{n}\|\theta _{j}\|$
'elasticnet'	${\frac {1}{m}}\sum _{i=1}^{m}L(h_{\theta }(x_{i}),y)+{\frac {\lambda _{b}}{2m}}\sum _{j=1}^{n}\|\theta _{j}\|+{\frac {\lambda _{a}}{2m}}\sum _{j=1}^{n}\theta _{j}^{2}$

Note: $\lambda _{a}$ pertains to the L2 penalty value and $\lambda _{b}$ pertains to the L1 penalty value.
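To make the table concrete, the sketch below evaluates the four cost functions for one fixed regression vector in base MATLAB. The data, the penalty values, and the struct J are invented for illustration, and it assumes that rows of X are samples and that every coefficient is penalized.

```matlab
% Evaluate the four cost functions from the table for a given theta.
% Assumes rows of X are samples and every coefficient is penalized.
X = [1 0; 0 1; 1 1; 2 1];
y = [1; 2; 3; 5];
theta = [1; 2];
lambda_a = 0.5;                          % L2 penalty value
lambda_b = 0.2;                          % L1 penalty value
m = size(X,1);

base = sum(0.5*(X*theta - y).^2)/m;      % mean of the per-sample loss
l2   = lambda_a/(2*m)*sum(theta.^2);     % ridge term
l1   = lambda_b/(2*m)*sum(abs(theta));   % lasso term

J.leastsquares = base;
J.ridge        = base + l2;
J.lasso        = base + l1;
J.elasticnet   = base + l1 + l2;
```

Here only the last sample has a nonzero residual, so the unpenalized cost is 0.125, and the penalty terms add 0.3125 (L2) and 0.075 (L1).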

Studentized Residuals

From version 8.8 onwards, the Studentized Residuals shown for MLR Scores Plot are now calculated for calibration samples as:

 MSE   = sum((res).^2)./(m-1);
 syres = res./sqrt(MSE.*(1-L));

where res = y residual, m = number of samples, and L = sample leverage. This represents a constant multiplier change from how Studentized Residuals were previously calculated. For test datasets, where pres = predicted y residual, the semi-Studentized residuals are calculated as:

 MSE   = sum((res).^2)./(m-1);
 syres = pres./sqrt(MSE);

This represents a constant multiplier change from how the semi-Studentized Residuals were previously calculated.
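The two formulas can be exercised on a small example in base MATLAB, as sketched below. The data are invented, and two points are assumptions that the text above does not specify: leverage is taken as the diagonal of the hat matrix, and residuals are taken as predicted minus measured y.

```matlab
% Illustrative calculation following the formulas above (base MATLAB).
% ASSUMPTIONS: leverage = hat-matrix diagonal; residual = predicted - measured.
X   = [1 1; 1 2; 1 3; 1 4; 1 5];         % calibration X (with intercept column)
y   = [1.1; 1.9; 3.2; 3.9; 5.1];         % calibration y
b   = X\y;                               % least-squares regression vector
res = X*b - y;                           % calibration y residuals
m   = size(X,1);                         % number of calibration samples
L   = sum((X/(X'*X)).*X, 2);             % diag of X*inv(X'*X)*X'

MSE   = sum((res).^2)./(m-1);
syres = res./sqrt(MSE.*(1-L));           % Studentized residuals (calibration)

Xt     = [1 2.5; 1 6];                   % test samples
yt     = [2.4; 6.2];                     % measured test y
pres   = Xt*b - yt;                      % predicted y residuals
sypres = pres./sqrt(MSE);                % semi-Studentized residuals (test)
```

High-leverage calibration samples have a smaller (1-L) factor, so their Studentized residuals are inflated relative to the raw residuals.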

