Replacevars: Difference between revisions

Revision as of 12:40, 9 October 2008

Purpose

Replace variables based on factor-based models.

Synopsis

repdata = replace(model,vars,data);
repdata = replace(model,data);
rm = replace(model,vars);
rm = replace(model,vars,rmtype);

Description

This function generates a matrix (or matrices) that can be used to replace "bad" variables from data matrices with the values that are most consistent with a given factor-based model (PCA, PLS, PCR, CLS, MCR, etc.). The outputs of this function are dependent on the inputs provided:

  • If a model and a list of variables to replace are provided, then the function returns the matrix or matrices required to do the mathematical replacement operation.
  • If the data on which to operate is also provided, then the function returns the data with the replaced values.


REPLACE replaces variables from data matrices with values most consistent with the given PCA or PLS model. Input model can be any of the following:

1) a standard model structure generated by the PCA or PLS functions or the Analysis GUI

2) a set of loading column vectors (e.g., loads returned by the pca routine, or model.loads{2} if the output is a model structure)

3) the PCA residual generating matrix (I-loads*loads'), or

4) the PLS residuals generating matrix coeff returned by the plsrsgn routine.

Optional input vars is a row vector containing the indices of the variables (columns) to be replaced. If omitted, the input data is searched for non-finite values (NaN, Inf) and these values are replaced.

When vars is input, the outputs are the replacement matrix rm and the replaced data (if data was provided), repdata. Multiplication of a data matrix xnew by rm will replace variables with values most consistent with the given PCA or PLS model. If vars was not supplied, only repdata is output.
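The multiplication described above can be illustrated by hand for the PCA case. The following is a minimal sketch, not the toolbox's implementation: it assumes orthonormal loadings and synthetic, noise-free data, and builds a replacement matrix from the residuals-generating matrix I-PP' by choosing the value of the bad variable that minimizes the model residual. All variable names and data are hypothetical.

```matlab
% Sketch only: builds a replacement matrix by hand from PCA loadings.
% Synthetic, deterministic data; nothing here calls the toolbox.
A = [ones(10,1), (1:10)'];
[P,~] = qr(A,0);                 % hypothetical orthonormal loadings (10 vars x 2 factors)
T = [(1:50)', cos((1:50)')];     % hypothetical scores (50 samples x 2 factors)
xold = T*P';                     % noise-free data lying exactly in the model plane

R = eye(10) - P*P';              % residuals-generating matrix I-PP'
j = 9;                           % index of the "bad" variable

% Column j of the identity is swapped for weights that predict variable j
% from the remaining variables such that the model residual is minimized.
rm = eye(10);
rm(:,j) = -R(:,j)/R(j,j);
rm(j,j) = 0;                     % the bad reading itself gets zero weight

xnew = xold;
xnew(:,j) = 1e3;                 % simulate a failed sensor
rxnew = xnew*rm;                 % data with variable j replaced
err = max(abs(rxnew(:) - xold(:)));   % recovery error against the true data
```

Because the synthetic data contain no noise, the replaced column matches the original values to within rounding. With real data the replaced values are only the estimates most consistent with the model.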

Inputs

  • model = can be one of the following:
    • (1) a model structure generated by the pca or pls functions or analysis
    • (2) a set of loadings (column) vectors
    • (3) the residuals-generating matrix I-PP'
    • (4) a PLS model like the ones generated by the function plsrsgn or plsrsgcv.
  • data = a matrix or DataSet Object in which the specified variables are to be replaced. If vars is omitted, data is searched for non-finite values (NaN or Inf) and these are replaced. When data is supplied, only the replaced data, repdata, is returned.

Optional Inputs

  • vars = an optional row vector containing the column indices of the variables to be replaced.
  • rmtype = an optional string indicating the type of replacement matrix to output (only valid when data is omitted).
    • matrix = returns an entire rm matrix
    • loads = (default) returns the pseudo-inverse of the loadings.

See below for details on these outputs.

Outputs

  • repdata = The data with the chosen variables replaced.
  • rm = This output can be used with data to replace the variables indicated by vars with the values which are most consistent with the given model. This output is only available when data is not provided.
    • If input rmtype is not provided or is 'loads', then rm will be a structure containing three matrices which can be used to replace variables using:
data(:,rm.vars) = data * rm.ploads * rm.loads
    • If rmtype is 'matrix' then rm will be a matrix which can be used to replace variables using:
data = data * rm
Note: The 'matrix' form of rm will nearly always require a significantly larger amount of memory, but is easier to apply.
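The relationship between the two forms can be sketched as follows. This is an illustration under assumptions, not the toolbox's code: the contents assumed here for the fields ploads (a pseudo-inverse of the loadings with zero rows for the replaced variables) and loads (the loadings of the replaced variables) are a guess consistent with the formulas above, and the data are synthetic.

```matlab
% Sketch only: field contents are assumed, not taken from the toolbox source.
A = [ones(10,1), (1:10)'];
[P,~] = qr(A,0);                       % hypothetical orthonormal loadings (10 x 2)
T = [(1:50)', cos((1:50)')];           % hypothetical scores (50 x 2)
data = T*P';                           % noise-free data in the model plane
vars = [3 9];                          % variables to replace
known = setdiff(1:10, vars);           % variables that are trusted

% 'loads'-style form: two small matrices plus the index vector
rmS.vars = vars;
rmS.ploads = zeros(10,2);
rmS.ploads(known,:) = pinv(P(known,:)');   % estimates scores from known variables
rmS.loads = P(vars,:)';                    % maps scores to the replaced variables

% 'matrix'-style form: one full 10 x 10 matrix
rmM = eye(10);
rmM(:,vars) = rmS.ploads*rmS.loads;

bad = data;
bad(:,vars) = -999;                    % simulate two failed sensors

r1 = bad;
r1(:,rmS.vars) = bad*rmS.ploads*rmS.loads;   % apply the 'loads'-style form
r2 = bad*rmM;                                % apply the 'matrix'-style form
```

In this sketch the structure form stores a 10 x 2 and a 2 x 2 matrix, while the matrix form stores the full 10 x 10 matrix. That difference grows with the number of variables, which is the memory cost the note refers to.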

Examples

A PCA model was created on a data matrix xold giving a model structure model. The loadings, a set of loadings column vectors, were extracted to a variable loads using

loads = model.loads{2};

It was found that the sensor measuring variable 9 has gone "bad" and we would like to replace it in the new data matrix xnew. A replacement matrix rm is first created using REPLACE.

rm = replace(loads,9,'matrix');

The new data matrix with variable 9 replaced, rxnew, is then calculated by multiplying xnew by rm.

rxnew = xnew*rm;
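When vars is omitted, the description states that data is searched for non-finite values and that these are replaced. A minimal sketch of that idea is shown below. It is not the toolbox's implementation: it assumes PCA loadings, handles each row separately because the missing pattern can differ from row to row, and ignores preprocessing such as mean-centering. All names and data are hypothetical.

```matlab
% Sketch only: per-row replacement of non-finite values using PCA loadings.
A = [ones(10,1), (1:10)'];
[P,~] = qr(A,0);                       % hypothetical orthonormal loadings (10 x 2)
T = [(1:5)', cos((1:5)')];             % hypothetical scores (5 x 2)
xtrue = T*P';                          % noise-free reference data

data = xtrue;
data(2,9) = NaN;                       % one missing value in row 2
data(4,[1 6]) = Inf;                   % two non-finite values in row 4

repdata = data;
for i = 1:size(data,1)
    miss = ~isfinite(data(i,:));       % non-finite entries in this row
    if any(miss)
        % estimate the scores from the finite entries only
        t = data(i,~miss) * pinv(P(~miss,:)');
        % fill the non-finite entries with the model's estimate
        repdata(i,miss) = t * P(miss,:)';
    end
end
```

Rows without non-finite values are left untouched, and only the flagged entries of the other rows are overwritten.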

See Also

mdcheck, pca, plsrsgcv, plsrsgn