Advanced Preprocessing: Simple Mathematical Operations

From Eigenvector Research Documentation Wiki
Revision as of 00:47, 9 November 2015 by imported>Donal (Log10)

Several preprocessing methods involve simple mathematical operations that are used to linearize or otherwise modify certain kinds of data.

Absolute Value

The absolute value method is used to remove any sign information from the data. Although unusual, this method may be useful following a derivative or other method which creates negative values. Such correction can allow the use of non-negativity constraints, or simply improve interpretability of derivatized spectra. It should be noted, however, that an absolute value following any method which centers data (such as mean- or median-centering) may create a non-linear response and complicate modeling.

There are no settings associated with this preprocessing method. The command line function to perform this operation is the MATLAB command abs.
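
As a small illustration of the caution above about centered data, the following MATLAB sketch applies abs to made-up values before and after mean-centering. The variable names and numbers are hypothetical, and this is plain MATLAB rather than the toolbox's preprocessing interface.

```matlab
% Made-up response that increases linearly with some underlying factor
x = [1 2 3 4 5];

% Absolute value of data that is already non-negative changes nothing
a1 = abs(x);                 % [1 2 3 4 5], still linear in x

% Mean-centering first creates negative values ...
xc = x - mean(x);            % [-2 -1 0 1 2]

% ... and abs then folds them onto the positive side
a2 = abs(xc);                % [2 1 0 1 2], no longer monotonic in x
```

After centering, two different original values (for example 1 and 5) map to the same result, which is the non-linear response the text warns about.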

Log10

A base 10 logarithm (that is, log10(X)) can be used whenever the response of the data is linear to the function 10^X. Because the log of a negative value is undefined (NaN = Not a Number), this operation first sets negative data values to zero. This effect can be avoided by using an absolute value preprocessing step prior to the Log10 step. A minimum value filter is also used to prevent huge negative log values when values of X are very small. This is achieved by adding a constant, c, to X before applying the log. This constant is removed during the "undo" step. By default, c = 10^-5.

There are no settings associated with this preprocessing method. The command line function to perform this operation is the MATLAB command log10. Note, however, that log10 on its own does not set negative values to zero before taking the log and does not apply the minimum value filter.
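
The steps described for the Log10 method can be sketched in plain MATLAB as follows. This is an illustration assembled from the description on this page, not the toolbox's own code: the variable names and example values are made up, and c is set to the default value stated above.

```matlab
c = 1e-5;                    % offset (default value stated in the text)
X = [-3 0 0.01 1 100];       % made-up example data, including a negative and a zero

Xnn = max(X, 0);             % negative values are set to zero first
Y   = log10(Xnn + c);        % the offset keeps the log finite where Xnn is zero

% "Undo": invert the log, then remove the offset that was added
Xundo = 10.^Y - c;
```

In this sketch the zero entries map to approximately -5 rather than -Inf, and the undo step recovers the clipped data Xnn (to rounding error), not the original negative value.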

Transmission to Absorbance (log(1/T))

This spectroscopic transformation is often used when data have been collected as transmission (the ratio of measured signal to incident signal). The transformation converts the signal to "absorbance" but, more generally, it can be applied to any data that follow an inverse log relationship.

There are no settings associated with this preprocessing method. The command line function to perform this operation is the MATLAB command log10(1./X).
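
To show what "inverse log relationship" means in practice, the sketch below simulates transmission that decays as a power of ten with concentration (the Beer-Lambert form) and then applies log10(1./X). The constant k and the concentration values are hypothetical, chosen only for illustration.

```matlab
k    = 0.8;                  % hypothetical absorptivity-times-pathlength constant
conc = [0 0.5 1 1.5 2];      % made-up concentrations

T = 10.^(-k .* conc);        % transmission falls off as an inverse power of ten
A = log10(1 ./ T);           % the log(1/T) transformation

% A is now (to rounding error) a straight line in concentration: A = k*conc
maxDev = max(abs(A - k .* conc));

% The transformation can be inverted to recover transmission
Tback = 10.^(-A);
```

Because A is linear in concentration while T is not, linear modeling methods are usually better suited to the transformed data.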

Arithmetic

Apply simple arithmetic operations to all or part of a dataset. See arithmetic.