Estimatefactors
Revision as of 06:42, 9 October 2008
Purpose
Estimate number of significant factors in multivariate data.
Synopsis
- S = estimatefactors(x,options)
Description
Given a bilinear dataset, ESTIMATEFACTORS estimates the number of significant factors required to describe the data. The algorithm uses PCA bootstrapping (resampling) of the data. The PCA loadings determined for each resampling are compared for changes. Principal components which change significantly from one resampling to the next are probably due mostly to noise rather than signal.
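The resampling idea described above can be illustrated with a short sketch. This is a simplified illustration only and is not the Henry, Park, and Spiegelman algorithm that ESTIMATEFACTORS implements: the simulated data, the variable names, and the agreement measure (absolute cosine between matching loadings) are all assumptions made for the example. It uses only base MATLAB functions.

```matlab
% Simplified illustration of bootstrapped PCA loading stability.
% NOT the algorithm used by ESTIMATEFACTORS; all names and data are hypothetical.
m = 200; n = 10; k = 2;                 % samples, variables, true factors
T = randn(m,k) * diag([10 5]);          % scores for two strong factors
P = orth(randn(n,k));                   % orthonormal loadings
x = T*P' + 0.1*randn(m,n);              % bilinear signal plus small noise

npc   = 4;                              % number of PCs to examine
nboot = 42;                             % number of resamplings

xc = x - mean(x,1);                     % mean-center the full data
[~,~,Vref] = svd(xc,'econ');            % reference loadings
Vref = Vref(:,1:npc);

agree = zeros(nboot,npc);
for b = 1:nboot
    idx = randi(m,m,1);                 % draw rows with replacement
    xb  = x(idx,:);
    xb  = xb - mean(xb,1);              % mean-center the resample
    [~,~,Vb] = svd(xb,'econ');
    % absolute cosine between each reference loading and its resampled counterpart
    agree(b,:) = abs(sum(Vref .* Vb(:,1:npc),1));
end
stability = mean(agree,1);              % near 1 = stable loading, lower = changes between resamplings
```

In this simulation the first two entries of stability should stay close to 1, while the remaining entries, which describe noise only, tend to be lower and vary from run to run.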
The output is an estimate of the signal-to-noise ratio for each principal component. Components with ratios of 2 or below are dominated by noise, those above 3 are acceptable, and those between 2 and 3 are a judgment call. The number of factors needed to describe the data is the number of principal components with signal-to-noise ratios greater than about 2.
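As a minimal sketch of how these thresholds might be applied, the lines below count the factors in a signal-to-noise vector. The values in S are hypothetical and stand in for the output of the function; the variable names nfactors and borderline are also assumptions.

```matlab
% Hypothetical signal-to-noise ratios standing in for the output S
S = [25.3 8.1 2.6 1.4 0.9];

nfactors   = sum(S > 2);            % components with ratio above about 2
borderline = find(S > 2 & S <= 3);  % components in the judgment-call range
```

Here nfactors is 3 and borderline is 3, meaning the third component counts as a factor but sits in the range where the call is subjective.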
This function is based on an algorithm developed and Copyrighted 1997 by Ronald C. Henry, Eun Sug Park, and Clifford H. Spiegelman and used by permission of the authors. For reference see:
- Henry, R.C., Park, E.S., & Spiegelman, C.H. (1999). Comparing A New Algorithm With The Classic Methods For Estimating The Number Of Factors. Chemometrics and Intelligent Laboratory Systems, 48(1), 91-97.
- Park, E.S., Henry, R.C., & Spiegelman, C.H. (2000). Estimating The Number Of Factors To Include In A High Dimensional Multivariate Bilinear Model. Communications in Statistics-Theory and Methods, 29(3), 723-746.
Inputs
- x = bilinear data, either in the form of a double array or dataset object
Outputs
- S = vector containing an estimate of the signal to noise ratio for each principal component
Options
options = a structure array with the following fields:
- plots: ['none' | {'final'} ] Governs plotting.
- resample: [ {42} ] number of times the data is to be resampled. Generally, values of 40 or 50 are sufficient. Values greater than several hundred are not required.
- maxfactors: [ {30} ] maximum number of factors to plot (if plots are selected by options.plots).
- preprocessing: {[]} Preprocessing structure or keyword (see PREPROCESS), to apply before analyzing data.
The default options can be retrieved using: options = estimatefactors('options');
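A typical call might look like the following sketch. It uses only the option fields and defaults listed above; the placeholder data x is hypothetical, and the fallback struct exists only so that the sketch runs on a system where the function is not on the path.

```matlab
% Start from the default options if the function is available,
% otherwise from a stand-in struct built from the documented defaults.
if exist('estimatefactors','file')
    options = estimatefactors('options');
else
    options = struct('plots','final','resample',42, ...
                     'maxfactors',30,'preprocessing',[]);
end

options.plots    = 'none';   % suppress plotting
options.resample = 50;       % number of resamplings

if exist('estimatefactors','file')
    x = randn(100,8);                 % hypothetical placeholder data
    S = estimatefactors(x,options);   % signal-to-noise ratio for each PC
end
```

Starting from the retrieved defaults and overriding individual fields keeps any fields not set explicitly at their default values.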