Release Notes Version 8.0

From Eigenvector Research Documentation Wiki

Latest revision as of 14:58, 11 June 2015

Version 8.0 of PLS_Toolbox and Solo was released in June 2015.

For general product information, see the PLS_Toolbox Product Page. For information on Solo, see the Solo Product Page.

New Features in Solo and PLS_Toolbox

Multi-Block, and Model and Data Fusion Tool

Multiblock Tool - Interface to view, manipulate, and join data. Can be used for data and model fusion, or multi-block modeling.

  • Join multiple blocks of variables measured on the same samples (alignment based on labels, axis scales, or size).
  • Automatically align and join time-based blocks of data (based on time axis scale).
  • Optionally build models on one or more blocks and join outputs from those blocks (model fusion).
  • Choose and apply block-specific preprocessing before joining.
  • Save a multiblock model for joining new data, including application of the defined preprocessing and models.
  • After building a model from joined data, Analysis automatically splits loadings into component block segments for ease of interpretation.
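
As a rough illustration of label-based alignment (the first bullet above), the sketch below joins blocks column-wise on shared sample labels. It is not the Multiblock Tool's actual algorithm: the function name `join_blocks_by_label`, the dict-based block layout, and the choice to keep only samples present in every block are all assumptions made for this example.

```python
def join_blocks_by_label(blocks):
    """Join blocks column-wise on sample labels (illustrative only).

    Each block is a dict mapping a sample label to a list of variable values.
    Assumption: only samples whose label appears in every block are kept,
    and the output order follows the first block.
    """
    if not blocks:
        return [], []
    common = set(blocks[0])
    for block in blocks[1:]:
        common &= set(block)
    labels = [label for label in blocks[0] if label in common]
    # Concatenate the variables of every block for each retained sample.
    joined = [[v for block in blocks for v in block[label]] for label in labels]
    return labels, joined


# Hypothetical data: two blocks measured on overlapping sets of samples.
nir = {"s1": [0.1, 0.2], "s2": [0.3, 0.4], "s3": [0.5, 0.6]}
raman = {"s2": [10.0], "s3": [20.0], "s4": [30.0]}
labels, joined = join_blocks_by_label([nir, raman])
```

Here only `s2` and `s3` survive the join, because they are the only labels common to both blocks.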

Analysis and Models

  • MLSCA - Multi-level simultaneous component analysis method added.
  • ASCA - Add model of residuals to help assess fit.
  • Shortcuts to the Data Fusion methods: Multiblock Tool and Hierarchical Model Builder.
  • Re-designed Analysis and Preprocessing menus for ease-of-use and consistency.
  • ANN now supports custom cross-validation.
  • PLSDA variance captured plot now available.
  • Add "Reduced" Q and T^2 statistics for all factor-based models (normalized to confidence limit).
  • Add quick-access to Genetic Algorithm variable selection from the iPLS, iPLSDA, and Stepwise Selection interfaces.
  • confusionmatrix - Report additional quantities for each class: count, classification error, precision and F1 score.
    • Standardize terminology: TP = count, TPR = proportion (rate) for confusion matrix quantities, and labels shown.
  • simca - Better handling of full-rank PCA sub-models (where Q residuals are zero).
  • Nearest neighbor score distance now normalized to maximum calibration value (standard practice for inlier tests).
  • "Data" drill-down button in scores and loadings plots now automatically drills into preprocessed data and X_hat (fit, residuals) data if any plots of those types are already open.
  • Add Q2Y and R2Y (see http://www.eigenvector.com/faq/index.php?id=150) to the statistics calculated for models (for comparison with other software).
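
The "Reduced" statistics above are described as normalized to the confidence limit. One plausible reading, sketched below, is a simple division of each value by its limit, so that a reduced value above 1 marks a sample outside the limit. The function name `reduced_statistic` and the plain-division form are assumptions for illustration, not the toolbox's documented computation.

```python
def reduced_statistic(values, limit):
    """Divide each statistic (e.g. Q or T^2) by its confidence limit.

    Assumption: "reduced" means value / limit, so 1.0 sits exactly on the limit.
    """
    if limit <= 0:
        raise ValueError("confidence limit must be positive")
    return [v / limit for v in values]


# Hypothetical Q residuals for three samples and a hypothetical limit of 4.0.
reduced_q = reduced_statistic([2.0, 4.0, 6.0], 4.0)
outside_limit = [r > 1.0 for r in reduced_q]
```

Putting Q and T^2 on the same unitless scale in this way makes it possible to compare them on one plot with a shared threshold at 1.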

Plotting

  • Additional context-menu options for managing line width and symbol size.
  • Add quick access to class symbol sets in context menu.
  • Significantly faster selection display and "linking" between figures.
  • Connect Classes and View Classes buttons now have drop-down menus to display options.
  • Added Compress X-axis Gaps toolbar button to remove gaps caused by excluded variables or samples.
  • Improve handling of zoom status in newer versions of MATLAB.
  • Better handling of font sizes on different screen sizes and platforms.
  • Fix shifting control position issues with newer versions of MATLAB.

Importers

  • Automatic reconciliation of mixed axis scales when importing multiple files (using matchvars). Data will automatically include as much of the original data as possible.
  • omnicreadr - New importer for OMNICix HDF5 image files.
  • hjyreadr - Support for importing on 64-bit Windows systems and for new LabSpec file formats.
  • textreadr and xlsreadr - Improved handling of multiple file import using graphically-selected parsing options. Options selected on first file/sheet are now used on ALL subsequent files/sheets.

Preprocessing

  • glog - Generalized Log Transform added to preprocessing options.
  • pqnorm - Probabilistic Quotient Normalization added to preprocessing options.
  • Group Scale
    • Added option to disable mean centering of block (scale only).
    • Added easier selection of class set to use when identifying blocks.
  • Added "Block Variance Scaling" as new method based on gscale.
  • EEM Filtering (flucut)
    • Added better Raman and Rayleigh filtering using interpolation (now the default).
    • Added support for blank subtraction (choose one sample as a blank).
  • glsw - Clarified how ELS/EMM and EPO options are related.
  • Add support for handling missing data in both normaliz and mscorr (median only).
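
For readers unfamiliar with the generalized log transform, the sketch below uses one commonly cited form, ln(x + sqrt(x^2 + lambda)). These notes do not state which form or default parameter `glog` uses, so both the formula variant and the parameter name `lam` are assumptions here.

```python
import math


def glog_transform(x, lam=1.0):
    """Generalized log of x: ln(x + sqrt(x**2 + lam)).

    Assumption: this is one common variant; the toolbox function may differ.
    Unlike a plain log, it is defined at zero and for negative values.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    return math.log(x + math.sqrt(x * x + lam))


# Near zero the transform is roughly linear; for large x it approaches ln(2x).
at_zero = glog_transform(0.0, lam=1.0)
at_large = glog_transform(1000.0, lam=1.0)
```

The parameter `lam` controls where the transform switches from near-linear to logarithmic behavior, which is why it is usually tuned to the noise level of the data.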

Other Interfaces

  • Model Optimizer - Better handling of numeric data in comparison table, additional statistics, and improved handling of include field.
    • Add better support for model groupings in PLSDA and SVMDA within model optimizer.
    • Add support for more LWR options.
    • LWR models: Add "Survey" button to Analysis window to automatically survey over a range of "Local Points".
  • Better help integration with newer versions of MATLAB.
  • Hierarchical Model Builder - Add vertical scrolling.
  • TrendTool - Add "maximum between" option for markers: returns the maximum value between two markers (better identifies the peak value when the peak shape may change).
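
The "maximum between" marker option can be pictured as taking the largest response inside the window bounded by two markers. The sketch below is only an illustration of that idea: the function name `maximum_between`, the inclusive window bounds, and the list-based inputs are assumptions, not TrendTool's implementation.

```python
def maximum_between(axis, values, marker_a, marker_b):
    """Return the largest value whose axis position lies between two markers.

    Assumptions: bounds are inclusive, and markers may be given in either order.
    """
    low, high = sorted((marker_a, marker_b))
    window = [v for a, v in zip(axis, values) if low <= a <= high]
    if not window:
        raise ValueError("no axis points fall between the markers")
    return max(window)


# Hypothetical trace with a peak that shifts between axis positions 3 and 5.
axis = [1, 2, 3, 4, 5]
values = [0.1, 0.9, 0.4, 1.5, 0.2]
peak = maximum_between(axis, values, 5, 3)
```

Because the maximum is taken over a window rather than read at one fixed position, the result still tracks the peak height when the peak drifts or changes shape.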

Model Objects

  • Build and change history now captured in history field of Model Object.
  • Add .scoredistance and .esterror as virtual properties for models. These properties can now be accessed directly from models in PLS_Toolbox or Solo_Predictor scripts.


New Command-line Features and Functions

Misc New Functions

  • eemoutlier - New function for automatically removing outliers in fluorescence PARAFAC models.
  • glog - Generalized Log variable scaling.
  • kurtosis - Added kurtosis statistic function to distribution fitting toolbox.
  • mlsca - Multi-level Simultaneous Component Analysis.
  • multiblock - Create or apply a multiblock model for joining data.
  • pqnorm - Probabilistic Quotient Normalization for samples.
  • skewness - Added skewness statistic function to distribution fitting toolbox.
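
As background for `pqnorm`, the sketch below follows the usual outline of probabilistic quotient normalization: build a reference spectrum, divide each sample by it variable by variable, and rescale the sample by the median of those quotients. The function name `pqn_normalize`, the median-of-samples reference, and the omission of any preliminary total-area normalization are assumptions for this example and may not match the toolbox function.

```python
from statistics import median


def pqn_normalize(samples, reference=None):
    """Scale each sample (row) by the median of its quotients to a reference.

    Assumption: when no reference is given, the per-variable median across
    all samples is used. Variables with a zero reference value are skipped.
    """
    if reference is None:
        reference = [median(column) for column in zip(*samples)]
    normalized = []
    for row in samples:
        quotients = [x / r for x, r in zip(row, reference) if r != 0]
        if not quotients:
            raise ValueError("reference has no non-zero variables")
        factor = median(quotients)
        if factor == 0:
            raise ValueError("median quotient is zero; cannot normalize")
        normalized.append([x / factor for x in row])
    return normalized


# Hypothetical spectra that differ only by a dilution factor.
spectra = [[1.0, 2.0, 4.0], [2.0, 4.0, 8.0], [4.0, 8.0, 16.0]]
normalized = pqn_normalize(spectra)
```

In this toy case all three rows collapse onto the same profile, which is the intended effect when samples differ only in overall concentration.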

Command-Line Changes

  • als - Sort components by variance captured (if no constraints otherwise defining order).
  • comparemodels - Report the mean (class count weighted) of Classification Error, Precision, F1 Score for classification models.
  • confusionmatrix - Report additional quantities for each class: count, classification error, precision and F1 score.
    • Standardize terminology: TP = count, TPR = proportion (rate) for confusion matrix quantities, and labels shown.
  • cov_cv - Changed from SVDS to SVD to improve behavior with nearly-rank-deficient cases.
  • crossval - Better handling of cross-validation when using PLSDA.
    • Convert plsda regression method input to be 'sim' (to speed it up).
    • Recognize when user has passed single-column (either logical or class) and force it to be multi-column logical.
  • histaxes - Fix for when NaNs are present in data.
  • jmlimit - Better handle degenerate cases when multiple confidence levels are requested (return VECTOR of zeros instead of single zero).
  • matchvars - Add option to input a cell array of dataset objects which will be joined after reconciling variables to make the least changes in data.
  • mdcheck - Allow use of KNN as data replacement method (replace missing data with data from sample(s) which are closest).
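
To make the `confusionmatrix` and `comparemodels` entries above concrete, the sketch below computes per-class TP (a count), TPR (a rate), precision, and F1 from a confusion matrix, then a class-count weighted mean. The function names, the rows-are-actual layout, and the reading of "count" as the number of samples in each class are assumptions. Classification error is left out because these notes do not define it.

```python
def per_class_metrics(matrix):
    """Per-class metrics from matrix[i][j] = actual class i predicted as class j."""
    n = len(matrix)
    results = []
    for k in range(n):
        tp = matrix[k][k]                              # true positives (a count)
        fn = sum(matrix[k]) - tp                       # missed members of class k
        fp = sum(matrix[i][k] for i in range(n)) - tp  # others predicted as class k
        count = tp + fn                                # samples actually in class k
        tpr = tp / count if count else 0.0             # true positive rate
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        denom = precision + tpr
        f1 = 2 * precision * tpr / denom if denom else 0.0
        results.append({"count": count, "TP": tp, "TPR": tpr,
                        "precision": precision, "F1": f1})
    return results


def weighted_mean(results, key):
    """Mean of one metric across classes, weighted by class count."""
    total = sum(r["count"] for r in results)
    return sum(r[key] * r["count"] for r in results) / total


# Hypothetical two-class result: 10 samples per class.
metrics = per_class_metrics([[8, 2], [1, 9]])
mean_tpr = weighted_mean(metrics, "TPR")
```

Weighting by class count keeps a small class from dominating the summary, which matters when comparing models built on unbalanced data.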