ToolboxPerformance: Difference between revisions

From Eigenvector Research Documentation Wiki
imported>Mathias
imported>Donal

Latest revision as of 22:35, 13 September 2016

PLS_Toolbox Performance

This page presents performance results for the main regression methods in PLS_Toolbox so that their time and memory requirements can be compared across datasets of various sizes. The results were obtained on a Mac running OS X El Capitan with a 2.8 GHz Intel CPU and 16 GB RAM, using MATLAB R2015a and PLS_Toolbox version 8.1.

All results were obtained by building models from the command line using synthetic random datasets. Autoscale preprocessing was used with the default option values for all methods. Cross-validation was not performed.
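
The procedure described above (generate a synthetic random dataset, autoscale it, time the model build) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the script used to produce the results on this page: the helper names `make_random_data`, `autoscale` and `time_build` are hypothetical, and the autoscaling step itself stands in for the model-building call, which in practice would be a PLS_Toolbox function.

```python
import random
import statistics
import time


def make_random_data(n_samples, n_vars, seed=0):
    """Synthetic random dataset: n_samples rows of n_vars standard-normal values."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_vars)] for _ in range(n_samples)]


def autoscale(rows):
    """Mean-center each column and scale it to unit (sample) standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def time_build(build, n_samples, n_vars, repeats=3):
    """Best wall-clock time in seconds of build(data) over a few repeats."""
    data = make_random_data(n_samples, n_vars)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        build(data)  # stand-in for the model-building call being benchmarked
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Deliberately tiny sizes; the tables on this page use far larger datasets.
    for n_samples, n_vars in [(200, 10), (500, 10), (500, 20)]:
        seconds = time_build(autoscale, n_samples, n_vars)
        print(f"{n_samples} samples x {n_vars} variables: {seconds:.4f} s")
```

Taking the best of several repeats reduces the influence of background activity on the measurement; whether the figures on this page are single runs or averages is not stated.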


PCA time in seconds required to build model

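{| border="1" cellpadding="5" cellspacing="0"
| || '''1000 variables''' || '''2000 variables''' || '''5000 variables'''
|- valign="top"
| '''20000 samples''' || 2 || 4.7 || 40
|- valign="top"
| '''50000 samples''' || 3.5 || 9 || 61
|}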

PCA memory requirements

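{| border="1" cellpadding="5" cellspacing="0"
| || '''1000 variables''' || '''2000 variables''' || '''5000 variables'''
|- valign="top"
| '''20000 samples''' || 0.2 GB || 1 GB || 3.5 GB
|- valign="top"
| '''50000 samples''' || 3.55 GB || 9 GB || 10.5 GB
|}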
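
As a point of reference for the memory tables on this page, the size of the raw data matrix alone can be estimated from its dimensions, since MATLAB stores numeric data as 8-byte doubles by default. The sketch below is illustrative only; the helper `matrix_gb` is a hypothetical name, and the estimate covers only one dense copy of the data, not the model or any intermediate arrays.

```python
BYTES_PER_DOUBLE = 8  # MATLAB's default numeric type is a 64-bit double


def matrix_gb(n_samples, n_vars):
    """Size in GB (10**9 bytes) of one dense double-precision data matrix."""
    return n_samples * n_vars * BYTES_PER_DOUBLE / 1e9


if __name__ == "__main__":
    for n_samples in (20000, 50000):
        for n_vars in (1000, 2000, 5000):
            size = matrix_gb(n_samples, n_vars)
            print(f"{n_samples} samples x {n_vars} variables: {size:.2f} GB")
```

For example, 50000 samples by 5000 variables comes to 2 GB for a single copy of the data. The measured figures in the tables are larger, which would be consistent with additional working copies being held during preprocessing and model building, although the page does not say how memory use was measured.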


PCR time in seconds required to build model

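{| border="1" cellpadding="5" cellspacing="0"
| || '''1000 variables''' || '''2000 variables''' || '''5000 variables'''
|- valign="top"
| '''20000 samples''' || 3 || 6 || 44
|- valign="top"
| '''50000 samples''' || 5 || 12 || 71
|}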

PCR memory requirements

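{| border="1" cellpadding="5" cellspacing="0"
| || '''1000 variables''' || '''2000 variables''' || '''5000 variables'''
|- valign="top"
| '''20000 samples''' || 0.2 GB || 1 GB || 4 GB
|- valign="top"
| '''50000 samples''' || 0.5 GB || 4 GB || 11 GB
|}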


PLS time in seconds required to build model

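{| border="1" cellpadding="5" cellspacing="0"
| || '''1000 variables''' || '''2000 variables''' || '''5000 variables'''
|- valign="top"
| '''20000 samples''' || 3.3 || 8 || 43
|- valign="top"
| '''50000 samples''' || 8 || 18 || 98
|}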

PLS Memory requirements

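{| border="1" cellpadding="5" cellspacing="0"
| || '''100 variables''' || '''500 variables''' || '''1000 variables'''
|- valign="top"
| '''100 samples''' || 1 GB || 2 GB || 5 GB
|- valign="top"
| '''500 samples''' || 1.6 GB || 5.2 GB || 13 GB
|}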


LWR time in seconds required to build model

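{| border="1" cellpadding="5" cellspacing="0"
| || '''1000 variables''' || '''5000 variables''' || '''10000 variables'''
|- valign="top"
| '''20000 samples''' || 4 || 65 || 76
|- valign="top"
| '''50000 samples''' || 10 || 77 || 670
|}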

LWR memory requirements

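{| border="1" cellpadding="5" cellspacing="0"
| || '''1000 variables''' || '''2000 variables''' || '''5000 variables'''
|- valign="top"
| '''20000 samples''' || <1 GB || 2 GB || 3.5 GB
|- valign="top"
| '''50000 samples''' || 0.6 GB || 3 GB || 6.75 GB
|}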


ANN time in seconds required to build model

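{| border="1" cellpadding="5" cellspacing="0"
| || '''100 variables''' || '''500 variables''' || '''1000 variables'''
|- valign="top"
| '''500 samples''' || 6 || 28 || 95
|- valign="top"
| '''1000 samples''' || 10 || 370 || 360
|- valign="top"
| '''2000 samples''' || 12 || 550 || 2810
|}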


SVM time in seconds required to train model

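{| border="1" cellpadding="5" cellspacing="0"
| || '''100 variables''' || '''500 variables''' || '''2000 variables'''
|- valign="top"
| '''100 samples''' || 8 || 28 || 105
|- valign="top"
| '''500 samples''' || 150 || 640 || 2370
|- valign="top"
| '''1000 samples''' || || ||
|}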


SVM with PCA compression time in seconds required to build model

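{| border="1" cellpadding="5" cellspacing="0"
| || '''100 variables''' || '''500 variables''' || '''1000 variables'''
|- valign="top"
| '''100 samples''' || 4 || 4 || 4
|- valign="top"
| '''500 samples''' || 38 || 38 || 38
|- valign="top"
| '''1000 samples''' || || ||
|}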

SVM memory requirements

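{| border="1" cellpadding="5" cellspacing="0"
| || '''100 variables''' || '''500 variables''' || '''1000 variables'''
|- valign="top"
| '''100 samples''' || || ||
|- valign="top"
| '''500 samples''' || || ||
|- valign="top"
| '''1000 samples''' || || ||
|}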