Tools Cross-Validation

From Eigenvector Research Documentation Wiki

Revision as of 12:24, 29 July 2010

Cross-Validation Tool

You use the Cross-Validation tool to:

  • Assess the optimal complexity of a model (for example, the number of principal components in a PCA or PCR model, or the number of latent variables in a PLS model).
  • Estimate the performance of a model when you apply the model to unknown data.

For a given set of data, cross-validation involves a series of steps called subvalidation steps. In each step, you remove a subset of objects from the set of data (the test set), build a model using the remaining objects (the model building set), and then apply the resulting model to the removed objects. You note how the errors accumulate as you leave out samples to determine the number of principal components, latent variables, or factors to retain in the model. Cross-validation typically involves more than one subvalidation step, each of which selects different subsets of samples for model building and model testing. In Solo, five different cross-validation methods are available; they differ in how the sample subsets are selected for the subvalidation steps.
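
The subvalidation loop described above can be sketched in Python with NumPy. This is only an illustration, not how Solo implements it: the function names `pcr_fit_predict` and `rmsecv`, the choice of principal component regression, and the synthetic data are all assumptions made for the example. It accumulates the squared prediction error over all test sets for each candidate number of principal components.

```python
import numpy as np

def pcr_fit_predict(X_train, y_train, X_test, n_components):
    """Principal component regression (hypothetical helper for this sketch)."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    # Rows of Vt are the principal-component loadings of the model building set.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T
    T = Xc @ P  # scores of the model building set
    coef, *_ = np.linalg.lstsq(T, y_train - y_mean, rcond=None)
    return (X_test - x_mean) @ P @ coef + y_mean

def rmsecv(X, y, test_sets, max_components):
    """Root-mean-square error of cross-validation for 1..max_components.

    Assumes every object appears in exactly one test set.
    """
    n = len(y)
    press = np.zeros(max_components)  # accumulated squared prediction error
    for test_idx in test_sets:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False  # remove the test set
        for k in range(1, max_components + 1):
            pred = pcr_fit_predict(X[train_mask], y[train_mask], X[test_idx], k)
            press[k - 1] += np.sum((y[test_idx] - pred) ** 2)
    return np.sqrt(press / n)

# Synthetic data (an assumption): two latent factors, y depends on the weaker one.
rng = np.random.default_rng(0)
n = 30
scores = rng.normal(size=(n, 2)) * np.array([3.0, 1.0])
X = np.zeros((n, 6))
X[:, :2] = scores
X += 0.01 * rng.normal(size=X.shape)
y = scores[:, 1] + 0.01 * rng.normal(size=n)

# Five test sets, each taking every 5th object (0-based indices).
s = 5
test_sets = [np.arange(start, n, s) for start in range(s)]
errors = rmsecv(X, y, test_sets, max_components=4)
print(np.round(errors, 3))
```

In this sketch the error should drop sharply once the second component is included, which is the kind of pattern you look for when choosing how many components to retain.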

1. To open the Cross-Validation tool, do one of the following:
  • On the Analysis window, click Tools > Cross-Validation.
  • Click the Cross-Validation icon in the Analysis window.
  • In the Analysis window Flowchart pane, click Choose Cross-Validation.

Note: You must load data into the Analysis window before the Cross-Validation icon is available.

Figure: Cross-validation icon in the Analysis window
2. In the Cross-Validation dialog box, select the method of cross-validation that you want to use.
Figure: Cross-Validation dialog box
3. Use the slider bars to change the default values for the available parameters.

Note: Not all parameters are relevant for all cross-validation methods. The initial values that are specified for the available parameters are default values that are based on the dimensionality of the data. You can click Reset at any time to reset the parameters to their default settings. For the following descriptions:

  • n is the total number of objects in the set of data.
  • s is the number of data splits specified for the cross-validation procedure, which must be less than n/2.
  • r is the number of iterations.
{|
|+ Cross-validation methods compared
|-
|
| '''Leave One Out'''
| '''Venetian Blinds'''
| '''Contiguous Block'''
| '''Random Subsets'''
| '''Custom'''
|-
| Cross-validation method
| [[Image:CV_Leave_One_Out.jpg|86x126px]]
| [[Image:CVB_VenetianBlinds.jpg|84x126px]]
| [[Image:CV_ContinguousBlocks.jpg|85x126px]]
| [[Image:CV_RandomSubsets.jpg|85x127px]]
|
|-
| Description
| The default value. Each object in the set of data is used once as a single-object test set, and the model is built from the remaining n - 1 objects.
| Each test set is determined by selecting every s-th object in the set of data, starting at objects numbered 1 through s.
| An alternative to Venetian Blinds. Each test set is determined by selecting contiguous blocks of n/s objects in the set of data, starting at object number 1.
| s different test sets are determined through random selection of n/s objects in the set of data, such that no single object is in more than one test set. This procedure is repeated r times, where r is the number of iterations.
| You manually define each of the test sets. You can assign specific objects in your set of data in one of three ways:
* To be in every test set.
* To never be in a test set.
* To not be used in the cross-validation procedure at all.
|-
| Available Parameters
|
* Maximum Number of LVs
|
* Maximum Number of LVs
* Number of Data Splits
|
* Maximum Number of LVs
* Number of Data Splits
|
* Maximum Number of LVs
* Number of Data Splits
* Number of Iterations
|
* Number of data splits
* Object membership for each split
* Total number of objects
|-
| <nowiki>#</nowiki> of Subvalidation Steps
| n
| s
| s
| s*r
| s
|-
| <nowiki>#</nowiki> of Test Samples per Subvalidation
| 1
| n/s
| n/s
| n/s
| Varies. User-defined.
|}


4. Do one of the following:
  • Click Apply to apply these settings and keep the Cross-Validation dialog box open.
  • Click OK to apply these settings and close the Cross-Validation dialog box.
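
The splitting schemes for the first four methods in the comparison table can be illustrated with a short Python sketch. It is not Solo's implementation. The function names are hypothetical, indices are 0-based (the text above numbers objects from 1), and the handling of an n that is not divisible by s (test sets whose sizes differ by at most one) is an assumption.

```python
import numpy as np

def leave_one_out(n):
    # n test sets, each holding a single object.
    return [np.array([i]) for i in range(n)]

def venetian_blinds(n, s):
    # Test set i takes every s-th object, starting at object i.
    return [np.arange(start, n, s) for start in range(s)]

def contiguous_blocks(n, s):
    # s consecutive blocks of about n/s objects, starting at the first object.
    return np.array_split(np.arange(n), s)

def random_subsets(n, s, r, seed=0):
    # r iterations; within each, s disjoint random test sets of about n/s objects.
    rng = np.random.default_rng(seed)
    test_sets = []
    for _ in range(r):
        shuffled = rng.permutation(n)
        test_sets.extend(np.array_split(shuffled, s))
    return test_sets

n, s, r = 12, 3, 2
loo = leave_one_out(n)
vb = venetian_blinds(n, s)
cb = contiguous_blocks(n, s)
rs = random_subsets(n, s, r)

print("Leave One Out:     ", len(loo), "test sets")
print("Venetian Blinds:   ", [t.tolist() for t in vb])
print("Contiguous Blocks: ", [t.tolist() for t in cb])
print("Random Subsets:    ", len(rs), "test sets")
```

The number of test sets each function returns matches the number of subvalidation steps in the table: n for Leave One Out, s for Venetian Blinds and Contiguous Block, and s*r for Random Subsets.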