Tools Cross-Validation: Difference between revisions

From Eigenvector Research Documentation Wiki
imported>Jeremy
No edit summary
 
(22 intermediate revisions by 2 users not shown)
Line 1: Line 1:
__TOC__
__TOC__
[[TableOfContents|Table of Contents]] | [[ModelApplication_ValidationPhase|Previous]] | [[Tools_ModelRobustness|Next]]
[[TableOfContents|Table of Contents]] | [[ModelApplication_ValidationPhase|Previous]] | [[Tools_ModelRobustness|Next]]


Line 6: Line 7:
You use the Cross-Validation tool to:
You use the Cross-Validation tool to:


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
{|
<tr valign="baseline">
 
<td style="width: 18pt">*</td>
|- valign="top"  
<td>Assess the optimal complexity of a model (for example, the number of principal components in a PCA or PCR model, or the number of latent variables in a PLS model).</td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|
<tr valign="baseline">
* Assess the optimal complexity of a model (for example, the number of principal components in a PCA or PCR model, or the number of latent variables in a PLS model).
<td style="width: 18pt">*</td>
<td>Estimate the performance of a model when you apply the model to unknown data. </td>
</tr>
</table>


For a given set of data, cross-validation consists of a series of subvalidation steps. In each step, you remove a subset of objects from the set of data (the test set), build a model using the remaining objects (the model building set), and then apply the resulting model to the removed objects. You note how the errors accumulate as you leave out samples to determine the number of principal components, latent variables, or factors to retain in the model. Cross-validation typically involves more than one subvalidation step, each of which selects different subsets of samples for model building and model testing. In Solo, five different cross-validation methods are available; they differ in how the sample subsets are selected for the subvalidation steps.
|}


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
{|
<tr valign="baseline">
<td style="width: 18pt">1.</td>
<td>To open the Cross-Validation tool, do one of the following:</td>
</tr>
</table>


:
|- valign="top"  
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
<tr valign="baseline">
<td style="width: 18pt">*</td>
<td>On the Analysis window, click Tools &gt; Cross-Validation.</td>
</tr>
</table>


:
|
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
* Estimate the expected performance of a model when you apply the model to unknown data.  
<tr valign="baseline">
<td style="width: 18pt">*</td>
<td>Click the Cross-Validation icon in the Analysis window.</td>
</tr>
</table>


Note: You must load data into the Analysis window before the Cross-Validation icon is available.
|}


::''Cross-validation icon in the Analysis window''
For a given set of data, cross-validation consists of a series of subvalidation steps. In each step, you remove a subset of objects from the set of data (the test set), build a model using the remaining objects (the model building set), and then apply the resulting model to the removed objects. You note how the errors accumulate as you leave out samples to determine the number of principal components, latent variables, or factors to retain in the model. Cross-validation typically involves more than one subvalidation step, each of which selects different subsets of samples for model building and model testing. In Solo, five different cross-validation methods are available; they differ in how the sample subsets are selected for the subvalidation steps.


::[[Image:Cross_validation_icon_Analysis_window.png|406x83px]]
{|  
::
 
|- valign="top"


:
|1.
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
<tr valign="baseline">
<td style="width: 18pt">*</td>
<td>In the Analysis window Flowchart pane, click Choose Cross-Validation.</td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|To open the Cross-Validation tool, do one of the following:
<tr valign="baseline">
<td style="width: 18pt">2.</td>
<td>In the Cross-Validation dialog box, select the method of cross-validation that you want to use.</td>
</tr>
</table>


::''Cross-Validation dialog box''
|}


::[[Image:Cross_validation_icon_dialog_box.png|288x138px]]
{| style="margin-left:18pt" 
::


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|- valign="top"  
<tr valign="baseline">
<td style="width: 18pt">3.</td>
<td>Use the slider bars to change the default values for the available parameters.</td>
</tr>
</table>


Note: Not all parameters are relevant for all cross-validation methods. The initial values that are specified for the available parameters are default values that are based on the dimensionality of the data. You can click Reset at any time to reset the parameters to their default settings. For the following descriptions:
|
* On the Analysis window, click Tools &gt; Cross-Validation.


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|}
<tr valign="baseline">
<td style="width: 18pt">*</td>
<td>n is the total number of objects in the set of data.</td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
{| style="margin-left:18pt"
<tr valign="baseline">
<td style="width: 18pt">*</td>
<td>s is the number of data splits specified for the cross-validation procedure, which must be less than n/2. </td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|- valign="top"  
<tr valign="baseline">
<td style="width: 18pt">*</td>
<td>r is the number of iterations.</td>
</tr>
</table>


::''Cross-validation methods compared''
|
* Click the Cross-Validation icon in the Analysis window.
'''Note:''' You must load data into the Analysis window before the Cross-Validation icon is available.
|}


:''Cross-validation icon in the Analysis window''
::[[Image:Cross_validation_icon_Analysis_window.png|406x83px]]
::
::
::
<table style="border-collapse: collapse; margin-bottom: 12.0pt; margin-left: 0pt; margin-right: 0pt; margin-top: 3.0pt; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; text-align: left; width: 683.2512pt;" cellspacing="0" summary="">


<tr>
{| style="margin-left:18pt" 
<td style="background-color: #E8E8E8; border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top; width: 83.25pt;">
 
|- valign="top"
 
|
* In the Analysis window Flowchart pane, click Choose Cross-Validation.
 
|}


''''''
{|
</td>
<td style="background-color: #E8E8E8; border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top; width: 120.00024pt;">


'''Leave One Out'''
|- valign="top"  
</td>
<td style="background-color: #E8E8E8; border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top; width: 120.00024pt;">


'''Venetian Blinds'''
|2.
</td>
<td style="background-color: #E8E8E8; border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top; width: 120.00024pt;">


'''Contiguous Block'''
|In the Cross-Validation dialog box, select the method of cross-validation that you want to use. A visual representation of the splits appears at the top of the window; other relevant information and warnings appear at the bottom.
</td>
<td style="background-color: #E8E8E8; border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top; width: 120.00024pt;">


'''Random Subsets'''
|}
</td>
<td style="background-color: #E8E8E8; border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top; width: 120.00024pt;">


'''Custom'''
:''Cross-Validation dialog box''
</td>
</tr>
<tr>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
Cross-validation method
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
[[Image:CV_Leave_One_Out.jpg|86x126px]]
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
[[Image:CVB_VenetianBlinds.jpg|84x126px]]
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
[[Image:CV_ContinguousBlocks.jpg|85x126px]]
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
[[Image:CV_RandomSubsets.jpg|85x127px]]
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">


</td>
::[[Image:CrossValGUI.png|500px||CrossVal Window]]
</tr>
::
<tr>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
Description
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
The default value. Each object in the set of data is used once as a single-object test set, and all remaining objects are used to build the model.
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
Each test set is determined by selecting every s-th object in the set of data, starting at objects numbered 1 through s.
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
An alternative to Venetian Blinds. Each test set is determined by selecting contiguous blocks of n/s objects in the set of data, starting at object number 1.
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
"s" different test sets are determined through random selection of n/s objects in the set of data, such that no single object is in more than one test set. This procedure is repeated "r" times, where "r" is the number of iterations.
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
You manually define each of the test sets. You can assign specific objects in your set of data in one of three ways:


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
{|
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>To be in every test set.</td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|- valign="top"  
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>To never be in a test set.</td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|3.
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>To not be used in the cross-validation procedure at all.</td>
</tr>
</table>


</td>
|Use the slider bars to change the default values for the available parameters.
</tr>
<tr>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
Available Parameters
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
Maximum Number of LVs
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|}
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>Maximum Number of LVs</td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
:'''Note:''' Not all parameters are relevant for all cross-validation methods. The initial values that are specified for the available parameters are default values that are based on the dimensionality of the data. You can click Reset at any time to reset the parameters to their default settings. See the [[Using_Cross-Validation]] page for details on how to use the different methods and settings.
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>Number of Data Splits</td>
</tr>
</table>


</td>
{|
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|- valign="top"  
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>Maximum Number of LVs</td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|4.
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>Number of Data Splits</td>
</tr>
</table>


</td>
|Do one of the following:
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|}
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>Maximum Number of LVs</td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
{| style="margin-left:18pt"
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>Number of Data Splits</td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|- valign="top"  
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>Number of Iterations</td>
</tr>
</table>


</td>
|
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
* Click Apply to apply these settings and keep the Cross-Validation dialog box open.


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|}
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>Number of data splits </td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
{| style="margin-left:18pt"
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>Object membership for each split </td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|- valign="top"  
<tr valign="baseline">
<td style="width: 10.8pt">*</td>
<td>Total number of objects</td>
</tr>
</table>


</td>
|
</tr>
* Click OK to apply these settings and close the Cross-Validation dialog box.
<tr>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
<nowiki>#</nowiki> of Subvalidation Steps
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
n
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
s
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
s
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
(s*r)
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
s
</td>
</tr>
<tr>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
<nowiki>#</nowiki> of Test Samples per Subvalidation
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
1
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
n/s
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
n/s
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
n/s
</td>
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;">
Varies. User-defined.
</td>
</tr>
</table>


<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
|}
<tr valign="baseline">
<td style="width: 18pt">4.</td>
<td>Do one of the following:</td>
</tr>
</table>


:
===Technical Details===
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
<tr valign="baseline">
<td style="width: 18pt">*</td>
<td>Click Apply to apply these settings and keep the Cross-Validation dialog box open.</td>
</tr>
</table>


:
More technical details can be found on the [[Using_Cross-Validation]] page.
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary="">
<tr valign="baseline">
<td style="width: 18pt">*</td>
<td>Click OK to apply these settings and close the Cross-Validation dialog box.</td>
</tr>
</table>

Latest revision as of 13:16, 11 August 2020


Cross-Validation Tool

You use the Cross-Validation tool to:

  • Assess the optimal complexity of a model (for example, the number of principal components in a PCA or PCR model, or the number of latent variables in a PLS model).
  • Estimate the expected performance of a model when you apply the model to unknown data.

For a given set of data, cross-validation consists of a series of subvalidation steps. In each step, you remove a subset of objects from the set of data (the test set), build a model using the remaining objects (the model building set), and then apply the resulting model to the removed objects. You note how the errors accumulate as you leave out samples to determine the number of principal components, latent variables, or factors to retain in the model. Cross-validation typically involves more than one subvalidation step, each of which selects different subsets of samples for model building and model testing. In Solo, five different cross-validation methods are available; they differ in how the sample subsets are selected for the subvalidation steps.
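
The way errors accumulate over subvalidation steps can be illustrated outside of Solo with a short Python sketch. This is not Solo's implementation: the function names `pcr_predict` and `rmsecv`, the choice of principal component regression, the leave-one-out test sets, and the toy data are all assumptions made for the example. For each test set, the model is rebuilt from the remaining objects at every candidate complexity, and the squared prediction errors on the removed objects are summed.

```python
import numpy as np

def pcr_predict(X_train, y_train, X_test, k):
    """Fit a k-component principal component regression on the
    model building set and predict the test set (illustrative helper)."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    # Rows of Vt are the principal component loadings.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                      # variables x k
    T = Xc @ P                        # scores of the model building set
    b, *_ = np.linalg.lstsq(T, y_train - y_mean, rcond=None)
    return (X_test - x_mean) @ P @ b + y_mean

def rmsecv(X, y, test_sets, max_components):
    """Accumulate squared errors over all subvalidation steps and
    return one cross-validation error per number of components."""
    press = np.zeros(max_components)
    n_predictions = 0
    for test_idx in test_sets:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False       # remove the test set
        for k in range(1, max_components + 1):
            y_hat = pcr_predict(X[train], y[train], X[test_idx], k)
            press[k - 1] += np.sum((y[test_idx] - y_hat) ** 2)
        n_predictions += len(test_idx)
    return np.sqrt(press / n_predictions)

# Toy data: y depends on two underlying factors.
t1 = np.linspace(-1.0, 1.0, 20)
t2 = t1 ** 2 - np.mean(t1 ** 2)
X = np.column_stack([3.0 * t1, t2])
y = 0.5 * t1 + t2

# Leave-one-out: every object is a test set exactly once.
test_sets = [[i] for i in range(len(y))]
errors = rmsecv(X, y, test_sets, max_components=2)
print(errors)
```

In this toy case the error drops sharply when the second component is added, which is the kind of pattern you look for when choosing how many components to retain.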

1. To open the Cross-Validation tool, do one of the following:
  • On the Analysis window, click Tools > Cross-Validation.
  • Click the Cross-Validation icon in the Analysis window.
  • In the Analysis window Flowchart pane, click Choose Cross-Validation.

Note: You must load data into the Analysis window before the Cross-Validation icon is available.

Figure: Cross-validation icon in the Analysis window
2. In the Cross-Validation dialog box, select the method of cross-validation that you want to use. A visual representation of the splits appears at the top of the window; other relevant information and warnings appear at the bottom.
Figure: Cross-Validation dialog box
3. Use the slider bars to change the default values for the available parameters.
Note: Not all parameters are relevant for all cross-validation methods. The initial values that are specified for the available parameters are default values that are based on the dimensionality of the data. You can click Reset at any time to reset the parameters to their default settings. See the Using_Cross-Validation page for details on how to use the different methods and settings.
4. Do one of the following:
  • Click Apply to apply these settings and keep the Cross-Validation dialog box open.
  • Click OK to apply these settings and close the Cross-Validation dialog box.
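
To make the effect of the split settings more concrete, the following Python sketch generates test sets for four of the selection schemes. It is only an approximation written for this page, not Solo's code: the function names are hypothetical, indices are zero-based (object 1 is index 0), and the handling of a remainder when n is not divisible by s is an assumption. Here n is the total number of objects, s is the number of data splits, and r is the number of iterations.

```python
import random

def leave_one_out(n):
    # n test sets, each holding a single object.
    return [[i] for i in range(n)]

def venetian_blinds(n, s):
    # Test set j holds every s-th object, starting at object j.
    return [list(range(start, n, s)) for start in range(s)]

def contiguous_blocks(n, s):
    # s consecutive blocks of roughly n/s objects each.
    bounds = [(i * n) // s for i in range(s + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(s)]

def random_subsets(n, s, r, seed=0):
    # r iterations; within an iteration the s test sets are disjoint.
    rng = random.Random(seed)
    test_sets = []
    for _ in range(r):
        order = list(range(n))
        rng.shuffle(order)
        test_sets.extend(sorted(order[start::s]) for start in range(s))
    return test_sets

if __name__ == "__main__":
    print(venetian_blinds(7, 3))
    print(contiguous_blocks(7, 3))
    print(len(random_subsets(10, 2, 3)))
```

The number of test sets returned is the number of subvalidation steps: n for leave-one-out, s for the two ordered schemes, and s*r for random subsets.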

Technical Details

More technical details can be found on the Using_Cross-Validation page.