Tools Cross-Validation: Difference between revisions
imported>Jeremy No edit summary |
imported>Jeremy No edit summary |
||
Line 197: | Line 197: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>To be in every test set.</td> | <td>To be in every test set.</td> | ||
</tr> | </tr> | ||
Line 204: | Line 204: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>To never be in a test set.</td> | <td>To never be in a test set.</td> | ||
</tr> | </tr> | ||
Line 211: | Line 211: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>To not be used in the cross- validation procedure at all.</td> | <td>To not be used in the cross- validation procedure at all.</td> | ||
</tr> | </tr> | ||
Line 229: | Line 229: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>Maximum Number of LVs</td> | <td>Maximum Number of LVs</td> | ||
</tr> | </tr> | ||
Line 236: | Line 236: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>Number of Data Splits</td> | <td>Number of Data Splits</td> | ||
</tr> | </tr> | ||
Line 246: | Line 246: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>Maximum Number of LVs</td> | <td>Maximum Number of LVs</td> | ||
</tr> | </tr> | ||
Line 253: | Line 253: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>Number of Data Splits</td> | <td>Number of Data Splits</td> | ||
</tr> | </tr> | ||
Line 263: | Line 263: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>Maximum Number of LVs</td> | <td>Maximum Number of LVs</td> | ||
</tr> | </tr> | ||
Line 270: | Line 270: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>Number of Data Splits</td> | <td>Number of Data Splits</td> | ||
</tr> | </tr> | ||
Line 277: | Line 277: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>Number of Iterations</td> | <td>Number of Iterations</td> | ||
</tr> | </tr> | ||
Line 287: | Line 287: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>Number of data splits </td> | <td>Number of data splits </td> | ||
</tr> | </tr> | ||
Line 294: | Line 294: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>Object membership for each split </td> | <td>Object membership for each split </td> | ||
</tr> | </tr> | ||
Line 301: | Line 301: | ||
<table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | <table border="0" cellspacing="0" cellpadding="0" width="99%" summary=""> | ||
<tr valign="baseline"> | <tr valign="baseline"> | ||
<td style="width: 10.8pt"> | <td style="width: 10.8pt">*</td> | ||
<td>Total number of objects</td> | <td>Total number of objects</td> | ||
</tr> | </tr> | ||
Line 322: | Line 322: | ||
</td> | </td> | ||
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;"> | <td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;"> | ||
(s | (s*r) | ||
</td> | </td> | ||
<td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;"> | <td style="border-bottom-color: #000000; border-bottom-style: solid; border-bottom-width: 1px; border-left-color: #000000; border-left-style: solid; border-left-width: 1px; border-right-color: #000000; border-right-style: solid; border-right-width: 1px; border-top-color: #000000; border-top-style: solid; border-top-width: 1px; padding-bottom: 1pt; padding-left: 6pt; padding-right: 6pt; padding-top: 6pt; vertical-align: top;"> |
Revision as of 12:49, 29 July 2010
Cross-Validation Tool
You use the Cross-Validation tool to:
|
|
For a given set of data, cross-validation involves a series of steps, called subvalidation steps, in which you remove a subset of objects from the set of data (the test set), build a model using the remaining objects (the model building set), and then apply the resulting model to the removed objects. You note how the prediction errors accumulate as samples are left out, and use this to determine the number of principal components/latent variables/factors to retain in the model. Cross-validation typically involves more than one subvalidation step, each of which selects a different subset of samples for model building and model testing. In Solo, five different cross-validation methods are available; they differ in how the sample subsets are selected for these subvalidation steps.
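The subvalidation loop described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not Solo's implementation: the names `cross_validate`, `fit_knn`, and `predict_knn` are hypothetical, and a nearest-neighbour average stands in for a PCA/PLS model, with the number of neighbours `k` playing the role of the number of latent variables.

```python
def cross_validate(x, y, test_sets, fit, predict, max_components):
    """Accumulate squared prediction error over all subvalidation steps.

    Returns a list whose entry k-1 is the summed squared error obtained
    with k components (hypothetical helper, not part of Solo).
    """
    n = len(y)
    cumulative_error = [0.0] * max_components
    for test in test_sets:                      # one subvalidation step per test set
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        x_train = [x[i] for i in train]
        y_train = [y[i] for i in train]
        for k in range(1, max_components + 1):
            model = fit(x_train, y_train, k)    # build on the model building set
            for i in test:                      # apply to the removed objects
                residual = y[i] - predict(model, x[i])
                cumulative_error[k - 1] += residual * residual
    return cumulative_error


# Stand-in model (assumption): average the y values of the k nearest x values.
def fit_knn(x_train, y_train, k):
    return (list(x_train), list(y_train), k)


def predict_knn(model, x_new):
    xs, ys, k = model
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x_new))[:k]
    return sum(ys[i] for i in nearest) / len(nearest)


# Toy data: ten objects on a straight line, evaluated with leave-one-out.
x = list(range(10))
y = [2 * xi for xi in x]
loo_test_sets = [[i] for i in range(len(x))]    # 0-based indices here

cumulative_error = cross_validate(x, y, loo_test_sets, fit_knn, predict_knn, 3)
best = min(range(len(cumulative_error)), key=cumulative_error.__getitem__) + 1
print(cumulative_error, "-> retain", best, "component(s)")
```

The complexity with the lowest accumulated error is the one you would retain. In practice you would replace the stand-in `fit` and `predict` functions with the model type you are actually validating.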
1. | To open the Cross-Validation tool, do one of the following: |
|
|
Note: You must load data into the Analysis window before the Cross-Validation icon is available.
- Cross-validation icon in the Analysis window
|
2. | In the Cross-Validation dialog box, select the method of cross-validation that you want to use. |
- Cross-Validation dialog box
3. | Use the slider bars to change the default values for the available parameters. |
Note: Not all parameters are relevant for all cross-validation methods. The initial values that are specified for the available parameters are default values that are based on the dimensionality of the data. You can click Reset at any time to reset the parameters to their default settings. For the following descriptions:
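* | n is the total number of objects in the set of data. |
* | s is the number of data splits. |
* | r is the number of iterations. |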
- Cross-validation methods compared
Cross-validation method | Leave One Out | Venetian Blinds | Contiguous Block | Random Subsets | Custom |
Description | The default value. Each object in turn is left out as a test set of one, so that all samples in the set of data are used to build the model. | Each test set is determined by selecting every s-th object in the set of data, starting at objects numbered 1 through s. | An alternative to Venetian Blinds. Each test set is determined by selecting contiguous blocks of n/s objects in the set of data, starting at object number 1. | "s" different test sets are determined through random selection of n/s objects in the set of data, such that no single object is in more than one test set. This procedure is repeated "r" times, where "r" is the number of iterations. | You manually define each of the test sets. You can assign specific objects in your set of data in one of three ways: to be in every test set; to never be in a test set; to not be used in the cross-validation procedure at all. |
Available Parameters | Maximum Number of LVs | Maximum Number of LVs; Number of Data Splits | Maximum Number of LVs; Number of Data Splits | Maximum Number of LVs; Number of Data Splits; Number of Iterations | Number of data splits; Object membership for each split; Total number of objects |
# of Subvalidation Steps | n | s | s | (s*r) | s |
# of Test Samples per Subvalidation | 1 | n/s | n/s | n/s | Varies. User-defined. |
4. | Do one of the following: |
|
|
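The test-set selection rules in the methods comparison table can be written out as short Python functions. This is a minimal sketch under stated assumptions, not Solo's code: the function names are hypothetical, objects are numbered from 1 as in the table, and the way leftover objects are distributed when n is not a multiple of s is a guess made for illustration.

```python
import random


def leave_one_out(n):
    """n test sets, each holding a single object."""
    return [[i] for i in range(1, n + 1)]


def venetian_blinds(n, s):
    """Test set j holds every s-th object, starting at object j (j = 1..s)."""
    return [list(range(start, n + 1, s)) for start in range(1, s + 1)]


def contiguous_blocks(n, s):
    """s consecutive blocks of about n/s objects, starting at object 1."""
    # Assumption: block boundaries are floor(j * n / s), so later blocks
    # absorb any remainder when n is not divisible by s.
    bounds = [j * n // s for j in range(s + 1)]
    return [list(range(bounds[j] + 1, bounds[j + 1] + 1)) for j in range(s)]


def random_subsets(n, s, r, seed=0):
    """r iterations, each producing s disjoint random test sets (s*r in total)."""
    rng = random.Random(seed)          # fixed seed only to make the demo repeatable
    test_sets = []
    for _ in range(r):
        order = list(range(1, n + 1))
        rng.shuffle(order)
        # Dealing the shuffled objects into s piles keeps the sets disjoint.
        test_sets.extend(sorted(order[j::s]) for j in range(s))
    return test_sets


print(venetian_blinds(10, 3))    # [[1, 4, 7, 10], [2, 5, 8], [3, 6, 9]]
print(contiguous_blocks(9, 3))   # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(len(random_subsets(12, 3, 2)))  # 6 test sets, that is, s*r
```

Counting the returned lists reproduces the "# of Subvalidation Steps" row of the table: n for Leave One Out, s for Venetian Blinds and Contiguous Block, and s*r for Random Subsets.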