Ann

From Eigenvector Research Documentation Wiki

Revision as of 16:24, 11 January 2014

Purpose

Predictions based on Artificial Neural Network (ANN) regression models.

Synopsis

[model] = ann(x,y,options);
[model] = ann(x,y,nhid,options);
[pred] = ann(x,model,options);
[valid] = ann(x,y,model,options);

Description

Build an ANN model from input X- and Y-block data using the specified number of hidden layers and nodes per layer. Alternatively, if a model is passed in, ANN makes a Y prediction for an input test X-block. The ANN model contains quantities (weights, etc.) calculated from the calibration data, so when a model structure is passed in to ANN these weights do not need to be recalculated.

There are two implementations of ANN available, referred to as 'BPN' and 'Encog'.

BPN is a feedforward ANN using backpropagation training and is implemented in Matlab.
Encog is a feedforward ANN using Resilient Backpropagation training. See Rprop for further details.

Encog is implemented using the Encog framework provided by Heaton Research, Inc., under the Apache 2.0 license. Further details of Encog Neural Network features are available in the Encog Documentation. BPN is the ANN version used by default, but the user can set the option 'algorithm' = 'encog' to use Encog instead. Both implementations should give similar results, but one may be faster than the other depending on the dataset. BPN is currently the only version which calculates RMSECV.

Inputs

  • x = X-block (predictor block) class "double" or "dataset", containing numeric values,
  • y = Y-block (predicted block) class "double" or "dataset", containing numeric values,
  • nhid = number of nodes in a single hidden layer ANN, or a vector of two numbers giving the number of nodes in each layer of a two hidden layer ANN (this takes precedence over options nhid1 and nhid2),
  • model = previously generated model (when applying model to new data).
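
The calling forms listed in the Synopsis can be sketched as follows. This is a minimal sketch that assumes PLS_Toolbox is on the MATLAB path. The data are synthetic, the variable names (xcal, ycal, xtest) are illustrative, and retrieving default options with ann('options') is an assumption based on the usual PLS_Toolbox convention rather than something stated on this page.

```matlab
% Synthetic calibration and test data (illustrative only).
rng(0);
xcal  = rand(50,5);
ycal  = sum(xcal,2) + 0.05*randn(50,1);
xtest = rand(10,5);

% Assumption: default options are returned by ann('options').
options = ann('options');
options.display = 'off';
options.plots   = 'none';

% Calibrate with a single hidden layer of 3 nodes.
% nhid takes precedence over options.nhid1 and options.nhid2.
model = ann(xcal,ycal,3,options);

% Calibrate with two hidden layers of 3 and 2 nodes.
model2 = ann(xcal,ycal,[3 2],options);

% Apply the first model to new data.
pred = ann(xtest,model,options);
```

Passing nhid explicitly keeps the network size visible at the call site, while the same options structure can be reused across calls.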

Outputs

  • model = a standard model structure with the following fields (see Standard Model Structure):
    • modeltype: 'ANN',
    • datasource: structure array with information about input data,
    • date: date of creation,
    • time: time of creation,
    • info: additional model information,
    • pred: 2 element cell array with
      • model predictions for each input block (when options.blockdetails='standard', x-block predictions are not saved and this will be an empty array)
    • detail: sub-structure with additional model details and results, including:
      • model.detail.ann.W: Structure containing details of the ANN, including the ANN type, number of hidden layers and the weights.
  • pred = a structure, similar to model, for the new data.

Training Termination

The ANN is trained on the calibration dataset to minimize prediction error, RMSEC. It is important not to overtrain on the calibration dataset, however, so some criteria for ending training are needed.

BPN determines the optimal number of learning iteration cycles by selecting the minimum RMSEP for a test subset over a range of learning iterations.
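
The selection rule described for BPN can be illustrated with a small, self-contained sketch. This is not the toolbox's BPN code: the tiny one-hidden-layer network, the synthetic data, and the choice of every third sample as the test subset are all assumptions made for illustration. Only the final step, choosing the learning cycle with the minimum test-subset RMSEP, corresponds to the rule described above.

```matlab
rng(1);
n = 60;
x = linspace(-1,1,n)';
y = sin(3*x) + 0.1*randn(n,1);

% Assumption: hold out every third sample as the test subset.
itest  = (3:3:n)';
itrain = setdiff((1:n)',itest);
Xtr = [x(itrain) ones(numel(itrain),1)];  ytr = y(itrain);
Xte = [x(itest)  ones(numel(itest),1)];   yte = y(itest);

nhid = 2; learnrate = 0.125; learncycles = 20;
W1 = 0.5*randn(2,nhid);        % input (plus bias) to hidden weights
W2 = 0.5*randn(nhid+1,1);      % hidden (plus bias) to output weights
ntr = numel(ytr);

rmsep = zeros(learncycles,1);
for k = 1:learncycles
  % Forward pass on the training subset.
  H = [tanh(Xtr*W1) ones(ntr,1)];
  e = H*W2 - ytr;
  % Backpropagated gradients of half the mean squared error.
  gW2 = H'*e/ntr;
  dh  = (e*W2(1:nhid)').*(1 - H(:,1:nhid).^2);
  gW1 = Xtr'*dh/ntr;
  W1  = W1 - learnrate*gW1;
  W2  = W2 - learnrate*gW2;
  % RMSEP on the held-out test subset after this learning cycle.
  yp = [tanh(Xte*W1) ones(numel(yte),1)]*W2;
  rmsep(k) = sqrt(mean((yp - yte).^2));
end

% Select the learning cycle with the minimum test-subset RMSEP.
[minrmsep,bestcycle] = min(rmsep);
fprintf('Best cycle: %d of %d (RMSEP = %.4f)\n',bestcycle,learncycles,minrmsep);
```

Tracking RMSEP on data not used for the weight updates is what guards against overtraining: RMSEC can keep falling while the test-subset error starts to rise.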

Encog training will terminate whenever either a) RMSE becomes smaller than option 'terminalrmse', or b) the rate of improvement of RMSE per 100 iterations becomes less than option 'terminalrmserate', or c) time exceeds option 'maxseconds' (though results are not optimal if training is stopped prematurely by this time limit). Note that these RMSE values refer to the internally preprocessed and scaled y values.
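
The three stopping conditions can be sketched as a simple loop. This is an illustration of the logic only, not Encog's implementation: rmsefun is a hypothetical stand-in for one training iteration, and measuring the rate of improvement as the RMSE change between checkpoints 100 iterations apart is an assumed interpretation of criterion b).

```matlab
terminalrmse     = 0.05;   % criterion a): target RMSE (scaled y)
terminalrmserate = 1e-9;   % criterion b): minimum improvement per 100 iterations
maxseconds       = 20;     % criterion c): time limit

% Hypothetical stand-in for training: RMSE decays smoothly with iteration.
rmsefun = @(iter) 0.5*exp(-0.01*iter);

tstart  = tic;
iter    = 0;
rmseold = rmsefun(0);      % RMSE at the last 100-iteration checkpoint
reason  = '';
while isempty(reason)
  iter = iter + 1;
  rmse = rmsefun(iter);
  if rmse < terminalrmse
    reason = 'terminalrmse';
  elseif mod(iter,100) == 0
    % Compare against the RMSE recorded 100 iterations earlier.
    if (rmseold - rmse) < terminalrmserate
      reason = 'terminalrmserate';
    end
    rmseold = rmse;
  end
  if isempty(reason) && toc(tstart) > maxseconds
    reason = 'maxseconds';
  end
end
fprintf('Stopped after %d iterations (%s), RMSE = %.4f\n',iter,reason,rmse);
```

With this stand-in RMSE curve the loop stops on criterion a). A curve that levels off above terminalrmse would instead stop on criterion b) or, failing that, on the time limit.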

Options

options = a structure array with the following fields:

  • display : [ 'off' |{'on'}] governs level of display to the command window,
  • plots: [ {'none'} | 'final' ] governs plotting of results,
  • waitbar : [ 'off' |{'auto'}| 'on' ] governs use of waitbar during analysis. 'auto' shows waitbar if delay will likely be longer than a reasonable waiting period.
  • nhid1 : [{2}] Number of nodes in first hidden layer.
  • nhid2 : [{0}] Number of nodes in second hidden layer.
  • learnrate : [0.125] ANN backpropagation learning rate (bpn only).
  • learncycles : [20] Number of ANN learning iterations (bpn only).
  • terminalrmse : [0.05] Termination RMSE value (of scaled y) for ANN iterations (encog only).
  • terminalrmserate : [1e-9] Termination rate of change of RMSE per 100 iterations (encog only).
  • maxseconds : [{20}] Maximum duration of ANN training in seconds (encog only).
  • preprocessing: {[] []} preprocessing structures for x and y blocks (see PREPROCESS).
  • compression: [{'none'}| 'pca' | 'pls' ] type of data compression to perform on the x-block prior to calculating or applying the ANN model. 'pca' uses a simple PCA model to compress the information. 'pls' uses a PLS model. Compression can make the ANN more stable and less prone to overfitting.
  • compressncomp: [1] Number of latent variables (or principal components) to include in the compression model.
  • blockdetails: [ {'standard'} | 'all' ] extent of predictions and residuals included in model, 'standard' = only y-block, 'all' = x- and y-blocks.
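
As a configuration sketch, the fields above might be set as follows before calling ann. The field names come from this page; the values are arbitrary examples, and obtaining the default structure with ann('options') is an assumption based on the usual PLS_Toolbox convention.

```matlab
% Assumption: default options are returned by ann('options').
options = ann('options');

options.algorithm     = 'encog';  % use Encog instead of the default BPN
options.nhid1         = 4;        % nodes in first hidden layer
options.nhid2         = 0;        % no second hidden layer
options.terminalrmse  = 0.01;     % Encog only: target RMSE of scaled y
options.maxseconds    = 60;       % Encog only: training time limit
options.compression   = 'pca';    % compress the x-block before the ANN
options.compressncomp = 3;        % number of components in the compression model
options.blockdetails  = 'all';    % keep x- and y-block predictions
```

Options marked "bpn only" or "encog only" are presumably ignored by the other algorithm, so a single options structure can hold settings for both.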

See Also

modelselector