Analyzeparticles

From Eigenvector Research Documentation Wiki
Revision as of 16:57, 7 February 2011 by imported>Donal
Jump to navigation Jump to search

Purpose

ANALYZEPARTCLES Identify particles (blobs, connected regions), and their properties, in an image dataset.

Synopsis

[imgdso model] = analyzeparticles(imagedso, options);
[imgdso model] = analyzeparticles(imagedso);
[imgdso model] = analyzeparticles(imagedso, model);

Description

The particle analysis functionality is used to automatically identify particle-like areas in an image and to return information about the identified particles’ characteristics such as their area, shape and pixel values. A particle is considered to be an isolated contiguous region of pixels within the image which have similar intensity values or color values. Particles are also known as “connected regions” or “blobs”.

Our image analysis software can analyze Particles in images using either the “analyzeparticles” Matlab function or the “Particle Analysis” GUI, which is a graphical interface to that function. The analyzeparticles function itself is implemented using the ImageJ image analysis package (http://rsb.info.nih.gov/ij/) which is included with MIA. Analyzeparticles integrates the ImageJ “Analyze Particles” feature into MIA so it can be conveniently used with the Eigenvector dataset object and the other MIA/PLS_Toolbox tools.

Inputs

  • x = X-block (predictor block) class "double" or "dataset", containing numeric values,
  • y = Y-block (predicted block) class "double" or "dataset", containing numeric values,
  • model = previously generated model (when applying model to new data).

Outputs

  • model = a standard model structure model with the following fields (see MODELSTRUCT):
    • modeltype: 'SVM',
    • datasource: structure array with information about input data,
    • date: date of creation,
    • time: time of creation,
    • info: additional model information,
    • pred: 2 element cell array with
      • model predictions for each input block (when options.blockdetail='normal' x-block predictions are not saved and this will be an empty array)
    • detail: sub-structure with additional model details and results, including:
      • model.detail.svm.model: Matlab version of the libsvm svm_model (Java)
      • model.detail.svm.cvscan: results of CV parameter scan
      • model.detail.svm.outlier: results of outlier detection (one-class svm)
  • pred a structure, similar to model for the new data.

Options

options = a structure array with the following fields:

  • display: [ 'off' | {'on'} ], governs level of display to command window,
  • plots [ 'none' | {'final'} ], governs level of plotting,
  • preprocessing: {[]} preprocessing structures for x block (see PREPROCESS). NOTE that y-block preprocessing is NOT used with SVMs. Any y-preprocessing will be ignored.
  • compression: [{'none'}| 'pca' | 'pls' ] type of data compression to perform on the x-block prior to calculaing or applying the SVM model. 'pca' uses a simple PCA model to compress the information. 'pls' uses either a pls or plsda model (depending on the svmtype). Compression can make the SVM more stable and less prone to overfitting.
  • blockdetails: [ {'standard'} | 'all' ], extent of predictions and residuals included in model, 'standard' = only y-block, 'all' x- and y-blocks.
  • algorithm: [ 'libsvm' ] algorithm to use. libsvm is default and currently only option.
  • kerneltype: [ 'linear' | {'rbf'} ], SVM kernel to use. 'rbf' is default.
  • svmtype: [ {'epsilon-svr'} | 'nu-svr' ] Type of SVM to apply. The default is 'epsilon-svr' for regression.
  • probabilityestimates: [0| {1} ], whether to train the SVR model for probability estimates, 0 or 1 (default 1)"
  • cvtimelimit: Set a time limit (seconds) on individual cross-validation sub-calculation when searching over supplied SVM parameter ranges for optimal parameters. Only relevant if parameter ranges are used for SVM parameters such as cost, epsilon, gamma or nu. Default is 10;
  • splits: Number of subsets to divide data into when applying n-fold cross validation. Default is 5.
  • gamma: Value(s) to use for LIBSVM kernel gamma parameter. Default is 15 values from 10^-6 to 10, spaced uniformly in log.
  • cost: Value(s) to use for LIBSVM 'c' parameter. Default is 11 values from 10^-3 to 100, spaced uniformly in log.
  • epsilon: Value(s) to use for LIBSVM 'p' parameter (epsilon in loss function). Default is the set of values [1.0, 0.1, 0.01].
  • nu: Value(s) to use for LIBSVM 'n' parameter (nu of nu-SVC, and nu-SVR). Default is the set of values [0.2, 0.5, 0.8].
  • outliernu: Value to use for nu in LIBSVM's one-class svm outlier detection. (0.05).

Algorithm

There are two stages to particle analysis of an image dataset object (DSO). The first stage is to obtain a binary image where image pixel values are either 0 or 1 where one value represents non-particle pixels and the other represents potential particle pixels. This is usually accomplished by specifying a threshold level where pixel having values below or above the threshold value are assigned value 0 or 1. If the image DSO has multiple slabs then one slab must be selected to determine the binary image or else the average of all the slabs can be used. A pixel assigned value 1 is not automatically part of a particle because other particle criteria can be specified such as a minimum particle area requirement, or other shape restriction. Once these filters are applied there may remain some particle regions.

The second stage in particle analysis is to calculate the properties of each particle region. Properties include area, perimeter, centroid coordinates, shape properties (circularity, aspect ratio, roundness, and solidity) and Feret’s diameters (Feret diameter, FeretX, FeretY, FeretAngle and MinFeret). There are other particle properties which depend on the particle’s pixel values including mean, median, minimum, maximum, and standard deviation. These are calculated for each slab for each particle.

epsilon-SVR and nu-SVR

There are two commonly used versions of SVM regression, 'epsilon-SVR' and 'nu-SVR'. The original SVM formulations for Classification (SVC) and Regression (SVR) used parameters C [0, inf) and epsilon[0, inf) to apply a penalty to the optimization for points which were not correctly separated by the classifying hyperplane or for prediction errors greater than epsilon. Alternative versions of both SVM classification and regression were later developed where these penalty parameters were replaced by an alternative parameter, nu [0,1], which applies a slightly different penalty. The main motivation for the nu versions of SVM is that it has a has a more meaningful interpretation. This is because nu represents an upper bound on the fraction of training samples which are errors (misclassified, or poorly predicted) and a lower bound on the fraction of samples which are support vectors. Some users feel nu is more intuitive to use than C or epsilon. C/epsilon or nu are just different versions of the penalty parameter. The same optimization problem is solved in either case. Thus it should not matter which form of SVM you use, C versus nu for classification or epsilon versus nu for regression. PLS_Toolbox uses the C and epsilon versions since these were the original formulations and are the most commonly used forms. For more details on 'nu' SVMs see [1]

See Also

analysis, svmda