Unhist

From Eigenvector Research Documentation Wiki

Purpose

Create a vector whose values follow an empirical distribution.

Synopsis

d = unhist(x,y,n);

Description

Given the x and y values of an empirical distribution (histogram), where y is the number of times each value of x is observed, UNHIST returns a vector of length as close to n as possible whose values follow the provided distribution as closely as possible.

UNHIST is useful for deriving statistical information on the distribution of x values in a given x,y relationship (for example, the output can be passed to SUMMARY to get information on the empirical distribution in x), or when a set of values that follows a certain empirical distribution is needed.
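As an illustration of deriving statistics from an x,y relationship, the sketch below expands a histogram with UNHIST and computes a few statistics from the result. It assumes PLS_Toolbox is on the MATLAB path, and it uses the base MATLAB functions mean, median, and std in place of SUMMARY. The variable names (mu_d, mu_xy, and so on) are illustrative only.

```matlab
% Assumes PLS_Toolbox (which provides unhist) is on the MATLAB path.
x = 1:15;                                  % bin centers
y = [1 3 4 2 0 0 1 5 8 9 7 4 2 1 0];       % frequency of each bin

d = unhist(x, y, 1000);                    % expand the histogram into samples

% Statistics computed from the expanded samples
mu_d  = mean(d);
med_d = median(d);
sd_d  = std(d);

% Cross-check: frequency-weighted mean computed directly from x and y
mu_xy = sum(x .* y) / sum(y);

fprintf('mean from d: %.3f, weighted mean from x,y: %.3f\n', mu_d, mu_xy);
fprintf('median: %.3f, std: %.3f\n', med_d, sd_d);
```

The two means should agree closely, with any small difference coming from the rounding that UNHIST applies when building the output vector.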

The output, d, from:

   d = unhist(x,y,n);

will be a vector close to length n such that the command:

   [hy, hx] = hist(d,x);

would return an hy that approximates y up to a scale factor.

The values in y are rescaled so that they total approximately n, and negative values in y are ignored. Note that the output vector may differ from length n because the scaled y values are rounded.
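The scaling and rounding described above can be sketched in base MATLAB. This is a hypothetical illustration, not the PLS_Toolbox implementation of UNHIST. It assumes that negative entries of y are set to zero, that y is rescaled to sum to n and rounded to whole counts, and that each x value is then repeated that many times with repelem (available in recent MATLAB releases). The variable name counts is introduced here for illustration.

```matlab
% Hypothetical sketch of the idea behind UNHIST; NOT the PLS_Toolbox source.
x = 1:15;                                  % bin centers
y = [1 3 4 2 0 0 1 5 8 9 7 4 2 1 0];       % frequency of each bin
n = 100;                                   % target output length

y(y < 0) = 0;                              % assumption: negative frequencies are treated as zero
counts = round(y ./ sum(y) * n);           % rescale y to total about n, then round to integers
d = repelem(x, counts);                    % repeat each x(i) exactly counts(i) times

fprintf('requested length %d, actual length %d\n', n, numel(d));
```

Because each scaled count is rounded independently, the counts need not sum to exactly n, which is why the output length can differ from the requested length.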


Inputs

  • x = vector of bin centers.
  • y = vector of the frequency of occurrence of each bin in x.
  • n = (optional) target length for the output vector. This also defines the resolution over which y is divided. Larger n gives finer resolution of y (such that the hy output from [hy,hx]=hist(d,x) will be a closer approximation of y). Default value = 1000. Actual output length may vary because of rounding of the scaled y values.
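To see how n affects resolution, the sketch below calls UNHIST with several target lengths and compares the bin proportions of the result with those of the original y. It assumes PLS_Toolbox is on the MATLAB path. The choice of n values and the error measure (largest absolute difference in bin proportion) are illustrative assumptions, not part of UNHIST.

```matlab
% Assumes PLS_Toolbox (which provides unhist) is on the MATLAB path.
x = 1:15;                                  % bin centers
y = [1 3 4 2 0 0 1 5 8 9 7 4 2 1 0];       % frequency of each bin
p = y ./ sum(y);                           % target proportion in each bin

for n = [50 500 5000]
    d  = unhist(x, y, n);                  % expand the histogram to about n samples
    hy = hist(d, x);                       % count samples around each bin center
    hy = hy(:).';                          % force a row vector to match p
    err = max(abs(hy ./ sum(hy) - p));     % largest difference in bin proportion
    fprintf('n = %5d: length(d) = %5d, max proportion error = %.4f\n', ...
            n, numel(d), err);
end
```

Normalizing both hy and y to proportions removes the scale difference, so the remaining error reflects only the rounding of the scaled y values.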

Outputs

  • d = vector close to length n which contains each x_i value a number of times proportional to the corresponding y_i.


Example

Create example x and y, then use UNHIST to create a larger (n=100) sampling which follows the same distribution:

   x = 1:15;
   y = [1 3 4 2 0 0 1 5 8 9 7 4 2 1 0];
   d = unhist(x,y,100);

View a histogram of the new larger set:

   hist(d,x);

Retrieve the bin counts (hy) and bin centers (hx) of the larger sample set:

   [hy, hx] = hist(d,x);

and a summary of the distribution:

   summary(d');

See Also

plotqq, summary