Class StorelessCovariance


  • public class StorelessCovariance
    extends Covariance
    Covariance implementation that does not require input data to be stored in memory. The size of the covariance matrix is specified in the constructor. Elements of the matrix are updated incrementally with calls to increment(double[]), and results computed in another instance can be merged in with append(StorelessCovariance).

    This class is based on a paper written by Philippe Pébay: Formulas for Robust, One-Pass Parallel Computation of Covariances and Arbitrary-Order Statistical Moments, 2008, Technical Report SAND2008-6212, Sandia National Laboratories.

    Note: the underlying covariance matrix is symmetric, so only the upper triangular part of the matrix is stored and updated on each increment.
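
    The one-pass idea described above can be illustrated for a single pair of variables. The following is a minimal sketch, not this class's actual code: the class name PairCovarianceSketch and its members are hypothetical, and it uses the standard running-mean update for the co-moment, which is assumed to be in the spirit of the cited report.

```java
/** Hypothetical one-pass accumulator for the covariance of one (x, y) pair. */
final class PairCovarianceSketch {
    private long n;          // number of observations seen so far
    private double meanX;    // running mean of x
    private double meanY;    // running mean of y
    private double coMoment; // running sum of (x - meanX) * (y - meanY)

    /** Folds one observation into the running statistics without storing it. */
    void increment(double x, double y) {
        n++;
        double dx = x - meanX;        // deviation of x from its OLD mean
        meanX += dx / n;
        meanY += (y - meanY) / n;
        coMoment += dx * (y - meanY); // old x deviation times NEW y deviation
    }

    /** Returns the bias-corrected covariance (n - 1 in the denominator). */
    double getResult() {
        if (n < 2) {
            throw new IllegalStateException("at least 2 observations are required");
        }
        return coMoment / (n - 1);
    }
}
```

    Only four numbers are kept per pair, so memory use does not grow with the number of observations.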

    Since:
    3.0
    • Field Detail

      • dimension

        private int dimension
        dimension of the square covariance matrix
    • Constructor Detail

      • StorelessCovariance

        public StorelessCovariance​(int dim)
        Create a bias-corrected covariance matrix with a given dimension.
        Parameters:
        dim - the dimension of the square covariance matrix
      • StorelessCovariance

        public StorelessCovariance​(int dim,
                                   boolean biasCorrected)
        Create a covariance matrix with a given number of rows and columns and the indicated bias correction.
        Parameters:
        dim - the dimension of the covariance matrix
        biasCorrected - if true, the covariance estimate is corrected for bias (n-1 in the denominator); otherwise there is no bias correction (n in the denominator).
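
        To make the two denominators concrete, here is a small sketch that computes the covariance of two stored arrays directly from the definition. It is only an illustration of the biasCorrected flag: the class BiasCorrectionSketch and its covariance method are hypothetical names, and this two-pass form is not how a storeless implementation works.

```java
/** Hypothetical two-pass reference computation showing the effect of bias correction. */
final class BiasCorrectionSketch {
    static double covariance(double[] x, double[] y, boolean biasCorrected) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int k = 0; k < n; k++) {
            meanX += x[k] / n;
            meanY += y[k] / n;
        }
        double sum = 0.0; // sum of products of deviations from the means
        for (int k = 0; k < n; k++) {
            sum += (x[k] - meanX) * (y[k] - meanY);
        }
        // n - 1 gives the bias-corrected estimate, n the uncorrected one
        return sum / (biasCorrected ? n - 1 : n);
    }
}
```

        The two estimates differ by a factor of (n-1)/n, so the difference becomes negligible as the number of observations grows.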
    • Method Detail

      • initializeMatrix

        private void initializeMatrix​(boolean biasCorrected)
        Initialize the internal array of StorelessBivariateCovariance instances that holds the upper triangular part of the covariance matrix.
        Parameters:
        biasCorrected - whether the covariance estimate is corrected for bias
      • indexOf

        private int indexOf​(int i,
                            int j)
        Translates the matrix index (i, j) into the corresponding index in the one-dimensional array used to store the upper triangular part of the symmetric covariance matrix.
        Parameters:
        i - the row index
        j - the column index
        Returns:
        the corresponding index in the matrix array
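
        One common way to pack one triangle of a symmetric matrix into a one-dimensional array is sketched below. This is an assumed layout for illustration only; the class TriangularIndexSketch and the helper packedLength are hypothetical, and the private method documented here may order its elements differently.

```java
/** Hypothetical packed-triangle index mapping for a symmetric matrix. */
final class TriangularIndexSketch {
    /** Maps (i, j) and (j, i) to the same slot of the packed array. */
    static int indexOf(int i, int j) {
        int hi = Math.max(i, j); // larger index selects the group of slots
        int lo = Math.min(i, j); // smaller index is the offset inside the group
        return hi * (hi + 1) / 2 + lo;
    }

    /** Number of slots needed for a dim x dim symmetric matrix. */
    static int packedLength(int dim) {
        return dim * (dim + 1) / 2;
    }
}
```

        With this layout a matrix of dimension 3 needs 6 slots instead of 9, and swapping the two indices always yields the same slot.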
      • getCovariance

        public double getCovariance​(int xIndex,
                                    int yIndex)
                             throws NumberIsTooSmallException
        Get the covariance for an individual element of the covariance matrix.
        Parameters:
        xIndex - row index in the covariance matrix
        yIndex - column index in the covariance matrix
        Returns:
        the covariance of the given element
        Throws:
        NumberIsTooSmallException - if the number of observations in the cell is < 2
      • increment

        public void increment​(double[] data)
                       throws DimensionMismatchException
        Increment the covariance matrix with one row of data.
        Parameters:
        data - array representing one row of data.
        Throws:
        DimensionMismatchException - if the length of data does not match the dimension of the covariance matrix
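
        The row-level update can be sketched as follows: check the row length, then update every cell of one triangle of the matrix. This is a simplified illustration under stated assumptions, not the class's code. The name RowIncrementSketch is hypothetical, it keeps per-column running means instead of per-cell StorelessBivariateCovariance instances, and it throws standard JDK exceptions rather than the library's exception types.

```java
/** Hypothetical storeless covariance matrix that updates a packed triangle per row. */
final class RowIncrementSketch {
    private final int dim;
    private long n;                  // number of rows seen so far
    private final double[] mean;     // running mean of each column
    private final double[] coMoment; // packed triangle of co-moment sums

    RowIncrementSketch(int dim) {
        this.dim = dim;
        this.mean = new double[dim];
        this.coMoment = new double[dim * (dim + 1) / 2];
    }

    /** Folds one row of data into the matrix; the row itself is not retained. */
    void increment(double[] data) {
        if (data.length != dim) {
            throw new IllegalArgumentException(
                "expected " + dim + " values but got " + data.length);
        }
        n++;
        double[] oldDev = new double[dim];
        for (int i = 0; i < dim; i++) {
            oldDev[i] = data[i] - mean[i]; // deviation from the OLD mean
            mean[i] += oldDev[i] / n;
        }
        for (int j = 0; j < dim; j++) {
            double newDev = data[j] - mean[j]; // deviation from the NEW mean
            for (int i = 0; i <= j; i++) {
                coMoment[j * (j + 1) / 2 + i] += oldDev[i] * newDev;
            }
        }
    }

    /** Expands the packed triangle into a full, symmetric, bias-corrected matrix. */
    double[][] getData() {
        if (n < 2) {
            throw new IllegalStateException("at least 2 rows are required");
        }
        double[][] out = new double[dim][dim];
        for (int j = 0; j < dim; j++) {
            for (int i = 0; i <= j; i++) {
                double cov = coMoment[j * (j + 1) / 2 + i] / (n - 1);
                out[i][j] = cov;
                out[j][i] = cov; // mirror across the diagonal
            }
        }
        return out;
    }
}
```

        Each call touches dim * (dim + 1) / 2 cells, which is the saving that storing only one triangle provides.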
      • append

        public void append​(StorelessCovariance sc)
                    throws DimensionMismatchException
        Appends sc to this, effectively aggregating the computations in sc with this. After invoking this method, covariances returned should be close to what would have been obtained by performing all of the increment(double[]) operations in sc directly on this.
        Parameters:
        sc - externally computed StorelessCovariance to add to this
        Throws:
        DimensionMismatchException - if the dimension of sc does not match this
        Since:
        3.3
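
        Aggregating two independently computed results can be illustrated for a single pair of variables. The sketch below uses the well-known pairwise combination formula for co-moments; the class MergeablePairSketch is hypothetical and is not claimed to match how append is implemented in this class.

```java
/** Hypothetical pair accumulator whose partial results can be merged. */
final class MergeablePairSketch {
    private long n;
    private double meanX;
    private double meanY;
    private double coMoment; // sum of (x - meanX) * (y - meanY)

    void increment(double x, double y) {
        n++;
        double dx = x - meanX;
        meanX += dx / n;
        meanY += (y - meanY) / n;
        coMoment += dx * (y - meanY);
    }

    /** Merges the observations summarized by other into this accumulator. */
    void append(MergeablePairSketch other) {
        if (other.n == 0) {
            return; // nothing to merge
        }
        long total = n + other.n;
        double dx = other.meanX - meanX; // gap between the two partial means
        double dy = other.meanY - meanY;
        // Combined co-moment: both parts plus a correction for the mean gap
        coMoment += other.coMoment + dx * dy * ((double) n * other.n / total);
        meanX += dx * other.n / total;
        meanY += dy * other.n / total;
        n = total;
    }

    /** Returns the bias-corrected covariance. */
    double getResult() {
        if (n < 2) {
            throw new IllegalStateException("at least 2 observations are required");
        }
        return coMoment / (n - 1);
    }
}
```

        Because the merge uses only the summary values, partial results can be computed on separate threads or machines and combined afterwards; the result may differ from a single sequential pass by floating-point rounding.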
      • getData

        public double[][] getData()
                           throws NumberIsTooSmallException
        Return the covariance matrix as a two-dimensional array.
        Returns:
        a two-dimensional double array of covariance values
        Throws:
        NumberIsTooSmallException - if the number of observations for a cell is < 2