Class MultivariateSummaryStatistics
- java.lang.Object
  - org.apache.commons.math3.stat.descriptive.MultivariateSummaryStatistics
- All Implemented Interfaces:
- java.io.Serializable, StatisticalMultivariateSummary
- Direct Known Subclasses:
- SynchronizedMultivariateSummaryStatistics
public class MultivariateSummaryStatistics extends java.lang.Object implements StatisticalMultivariateSummary, java.io.Serializable
Computes summary statistics for a stream of n-tuples added using the addValue method. The data values are not stored in memory, so this class can be used to compute statistics for very large n-tuple streams.

The StorelessUnivariateStatistic instances used to maintain summary state and compute statistics are configurable via setters. For example, the default implementation for the mean can be overridden by calling setMeanImpl(StorelessUnivariateStatistic[]). Actual parameters to these methods must implement the StorelessUnivariateStatistic interface, and configuration must be completed before addValue is called. No configuration is necessary to use the default, commons-math provided implementations.

To compute statistics for a stream of n-tuples, construct a MultivariateSummaryStatistics instance with dimension n and then use addValue(double[]) to add n-tuples. The getXxx methods, where Xxx is a statistic, return an array of double values, where for i = 0,...,n-1 the ith array element is the value of the given statistic for the data range consisting of the ith element of each of the input n-tuples. For example, if addValue is called with actual parameters {0, 1, 2}, then {3, 4, 5} and finally {6, 7, 8}, getSum will return a three-element array with values {0+3+6, 1+4+7, 2+5+8}.

Note: This class is not thread-safe. Use SynchronizedMultivariateSummaryStatistics if concurrent access from multiple threads is required.
- Since:
- 1.2
- See Also:
- Serialized Form
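The component-wise, storeless behaviour described in the class overview can be illustrated with plain Java. The following is a minimal stand-alone sketch, not the library's implementation: the class name ComponentSums is hypothetical, and it throws IllegalArgumentException where the library documents DimensionMismatchException. It reproduces the {0, 1, 2}, {3, 4, 5}, {6, 7, 8} example from the overview.

```java
import java.util.Arrays;

/** Hypothetical stand-in (not a commons-math class) for storeless, component-wise accumulation. */
final class ComponentSums {
    private final double[] sums; // one running total per component
    private long n;              // number of tuples added so far

    ComponentSums(int k) {
        sums = new double[k];
    }

    /** Folds one tuple into the running totals; the tuple itself is not retained. */
    void addValue(double[] value) {
        if (value.length != sums.length) {
            // The library documents DimensionMismatchException for this case.
            throw new IllegalArgumentException(
                "expected " + sums.length + " components, got " + value.length);
        }
        for (int i = 0; i < sums.length; i++) {
            sums[i] += value[i];
        }
        n++;
    }

    /** Returns a copy so callers cannot modify the internal state. */
    double[] getSum() {
        return sums.clone();
    }

    long getN() {
        return n;
    }

    public static void main(String[] args) {
        ComponentSums stats = new ComponentSums(3);
        stats.addValue(new double[] {0, 1, 2});
        stats.addValue(new double[] {3, 4, 5});
        stats.addValue(new double[] {6, 7, 8});
        System.out.println(Arrays.toString(stats.getSum())); // [9.0, 12.0, 15.0]
    }
}
```

Memory use depends only on the dimension, not on the number of tuples added, which is what makes this approach suitable for very large streams.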
-
-
Field Summary
| Modifier and Type | Field | Description |
| --- | --- | --- |
| private VectorialCovariance | covarianceImpl | Covariance statistic implementation - cannot be reset. |
| private StorelessUnivariateStatistic[] | geoMeanImpl | Geometric mean statistic implementation - can be reset by setter. |
| private int | k | Dimension of the data. |
| private StorelessUnivariateStatistic[] | maxImpl | Maximum statistic implementation - can be reset by setter. |
| private StorelessUnivariateStatistic[] | meanImpl | Mean statistic implementation - can be reset by setter. |
| private StorelessUnivariateStatistic[] | minImpl | Minimum statistic implementation - can be reset by setter. |
| private long | n | Count of values that have been added. |
| private static long | serialVersionUID | Serialization UID. |
| private StorelessUnivariateStatistic[] | sumImpl | Sum statistic implementation - can be reset by setter. |
| private StorelessUnivariateStatistic[] | sumLogImpl | Sum of log statistic implementation - can be reset by setter. |
| private StorelessUnivariateStatistic[] | sumSqImpl | Sum of squares statistic implementation - can be reset by setter. |
-
Constructor Summary
| Constructor | Description |
| --- | --- |
| MultivariateSummaryStatistics(int k, boolean isCovarianceBiasCorrected) | Construct a MultivariateSummaryStatistics instance. |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | addValue(double[] value) | Add an n-tuple to the data. |
| private void | append(java.lang.StringBuilder buffer, double[] data, java.lang.String prefix, java.lang.String separator, java.lang.String suffix) | Append a text representation of an array to a buffer. |
| private void | checkDimension(int dimension) | Throws DimensionMismatchException if dimension != k. |
| private void | checkEmpty() | Throws MathIllegalStateException if the statistic is not empty. |
| void | clear() | Resets all statistics and storage. |
| boolean | equals(java.lang.Object object) | Returns true iff object is a MultivariateSummaryStatistics instance and all statistics have the same values as this. |
| RealMatrix | getCovariance() | Returns the covariance matrix of the values that have been added. |
| int | getDimension() | Returns the dimension of the data. |
| StorelessUnivariateStatistic[] | getGeoMeanImpl() | Returns the currently configured geometric mean implementation. |
| double[] | getGeometricMean() | Returns an array whose ith entry is the geometric mean of the ith entries of the arrays that have been added using addValue(double[]). |
| double[] | getMax() | Returns an array whose ith entry is the maximum of the ith entries of the arrays that have been added using addValue(double[]). |
| StorelessUnivariateStatistic[] | getMaxImpl() | Returns the currently configured maximum implementation. |
| double[] | getMean() | Returns an array whose ith entry is the mean of the ith entries of the arrays that have been added using addValue(double[]). |
| StorelessUnivariateStatistic[] | getMeanImpl() | Returns the currently configured mean implementation. |
| double[] | getMin() | Returns an array whose ith entry is the minimum of the ith entries of the arrays that have been added using addValue(double[]). |
| StorelessUnivariateStatistic[] | getMinImpl() | Returns the currently configured minimum implementation. |
| long | getN() | Returns the number of available values. |
| private double[] | getResults(StorelessUnivariateStatistic[] stats) | Returns an array of the results of a statistic. |
| double[] | getStandardDeviation() | Returns an array whose ith entry is the standard deviation of the ith entries of the arrays that have been added using addValue(double[]). |
| double[] | getSum() | Returns an array whose ith entry is the sum of the ith entries of the arrays that have been added using addValue(double[]). |
| StorelessUnivariateStatistic[] | getSumImpl() | Returns the currently configured sum implementation. |
| double[] | getSumLog() | Returns an array whose ith entry is the sum of logs of the ith entries of the arrays that have been added using addValue(double[]). |
| StorelessUnivariateStatistic[] | getSumLogImpl() | Returns the currently configured sum of logs implementation. |
| double[] | getSumSq() | Returns an array whose ith entry is the sum of squares of the ith entries of the arrays that have been added using addValue(double[]). |
| StorelessUnivariateStatistic[] | getSumsqImpl() | Returns the currently configured sum of squares implementation. |
| int | hashCode() | Returns hash code based on values of statistics. |
| void | setGeoMeanImpl(StorelessUnivariateStatistic[] geoMeanImpl) | Sets the implementation for the geometric mean. |
| private void | setImpl(StorelessUnivariateStatistic[] newImpl, StorelessUnivariateStatistic[] oldImpl) | Sets statistics implementations. |
| void | setMaxImpl(StorelessUnivariateStatistic[] maxImpl) | Sets the implementation for the maximum. |
| void | setMeanImpl(StorelessUnivariateStatistic[] meanImpl) | Sets the implementation for the mean. |
| void | setMinImpl(StorelessUnivariateStatistic[] minImpl) | Sets the implementation for the minimum. |
| void | setSumImpl(StorelessUnivariateStatistic[] sumImpl) | Sets the implementation for the sum. |
| void | setSumLogImpl(StorelessUnivariateStatistic[] sumLogImpl) | Sets the implementation for the sum of logs. |
| void | setSumsqImpl(StorelessUnivariateStatistic[] sumsqImpl) | Sets the implementation for the sum of squares. |
| java.lang.String | toString() | Generates a text report displaying summary statistics from values that have been added. |
-
-
-
Field Detail
-
serialVersionUID
private static final long serialVersionUID
Serialization UID
- See Also:
- Constant Field Values
-
k
private int k
Dimension of the data.
-
n
private long n
Count of values that have been added
-
sumImpl
private StorelessUnivariateStatistic[] sumImpl
Sum statistic implementation - can be reset by setter.
-
sumSqImpl
private StorelessUnivariateStatistic[] sumSqImpl
Sum of squares statistic implementation - can be reset by setter.
-
minImpl
private StorelessUnivariateStatistic[] minImpl
Minimum statistic implementation - can be reset by setter.
-
maxImpl
private StorelessUnivariateStatistic[] maxImpl
Maximum statistic implementation - can be reset by setter.
-
sumLogImpl
private StorelessUnivariateStatistic[] sumLogImpl
Sum of log statistic implementation - can be reset by setter.
-
geoMeanImpl
private StorelessUnivariateStatistic[] geoMeanImpl
Geometric mean statistic implementation - can be reset by setter.
-
meanImpl
private StorelessUnivariateStatistic[] meanImpl
Mean statistic implementation - can be reset by setter.
-
covarianceImpl
private VectorialCovariance covarianceImpl
Covariance statistic implementation - cannot be reset.
-
-
Constructor Detail
-
MultivariateSummaryStatistics
public MultivariateSummaryStatistics(int k, boolean isCovarianceBiasCorrected)
Construct a MultivariateSummaryStatistics instance.
- Parameters:
- k - dimension of the data
- isCovarianceBiasCorrected - if true, the unbiased sample covariance is computed, otherwise the biased population covariance is computed
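The difference between the two settings of isCovarianceBiasCorrected comes down to the divisor: n - 1 for the unbiased sample covariance, n for the biased population covariance. The sketch below illustrates that distinction for a pair of components in plain Java. CovarianceDivisor and its covariance method are hypothetical names introduced for illustration; this is not how the library's VectorialCovariance is implemented.

```java
/** Hypothetical helper (not a commons-math class) contrasting the two covariance divisors. */
final class CovarianceDivisor {
    private CovarianceDivisor() { }

    /**
     * Covariance of two equally long samples.
     * biasCorrected = true divides by (n - 1); false divides by n.
     */
    static double covariance(double[] x, double[] y, boolean biasCorrected) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }
        int n = x.length;
        double sumX = 0.0;
        double sumY = 0.0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
        }
        double meanX = sumX / n;
        double meanY = sumY / n;

        // Sum of products of deviations from the component means.
        double sumOfProducts = 0.0;
        for (int i = 0; i < n; i++) {
            sumOfProducts += (x[i] - meanX) * (y[i] - meanY);
        }
        return sumOfProducts / (biasCorrected ? n - 1 : n);
    }

    public static void main(String[] args) {
        // First and second components of the tuples {0, 1, 2}, {3, 4, 5}, {6, 7, 8}.
        double[] x = {0, 3, 6};
        double[] y = {1, 4, 7};
        System.out.println(covariance(x, y, true));  // 9.0 (18 / 2)
        System.out.println(covariance(x, y, false)); // 6.0 (18 / 3)
    }
}
```

The two results converge as n grows, so the choice matters most for short streams.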
-
-
Method Detail
-
addValue
public void addValue(double[] value) throws DimensionMismatchException
Add an n-tuple to the data.
- Parameters:
- value - the n-tuple to add
- Throws:
- DimensionMismatchException - if the length of the array does not match the one used at construction
-
getDimension
public int getDimension()
Returns the dimension of the data.
- Specified by:
- getDimension in interface StatisticalMultivariateSummary
- Returns:
- The dimension of the data
-
getN
public long getN()
Returns the number of available values.
- Specified by:
- getN in interface StatisticalMultivariateSummary
- Returns:
- The number of available values
-
getResults
private double[] getResults(StorelessUnivariateStatistic[] stats)
Returns an array of the results of a statistic.
- Parameters:
- stats - univariate statistic array
- Returns:
- results array
-
getSum
public double[] getSum()
Returns an array whose ith entry is the sum of the ith entries of the arrays that have been added using addValue(double[]).
- Specified by:
- getSum in interface StatisticalMultivariateSummary
- Returns:
- the array of component sums
-
getSumSq
public double[] getSumSq()
Returns an array whose ith entry is the sum of squares of the ith entries of the arrays that have been added using addValue(double[]).
- Specified by:
- getSumSq in interface StatisticalMultivariateSummary
- Returns:
- the array of component sums of squares
-
getSumLog
public double[] getSumLog()
Returns an array whose ith entry is the sum of logs of the ith entries of the arrays that have been added using addValue(double[]).
- Specified by:
- getSumLog in interface StatisticalMultivariateSummary
- Returns:
- the array of component log sums
-
getMean
public double[] getMean()
Returns an array whose ith entry is the mean of the ith entries of the arrays that have been added using addValue(double[]).
- Specified by:
- getMean in interface StatisticalMultivariateSummary
- Returns:
- the array of component means
-
getStandardDeviation
public double[] getStandardDeviation()
Returns an array whose ith entry is the standard deviation of the ith entries of the arrays that have been added using addValue(double[]).
- Specified by:
- getStandardDeviation in interface StatisticalMultivariateSummary
- Returns:
- the array of component standard deviations
-
getCovariance
public RealMatrix getCovariance()
Returns the covariance matrix of the values that have been added.
- Specified by:
- getCovariance in interface StatisticalMultivariateSummary
- Returns:
- the covariance matrix
-
getMax
public double[] getMax()
Returns an array whose ith entry is the maximum of the ith entries of the arrays that have been added using addValue(double[]).
- Specified by:
- getMax in interface StatisticalMultivariateSummary
- Returns:
- the array of component maxima
-
getMin
public double[] getMin()
Returns an array whose ith entry is the minimum of the ith entries of the arrays that have been added using addValue(double[]).
- Specified by:
- getMin in interface StatisticalMultivariateSummary
- Returns:
- the array of component minima
-
getGeometricMean
public double[] getGeometricMean()
Returns an array whose ith entry is the geometric mean of the ith entries of the arrays that have been added using addValue(double[]).
- Specified by:
- getGeometricMean in interface StatisticalMultivariateSummary
- Returns:
- the array of component geometric means
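For strictly positive inputs, the geometric mean of a component equals the exponential of the mean of its logarithms, so a storeless computation needs only a running sum of logs and a count. The sketch below shows that relationship for a single component. GeometricMeanSketch and its methods are hypothetical names, and returning NaN when no values have been added is a choice made for this sketch; the code is not taken from the library.

```java
/** Hypothetical single-component accumulator (not a commons-math class). */
final class GeometricMeanSketch {
    private double sumLog; // running sum of natural logs
    private long n;        // number of values added so far

    /** Adds one value; only its logarithm is accumulated, the value is not stored. */
    void increment(double x) {
        sumLog += Math.log(x);
        n++;
    }

    double getSumLog() {
        return sumLog;
    }

    /** exp(sumLog / n); NaN when no values have been added (a choice made for this sketch). */
    double getGeometricMean() {
        return n == 0 ? Double.NaN : Math.exp(sumLog / n);
    }
}
```

Adding 1, 4 and 16 to such an accumulator yields a geometric mean of approximately 4, the cube root of 1 * 4 * 16 = 64. A multivariate version would keep one accumulator per component.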
-
toString
public java.lang.String toString()
Generates a text report displaying summary statistics from values that have been added.
- Overrides:
- toString in class java.lang.Object
- Returns:
- String with line feeds displaying statistics
-
append
private void append(java.lang.StringBuilder buffer, double[] data, java.lang.String prefix, java.lang.String separator, java.lang.String suffix)
Append a text representation of an array to a buffer.
- Parameters:
- buffer - buffer to fill
- data - data array
- prefix - text prefix
- separator - elements separator
- suffix - text suffix
-
clear
public void clear()
Resets all statistics and storage
-
equals
public boolean equals(java.lang.Object object)
Returns true iff object is a MultivariateSummaryStatistics instance and all statistics have the same values as this.
- Overrides:
- equals in class java.lang.Object
- Parameters:
- object - the object to test equality against.
- Returns:
- true if object equals this
-
hashCode
public int hashCode()
Returns hash code based on values of statistics.
- Overrides:
- hashCode in class java.lang.Object
- Returns:
- hash code
-
setImpl
private void setImpl(StorelessUnivariateStatistic[] newImpl, StorelessUnivariateStatistic[] oldImpl) throws MathIllegalStateException, DimensionMismatchException
Sets statistics implementations.
- Parameters:
- newImpl - new implementations for statistics
- oldImpl - old implementations for statistics
- Throws:
- DimensionMismatchException - if the array dimension does not match the one used at construction
- MathIllegalStateException - if data has already been added (i.e. if n > 0)
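The two preconditions documented here (no data added yet, and an array whose length matches the dimension) can be sketched in plain Java as follows. Every name in the block (GuardedStats, Stat, Sum) is hypothetical, and the standard IllegalStateException and IllegalArgumentException stand in for the library's MathIllegalStateException and DimensionMismatchException. It is an illustration of the rule, not the library's code.

```java
/** Hypothetical sketch (not a commons-math class) of the configure-before-add rule. */
final class GuardedStats {
    /** Stand-in for a per-component storeless statistic. */
    interface Stat {
        void increment(double x);
        double getResult();
    }

    /** Default per-component implementation: a running sum. */
    static final class Sum implements Stat {
        private double total;
        @Override public void increment(double x) { total += x; }
        @Override public double getResult() { return total; }
    }

    private final int k;    // dimension of the data
    private long n;         // number of tuples added so far
    private Stat[] sumImpl; // one statistic per component

    GuardedStats(int k) {
        this.k = k;
        sumImpl = new Stat[k];
        for (int i = 0; i < k; i++) {
            sumImpl[i] = new Sum();
        }
    }

    /** Replaces the per-component implementations; only allowed before any data is added. */
    void setSumImpl(Stat[] newImpl) {
        checkEmpty();
        checkDimension(newImpl.length);
        sumImpl = newImpl.clone();
    }

    void addValue(double[] value) {
        checkDimension(value.length);
        for (int i = 0; i < k; i++) {
            sumImpl[i].increment(value[i]);
        }
        n++;
    }

    private void checkEmpty() {
        if (n > 0) {
            throw new IllegalStateException(n + " values have already been added");
        }
    }

    private void checkDimension(int dimension) {
        if (dimension != k) {
            throw new IllegalArgumentException("dimension " + dimension + " != " + k);
        }
    }
}
```

Rejecting reconfiguration once n > 0 prevents results that mix state accumulated by two different implementations.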
-
getSumImpl
public StorelessUnivariateStatistic[] getSumImpl()
Returns the currently configured sum implementation.
- Returns:
- the StorelessUnivariateStatistic implementing the sum
-
setSumImpl
public void setSumImpl(StorelessUnivariateStatistic[] sumImpl) throws MathIllegalStateException, DimensionMismatchException
Sets the implementation for the sum.
This method must be called before any data has been added, i.e., before addValue has been used to add data; otherwise a MathIllegalStateException will be thrown.
- Parameters:
- sumImpl - the StorelessUnivariateStatistic instances to use for computing the sum
- Throws:
- DimensionMismatchException - if the array dimension does not match the one used at construction
- MathIllegalStateException - if data has already been added (i.e., if n > 0)
-
getSumsqImpl
public StorelessUnivariateStatistic[] getSumsqImpl()
Returns the currently configured sum of squares implementation.
- Returns:
- the StorelessUnivariateStatistic implementing the sum of squares
-
setSumsqImpl
public void setSumsqImpl(StorelessUnivariateStatistic[] sumsqImpl) throws MathIllegalStateException, DimensionMismatchException
Sets the implementation for the sum of squares.
This method must be called before any data has been added, i.e., before addValue has been used to add data; otherwise a MathIllegalStateException will be thrown.
- Parameters:
- sumsqImpl - the StorelessUnivariateStatistic instances to use for computing the sum of squares
- Throws:
- DimensionMismatchException - if the array dimension does not match the one used at construction
- MathIllegalStateException - if data has already been added (i.e., if n > 0)
-
getMinImpl
public StorelessUnivariateStatistic[] getMinImpl()
Returns the currently configured minimum implementation.
- Returns:
- the StorelessUnivariateStatistic implementing the minimum
-
setMinImpl
public void setMinImpl(StorelessUnivariateStatistic[] minImpl) throws MathIllegalStateException, DimensionMismatchException
Sets the implementation for the minimum.
This method must be called before any data has been added, i.e., before addValue has been used to add data; otherwise a MathIllegalStateException will be thrown.
- Parameters:
- minImpl - the StorelessUnivariateStatistic instances to use for computing the minimum
- Throws:
- DimensionMismatchException - if the array dimension does not match the one used at construction
- MathIllegalStateException - if data has already been added (i.e., if n > 0)
-
getMaxImpl
public StorelessUnivariateStatistic[] getMaxImpl()
Returns the currently configured maximum implementation.
- Returns:
- the StorelessUnivariateStatistic implementing the maximum
-
setMaxImpl
public void setMaxImpl(StorelessUnivariateStatistic[] maxImpl) throws MathIllegalStateException, DimensionMismatchException
Sets the implementation for the maximum.
This method must be called before any data has been added, i.e., before addValue has been used to add data; otherwise a MathIllegalStateException will be thrown.
- Parameters:
- maxImpl - the StorelessUnivariateStatistic instances to use for computing the maximum
- Throws:
- DimensionMismatchException - if the array dimension does not match the one used at construction
- MathIllegalStateException - if data has already been added (i.e., if n > 0)
-
getSumLogImpl
public StorelessUnivariateStatistic[] getSumLogImpl()
Returns the currently configured sum of logs implementation.
- Returns:
- the StorelessUnivariateStatistic implementing the log sum
-
setSumLogImpl
public void setSumLogImpl(StorelessUnivariateStatistic[] sumLogImpl) throws MathIllegalStateException, DimensionMismatchException
Sets the implementation for the sum of logs.
This method must be called before any data has been added, i.e., before addValue has been used to add data; otherwise a MathIllegalStateException will be thrown.
- Parameters:
- sumLogImpl - the StorelessUnivariateStatistic instances to use for computing the log sum
- Throws:
- DimensionMismatchException - if the array dimension does not match the one used at construction
- MathIllegalStateException - if data has already been added (i.e., if n > 0)
-
getGeoMeanImpl
public StorelessUnivariateStatistic[] getGeoMeanImpl()
Returns the currently configured geometric mean implementation.
- Returns:
- the StorelessUnivariateStatistic implementing the geometric mean
-
setGeoMeanImpl
public void setGeoMeanImpl(StorelessUnivariateStatistic[] geoMeanImpl) throws MathIllegalStateException, DimensionMismatchException
Sets the implementation for the geometric mean.
This method must be called before any data has been added, i.e., before addValue has been used to add data; otherwise a MathIllegalStateException will be thrown.
- Parameters:
- geoMeanImpl - the StorelessUnivariateStatistic instances to use for computing the geometric mean
- Throws:
- DimensionMismatchException - if the array dimension does not match the one used at construction
- MathIllegalStateException - if data has already been added (i.e., if n > 0)
-
getMeanImpl
public StorelessUnivariateStatistic[] getMeanImpl()
Returns the currently configured mean implementation.
- Returns:
- the StorelessUnivariateStatistic implementing the mean
-
setMeanImpl
public void setMeanImpl(StorelessUnivariateStatistic[] meanImpl) throws MathIllegalStateException, DimensionMismatchException
Sets the implementation for the mean.
This method must be called before any data has been added, i.e., before addValue has been used to add data; otherwise a MathIllegalStateException will be thrown.
- Parameters:
- meanImpl - the StorelessUnivariateStatistic instances to use for computing the mean
- Throws:
- DimensionMismatchException - if the array dimension does not match the one used at construction
- MathIllegalStateException - if data has already been added (i.e., if n > 0)
-
checkEmpty
private void checkEmpty() throws MathIllegalStateException
Throws MathIllegalStateException if the statistic is not empty.
- Throws:
- MathIllegalStateException - if n > 0.
-
checkDimension
private void checkDimension(int dimension) throws DimensionMismatchException
Throws DimensionMismatchException if dimension != k.
- Parameters:
- dimension - dimension to check
- Throws:
- DimensionMismatchException - if dimension != k
-
-