Let's start with what the two classes have in common. Neither of them is synchronized; if you really need a thread-safe implementation, check out SynchronizedDescriptiveStatistics and SynchronizedSummaryStatistics. Both of them implement all the basic functionality defined in

**StatisticalSummary**.

It is a handful of methods that should look pretty straightforward to any reader with some basic knowledge of statistics:

    double getMean();               // arithmetic mean
    double getVariance();
    double getStandardDeviation();
    double getMax();
    double getMin();
    long getN();                    // number of available values
    double getSum();                // sum of the values

All these methods return NaN if no values have been added to the object, with the obvious exception of getN(), which returns 0.

The substantial difference is that SummaryStatistics does not store the data values in memory, which makes it sleeker and leaner than DescriptiveStatistics. On the other hand, DescriptiveStatistics offers some extra functionality to the user. So, if all you need is in StatisticalSummary, you can handle a huge collection of data with SummaryStatistics and happily avoid paying a large price in terms of memory usage.
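To see why skipping the storage of values is possible at all, here is a minimal sketch in plain Java of a running accumulator that updates count, sum, min, max, mean and variance one value at a time. The class RunningSummary is hypothetical and is not the library's code; it only assumes the well-known Welford update for the variance, and it mimics the NaN-when-empty behavior described above.

```java
// Hypothetical illustration, not part of Commons Math:
// computes summary statistics without keeping the values in memory.
final class RunningSummary {
    private long n = 0;
    private double sum = 0.0;
    private double mean = 0.0;
    private double m2 = 0.0;          // sum of squared deviations from the mean
    private double min = Double.NaN;
    private double max = Double.NaN;

    void addValue(double value) {
        n++;
        sum += value;
        // Welford's update: adjust mean and squared deviations incrementally
        double delta = value - mean;
        mean += delta / n;
        m2 += delta * (value - mean);
        min = (n == 1) ? value : Math.min(min, value);
        max = (n == 1) ? value : Math.max(max, value);
    }

    long getN() { return n; }
    double getSum() { return n == 0 ? Double.NaN : sum; }
    double getMean() { return n == 0 ? Double.NaN : mean; }
    double getMin() { return min; }
    double getMax() { return max; }

    // Sample variance (divides by n - 1); 0 for a single value
    double getVariance() {
        if (n == 0) return Double.NaN;
        if (n == 1) return 0.0;
        return m2 / (n - 1);
    }

    double getStandardDeviation() { return Math.sqrt(getVariance()); }
}
```

Whatever the number of values added, the object holds only a handful of fields. The price is that anything requiring the full data set, such as a sorted copy or a percentile, is out of reach.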

There are also a few

**common methods** that are defined for both SummaryStatistics and DescriptiveStatistics, even though they are not part of the StatisticalSummary interface that both classes implement.

To load the data we use public void addValue(double value), which can be called like this, where generator is a previously initialized Random object:

    for (int i = 0; i < 1000; ++i) {
        stats.addValue(generator.nextDouble());
    }

From objects of both classes we can get the sum of the squares, getSumsq(), and the geometric mean, getGeometricMean(). Sometimes it is useful to reset the values we are working on, and this is done by calling clear().

**Only SummaryStatistics** defines getSumOfLogs() and getSecondMoment().
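As a rough illustration of what these quantities mean, the sketch below computes them directly on an array in plain Java. The class MomentSketch and its methods are hypothetical helpers, not library code. It assumes the usual definitions: the sum of logs is the sum of the natural logarithms of the values, the geometric mean is the exponential of their average, and the second moment is the sum of squared deviations from the mean.

```java
// Hypothetical helpers, not part of Commons Math.
final class MomentSketch {

    // Sum of the natural logs of the values
    static double sumOfLogs(double[] values) {
        double s = 0.0;
        for (double v : values) {
            s += Math.log(v);
        }
        return s;
    }

    // Geometric mean expressed through the sum of logs
    static double geometricMean(double[] values) {
        if (values.length == 0) return Double.NaN;
        return Math.exp(sumOfLogs(values) / values.length);
    }

    // Sum of squared deviations from the arithmetic mean
    static double secondMoment(double[] values) {
        if (values.length == 0) return Double.NaN;
        double mean = 0.0;
        for (double v : values) {
            mean += v;
        }
        mean /= values.length;
        double m2 = 0.0;
        for (double v : values) {
            m2 += (v - mean) * (v - mean);
        }
        return m2;
    }
}
```

Dividing the second moment by n - 1 gives the sample variance, which is why such a quantity is handy to keep around when values are not stored.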

**Only DescriptiveStatistics** provides:

void removeMostRecentValue(): discards only the most recently inserted value from the underlying dataset, or throws an exception.

double replaceMostRecentValue(double v): replaces the last inserted value or throws an exception.

double getSkewness(): the skewness is a measure of the asymmetry of the current distribution.

double getKurtosis(): the kurtosis is a measure of how sharply peaked or heavy-tailed the current distribution is.

double[] getValues(): creates a copy of the current data set.

double[] getSortedValues(): creates a sorted copy of the current data set.

double getElement(int index): gets a specific element or throws an exception.

double getPercentile(double p): returns an estimate of the requested percentile, or throws an exception.
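Methods like getSortedValues() and getPercentile() are only possible because the values are kept in memory. The sketch below, in plain Java, shows one way a percentile could be estimated from a sorted copy of the data. The class PercentileSketch is hypothetical, and the interpolation rule it uses (position p * (n + 1) / 100 in the sorted data) is an assumption made for illustration; check the library documentation for the exact estimation it applies.

```java
import java.util.Arrays;

// Hypothetical illustration, not part of Commons Math.
final class PercentileSketch {

    // Estimates the p-th percentile (0 < p <= 100) by linear interpolation
    // between neighbouring elements of a sorted copy of the data.
    static double percentile(double[] values, double p) {
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]: " + p);
        }
        if (values.length == 0) return Double.NaN;

        double[] sorted = values.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        int n = sorted.length;
        if (n == 1) return sorted[0];

        double pos = p * (n + 1) / 100.0;   // 1-based position in the sorted data
        if (pos < 1) return sorted[0];
        if (pos >= n) return sorted[n - 1];

        int floor = (int) Math.floor(pos);
        double fraction = pos - floor;
        double lower = sorted[floor - 1];
        double upper = sorted[floor];
        return lower + fraction * (upper - lower);
    }
}
```

Note that the sort needs all the values at once, which is exactly what a storage-free class cannot offer.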

**Window size**

When we have no idea how many values might be entered, it can be dangerous to use DescriptiveStatistics in its default mode, which lets the underlying data collection grow without any limit. It is better to define the size of the "window" we want to work with using setWindowSize(int windowSize). When we reach the limit, the oldest value is discarded to make room for the new entry. If you wonder what the current size is, you can check it through getWindowSize(), which returns its current value as an int. The "no window" value is represented by DescriptiveStatistics.INFINITE_WINDOW, defined as -1.
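The windowing behavior can be pictured with a small plain-Java sketch built on a queue. The class SlidingWindow is hypothetical and only mirrors the method names mentioned above; how the library stores its data internally is not covered here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of a bounded window, not part of Commons Math.
final class SlidingWindow {
    static final int INFINITE_WINDOW = -1;   // same convention as described above

    private final Deque<Double> values = new ArrayDeque<>();
    private int windowSize = INFINITE_WINDOW;

    void setWindowSize(int windowSize) {
        if (windowSize < 1 && windowSize != INFINITE_WINDOW) {
            throw new IllegalArgumentException("invalid window size: " + windowSize);
        }
        this.windowSize = windowSize;
        // If the window shrinks, drop the oldest values until the data fits
        while (windowSize != INFINITE_WINDOW && values.size() > windowSize) {
            values.removeFirst();
        }
    }

    int getWindowSize() { return windowSize; }

    void addValue(double value) {
        // When the window is full, the oldest value makes room for the new one
        if (windowSize != INFINITE_WINDOW && values.size() == windowSize) {
            values.removeFirst();
        }
        values.addLast(value);
    }

    // Copy of the current window, oldest value first
    double[] getValues() {
        double[] copy = new double[values.size()];
        int i = 0;
        for (double v : values) {
            copy[i++] = v;
        }
        return copy;
    }

    double getMean() {
        if (values.isEmpty()) return Double.NaN;
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }
}
```

With a window of 3, adding 1, 2, 3 and then 4 leaves 2, 3, 4 in the window, so the statistics always describe only the most recent values.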
