descr_stats: a Tool for Descriptive Statistics


What's that?

This tool calculates various descriptive statistics on a set of samples stored in a text file: mean, median, variance, standard deviation, and confidence intervals around the mean and the median. It can also produce a histogram of the samples. In that case, choose the step value carefully, since this parameter may dramatically change the resulting histogram.
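
To make these notions concrete, here is a minimal Python sketch of the same kind of computation. It is not the descr_stats implementation: the function names, the sample data, the normal-approximation confidence interval (z = 1.96 for roughly 95%), and the reading of the step value as the width of a histogram bin are all assumptions made for illustration. The confidence interval around the median is left out.

```python
import math
import statistics
from collections import Counter


def describe(samples, z=1.96):
    """Return basic descriptive statistics for a list of numbers.

    z=1.96 gives an approximate 95% confidence interval around the mean
    under a normal approximation (an assumption of this sketch, not
    necessarily the method used by descr_stats).
    """
    n = len(samples)
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation (n - 1)
    half_width = z * stdev / math.sqrt(n)
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(samples),
        "variance": statistics.variance(samples),  # sample variance (n - 1)
        "stdev": stdev,
        "ci_mean": (mean - half_width, mean + half_width),
    }


def histogram(samples, step):
    """Count samples per bin of width `step`; keys are bin lower bounds."""
    counts = Counter(math.floor(x / step) * step for x in samples)
    return dict(sorted(counts.items()))


# Made-up samples, used only to illustrate the effect of the step value.
samples = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 9.0]
stats = describe(samples)
narrow = histogram(samples, step=1.0)  # five bins, outlier clearly isolated
wide = histogram(samples, step=5.0)    # two bins, most detail is lost
```

With a step of 1.0 the samples spread over five bins and the value 9.0 stands apart, whereas a step of 5.0 collapses six of the seven samples into a single bin. This is why the step value deserves care.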

This tool can be used either interactively (the user must then answer two questions) or automatically (which is useful for launching the tool recursively on all the files of a directory). For instance, you can use:

find trace_dir -name "*.trc" -print -exec descr_stats 1 {} noninter \;
This calls descr_stats on every file with a ".trc" extension in the "trace_dir" directory and its sub-directories, using the data in the first column (the "1" argument).

If you are not familiar with these notions, please read this document.

Example

Here is a simple demo. The data file contains the following samples:

And here are the results of running our descr_stats tool:

The histo*.dem file can then be given to gnuplot to produce the histogram.

And here is the histogram: histo.png

Distribution

The latest release...

Previous releases...