This tool calculates various descriptive statistics on a set of samples stored in a text file: mean, median, variance, standard deviation, and confidence intervals around the mean and median. It can also produce a histogram of the samples. In that case, choose the step value carefully, since this parameter may dramatically change the resulting histogram.
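To see why the step matters, here is a minimal Python sketch that bins the sample values from the demo further down using two different steps. The `histogram` helper and its binning rule (each sample is counted in the interval [k*step, (k+1)*step)) are assumptions made for illustration; this page does not state how descr_stats itself assigns samples to bins.

```python
import math
from collections import Counter

# Sample values from the demo data file shown later on this page.
samples = [1.1, 49, 50, 49.2, 49.224, 47, 48.5, 51.0, 50.3, 46.9, 52.1]


def histogram(values, step):
    """Count values per bin of width `step`, keyed by the bin's lower edge.

    Assumed binning rule (not taken from descr_stats): a value x falls in
    the bin starting at floor(x / step) * step.
    """
    if step <= 0.0:
        raise ValueError("step must be > 0.0")
    counts = Counter(math.floor(x / step) * step for x in values)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    for step in (1.0, 10.0):
        print(f"step = {step}")
        for lower_edge, count in histogram(samples, step).items():
            print(f"  [{lower_edge:g}, {lower_edge + step:g}): {'#' * count}")
```

With a step of 10, the ten samples near 50 collapse into two bars, while a step of 1 spreads them over seven. The isolated sample at 1.1 gets its own bar either way.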
This tool can be used either interactively (the user must then answer two questions) or automatically (which is useful for running the tool recursively on all files in a directory). For instance, you can use:
    find trace_dir -name '*.trc' -print -exec descr_stats 1 {} noninter \;
which will call descr_stats on every file with a ".trc" extension in the "trace_dir" directory and its sub-directories, using the data in the first column.
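If you would rather collect the output per file than read it from the terminal, the same batch run can be sketched in Python. The helper names `build_commands` and `run_all` are hypothetical, and the sketch assumes that descr_stats is on your PATH and that it prints its results on standard output in non-interactive mode; only the argument order (column, file, `noninter`) comes from the command above.

```python
import subprocess
from pathlib import Path


def build_commands(root, column=1, pattern="*.trc", tool="descr_stats"):
    """Return one descr_stats command line per matching file under root."""
    return [
        [tool, str(column), str(path), "noninter"]
        for path in sorted(Path(root).rglob(pattern))
        if path.is_file()
    ]


def run_all(root, column=1):
    """Run every command and map each file name to the captured output.

    Assumption: the tool writes its statistics to standard output.
    """
    results = {}
    for cmd in build_commands(root, column):
        done = subprocess.run(cmd, capture_output=True, text=True, check=False)
        results[cmd[2]] = done.stdout
    return results
```

Calling `run_all("trace_dir")` would then return a dictionary keyed by file name.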
If you are not familiar with these statistical notions, please read this document.
Here is a simple demo. The data file contains the following samples:
    $ cat ~/tmp/test_val
    1.1
    49
    50
    49.2
    49.224
    47
    48.5
    51.0
    50.3
    46.9
    52.1
And here are the results of running our descr_stats tool:
    $ ~/work/descr_stats/descr_stats 1 ~/tmp/test_val
    ------------------------------------------------------
    nb of samples = 11
    mean = 44.938545
    median = 49.200001
    variance = 213.801060
    standard deviation = 14.621938
    range = 51.000000
    min = 1.100000
    max = 52.099998
    confidence interval around mean 44.938545:
    90: +/- 43.838547
    95: +/- 43.838547
    99: +/- 43.838547
    confidence interval around median 49.200001:
    90: +/- 2.899998
    95: +/- 48.100002
    99: +/- 48.100002
    ------------------------------------------------------
    Continue with histogram (produces a gnuplot .dem format) (y/n)[n] ? y
    Enter sampling step (real number > 0.0): 0.1
    histogram data file is: /tmp/histo22098.dat
    histogram gnuplot file is: /tmp/histo22098.dem
The histo*.dem file can then be given to gnuplot to produce the histogram.
    $ gnuplot /tmp/histo22098.dem
    Hit return to continue
And here is the histogram:
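The summary figures reported in the demo run can be cross-checked independently. The following Python sketch recomputes them from the same eleven samples with the standard `statistics` module; the `describe` helper is a name chosen here for illustration and is not part of descr_stats. The reported variance agrees with the sample variance (n - 1 divisor), and the small differences in the last digits (for example 49.200001 for the median) are presumably due to single-precision arithmetic in the tool. The sketch does not attempt to reproduce the confidence intervals, since the method used for them is not described on this page.

```python
import statistics

# Sample values from the demo data file.
samples = [1.1, 49, 50, 49.2, 49.224, 47, 48.5, 51.0, 50.3, 46.9, 52.1]


def describe(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "nb of samples": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # statistics.variance and statistics.stdev use the n - 1 divisor.
        "variance": statistics.variance(values),
        "standard deviation": statistics.stdev(values),
        "range": max(values) - min(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    for name, value in describe(samples).items():
        print(f"{name} = {value:.6f}")
```

Note how the single outlier (1.1) pulls the mean down to about 44.94 while the median stays at 49.2.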