Statistics

The goal

There are other (more extensive) statistics packages.

Questions to David Rotermund

Fisher Exact Test

The Fisher exact test is not part of the NumPy package, but it is needed in machine learning. SciPy provides it as scipy.stats.fisher_exact:

scipy.stats.fisher_exact(table, alternative='two-sided')

Perform a Fisher exact test on a 2x2 contingency table.
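
As a minimal sketch of how the call above might be used: the counts in the table are made up purely for illustration, and the variable names (table, odds_ratio, p_value) are assumptions, not part of the SciPy API.

```python
from scipy.stats import fisher_exact

# Hypothetical 2x2 contingency table (made-up counts):
# rows    = two groups (e.g. two classifiers or conditions)
# columns = outcome present / outcome absent
table = [[8, 2],
         [1, 5]]

# The result unpacks into the sample odds ratio and the p-value.
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")

print(f"odds ratio: {odds_ratio:.2f}")
print(f"p-value:    {p_value:.4f}")
```

A small p-value suggests that the row and column variables are not independent. Passing alternative="less" or alternative="greater" gives a one-sided test instead.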

Order statistics

ptp(a[, axis, out, keepdims]) Range of values (maximum - minimum) along an axis.
percentile(a, q[, axis, out, …]) Compute the q-th percentile of the data along the specified axis.
nanpercentile(a, q[, axis, out, …]) Compute the q-th percentile of the data along the specified axis, while ignoring NaN values.
quantile(a, q[, axis, out, overwrite_input, …]) Compute the q-th quantile of the data along the specified axis.
nanquantile(a, q[, axis, out, …]) Compute the q-th quantile of the data along the specified axis, while ignoring NaN values.
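
The functions listed above can be tried out with a short sketch like the following. The array contents and variable names are made up for illustration. The example assumes NumPy's default linear interpolation between data points.

```python
import numpy as np

# Made-up example data; the second row contains a NaN on purpose.
a = np.array([[1.0, 2.0, 3.0, 4.0],
              [10.0, np.nan, 30.0, 40.0]])

# Range of values (maximum - minimum) of the first row.
value_range = np.ptp(a[0])

# Percentiles take q in [0, 100], quantiles take q in [0, 1].
median_row0 = np.percentile(a[0], 50)
quartiles = np.quantile(a[0], [0.25, 0.5, 0.75])

# A NaN in the data propagates into the result ...
with_nan = np.percentile(a[1], 50)

# ... unless the nan* variants are used, which ignore NaNs.
ignoring_nan = np.nanpercentile(a[1], 50)
per_row = np.nanquantile(a, 0.5, axis=1)

print(value_range, median_row0, quartiles)
print(with_nan, ignoring_nan, per_row)
```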

Averages and variances

median(a[, axis, out, overwrite_input, keepdims]) Compute the median along the specified axis.
average(a[, axis, weights, returned, keepdims]) Compute the weighted average along the specified axis.
mean(a[, axis, dtype, out, keepdims, where]) Compute the arithmetic mean along the specified axis.
std(a[, axis, dtype, out, ddof, keepdims, where]) Compute the standard deviation along the specified axis.
var(a[, axis, dtype, out, ddof, keepdims, where]) Compute the variance along the specified axis.
nanmedian(a[, axis, out, overwrite_input, …]) Compute the median along the specified axis, while ignoring NaNs.
nanmean(a[, axis, dtype, out, keepdims, where]) Compute the arithmetic mean along the specified axis, ignoring NaNs.
nanstd(a[, axis, dtype, out, ddof, …]) Compute the standard deviation along the specified axis, while ignoring NaNs.
nanvar(a[, axis, dtype, out, ddof, …]) Compute the variance along the specified axis, while ignoring NaNs.
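
A brief sketch of these functions follows; the data and variable names are invented for illustration. One detail that is easy to overlook: std and var default to ddof=0 (dividing by N), so ddof=1 has to be passed explicitly for the sample estimate (dividing by N - 1).

```python
import numpy as np

# Made-up example data.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = np.mean(x)
median = np.median(x)

# Default ddof=0: population standard deviation (divide by N).
population_std = np.std(x)

# ddof=1: sample variance (divide by N - 1).
sample_var = np.var(x, ddof=1)

# Weighted average: the first value counts twice as much as the others.
weighted = np.average(np.array([1.0, 2.0, 4.0]), weights=[2.0, 1.0, 1.0])

# NaN handling: mean propagates NaN, nanmean ignores it.
y = np.array([1.0, np.nan, 3.0])
mean_with_nan = np.mean(y)
mean_ignoring_nan = np.nanmean(y)

print(mean, median, population_std, sample_var, weighted)
print(mean_with_nan, mean_ignoring_nan)
```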

Correlating

corrcoef(x[, y, rowvar, bias, ddof, dtype]) Return Pearson product-moment correlation coefficients.
correlate(a, v[, mode]) Cross-correlation of two 1-dimensional sequences.
cov(m[, y, rowvar, bias, ddof, fweights, …]) Estimate a covariance matrix, given data and weights.
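
The following sketch illustrates the three functions above on made-up data (all variable names are assumptions). Note that corrcoef and cov return full matrices, so the value relating x and y sits in the off-diagonal entry.

```python
import numpy as np

# Made-up data: y is an exact linear function of x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0

# 2x2 correlation matrix; entry [0, 1] is the Pearson coefficient of x and y.
r_matrix = np.corrcoef(x, y)
r_xy = r_matrix[0, 1]

# 2x2 covariance matrix; by default it is normalized by N - 1.
c_matrix = np.cov(x, y)

# Cross-correlation of two 1-dimensional sequences.
# mode="valid" is the default; "same" and "full" return longer outputs.
a = [1.0, 2.0, 3.0]
v = [0.0, 1.0, 0.5]
cc_valid = np.correlate(a, v)
cc_full = np.correlate(a, v, mode="full")

print(r_xy)
print(c_matrix)
print(cc_valid, cc_full)
```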

Histograms

histogram(a[, bins, range, density, weights]) Compute the histogram of a dataset.
histogram2d(x, y[, bins, range, density, …]) Compute the bi-dimensional histogram of two data samples.
histogramdd(sample[, bins, range, density, …]) Compute the multidimensional histogram of some data.
bincount(x, /[, weights, minlength]) Count number of occurrences of each value in array of non-negative ints.
histogram_bin_edges(a[, bins, range, weights]) Calculate only the edges of the bins used by the histogram function.
digitize(x, bins[, right]) Return the indices of the bins to which each value in input array belongs.
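
A small sketch of the histogram functions on made-up data (variable names are assumptions). It also shows a difference between histogram and digitize: histogram treats the last bin as closed on the right, while digitize assigns a value equal to the last edge to the index past the final bin.

```python
import numpy as np

# Made-up example data.
data = np.array([0.5, 1.5, 1.7, 2.2, 2.9, 3.0])

# Three equal-width bins over [0, 3]; returns counts and the bin edges.
counts, edges = np.histogram(data, bins=3, range=(0.0, 3.0))

# Only the edges, without counting.
edges_only = np.histogram_bin_edges(data, bins=3, range=(0.0, 3.0))

# Index of the bin each value falls into (1-based for values >= edges[0]).
bin_indices = np.digitize(data, edges)

# Occurrences of each value in an array of non-negative integers.
labels = np.array([0, 1, 1, 3])
label_counts = np.bincount(labels)

print(counts, edges)
print(bin_indices)
print(label_counts)
```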

The source code is open source and can be found on GitHub.