Statistics

The goal

There are other (more extensive) statistics packages like

Fisher's exact test is not part of the NumPy package, but it is often needed in machine learning. SciPy provides it:

scipy.stats.fisher_exact(table, alternative='two-sided')

Perform a Fisher exact test on a 2x2 contingency table.
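A minimal sketch of calling the function above, assuming SciPy is installed. The 2x2 table holds made-up counts chosen only for illustration; they do not come from any real data set.

```python
from scipy.stats import fisher_exact

# Hypothetical 2x2 contingency table (made-up counts for illustration):
# rows    = group A, group B
# columns = outcome observed, outcome not observed
table = [[8, 2],
         [1, 5]]

# The result unpacks into the sample odds ratio and the p-value.
odds_ratio, p_value = fisher_exact(table, alternative='two-sided')

print("odds ratio:", odds_ratio)  # (8 * 5) / (2 * 1) = 20.0
print("p-value:   ", p_value)
```

A small p-value suggests that the row and column variables are not independent. Passing alternative='less' or alternative='greater' runs a one-sided test instead.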


ptp(a[, axis, out, keepdims])	Range of values (maximum - minimum) along an axis.
percentile(a, q[, axis, out, …])	Compute the q-th percentile of the data along the specified axis.
nanpercentile(a, q[, axis, out, …])	Compute the q-th percentile of the data along the specified axis, while ignoring NaNs.
quantile(a, q[, axis, out, overwrite_input, …])	Compute the q-th quantile of the data along the specified axis.
nanquantile(a, q[, axis, out, …])	Compute the q-th quantile of the data along the specified axis, while ignoring NaNs.
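The order statistics listed above can be tried out with a short sketch like the following. The array values are made up for illustration; the second row contains a NaN to show how the nan-variants differ from the plain ones.

```python
import numpy as np

# Made-up sample data; the second row contains a missing value (NaN).
a = np.array([[1.0, 2.0, 3.0, 4.0],
              [10.0, np.nan, 30.0, 40.0]])

value_range = np.ptp(a[0])           # maximum - minimum = 4.0 - 1.0 = 3.0
p50 = np.percentile(a[0], 50)        # q is given in percent (0..100)
q50 = np.quantile(a[0], 0.5)         # q is given as a fraction (0..1)

# A NaN propagates through the plain function ...
plain_p50 = np.percentile(a[1], 50)
# ... while the nan-variant ignores it and uses 10, 30 and 40 only.
nan_p50 = np.nanpercentile(a[1], 50)

# With axis=1 the quantile is computed separately for each row.
row_medians = np.nanquantile(a, 0.5, axis=1)

print(value_range, p50, q50, plain_p50, nan_p50, row_medians)
```

percentile and quantile compute the same quantity; they differ only in the scale of q.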


median(a[, axis, out, overwrite_input, keepdims])	Compute the median along the specified axis.
average(a[, axis, weights, returned, keepdims])	Compute the weighted average along the specified axis.
mean(a[, axis, dtype, out, keepdims, where])	Compute the arithmetic mean along the specified axis.
std(a[, axis, dtype, out, ddof, keepdims, where])	Compute the standard deviation along the specified axis.
var(a[, axis, dtype, out, ddof, keepdims, where])	Compute the variance along the specified axis.
nanmedian(a[, axis, out, overwrite_input, …])	Compute the median along the specified axis, while ignoring NaNs.
nanmean(a[, axis, dtype, out, keepdims, where])	Compute the arithmetic mean along the specified axis, ignoring NaNs.
nanstd(a[, axis, dtype, out, ddof, …])	Compute the standard deviation along the specified axis, while ignoring NaNs.
nanvar(a[, axis, dtype, out, ddof, …])	Compute the variance along the specified axis, while ignoring NaNs.
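As a rough illustration of the averages and variances above, the following sketch uses a small made-up sample. Note that std and var divide by N by default (population statistics); ddof=1 switches to the sample estimate, which divides by N - 1.

```python
import numpy as np

# Made-up sample data for illustration.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = np.mean(x)                  # 40 / 8 = 5.0
median = np.median(x)              # middle values are 4 and 5, so 4.5
pop_std = np.std(x)                # sqrt(32 / 8) = 2.0 (divides by N)
sample_var = np.var(x, ddof=1)     # 32 / 7 (divides by N - 1)

# average() supports weights: (1*3 + 2*1 + 3*0) / (3 + 1 + 0) = 1.25
weighted = np.average([1.0, 2.0, 3.0], weights=[3, 1, 0])

# The nan-variants skip missing values instead of returning NaN.
with_nan = np.array([1.0, np.nan, 3.0])
plain_mean = np.mean(with_nan)     # nan
nan_mean = np.nanmean(with_nan)    # (1 + 3) / 2 = 2.0

print(mean, median, pop_std, sample_var, weighted, plain_mean, nan_mean)
```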


corrcoef(x[, y, rowvar, bias, ddof, dtype])	Return Pearson product-moment correlation coefficients.
correlate(a, v[, mode])	Cross-correlation of two 1-dimensional sequences.
cov(m[, y, rowvar, bias, ddof, fweights, …])	Estimate a covariance matrix, given data and weights.
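The three correlation functions above can be compared with a sketch like this one. The data are made up: y is an exact linear function of x, so the Pearson coefficient should be 1 up to rounding.

```python
import numpy as np

# Made-up data: y depends linearly on x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# corrcoef returns a 2x2 matrix; the off-diagonal entries hold r(x, y).
r = np.corrcoef(x, y)

# cov returns the 2x2 covariance matrix (sample estimate, divides by N - 1):
# [[var(x), cov(x, y)], [cov(x, y), var(y)]]
c = np.cov(x, y)

# correlate computes a cross-correlation of two 1-D sequences.
# The default mode 'valid' yields a single value for equal-length inputs:
# 1*0 + 2*1 + 3*0.5 = 3.5
xc = np.correlate([1, 2, 3], [0, 1, 0.5])

print(r)
print(c)
print(xc)
```

Despite the similar names, corrcoef and correlate answer different questions: the first measures linear dependence between variables, the second slides one sequence over another as in signal processing.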


histogram(a[, bins, range, density, weights])	Compute the histogram of a dataset.
histogram2d(x, y[, bins, range, density, …])	Compute the bi-dimensional histogram of two data samples.
histogramdd(sample[, bins, range, density, …])	Compute the multidimensional histogram of some data.
bincount(x, /[, weights, minlength])	Count number of occurrences of each value in array of non-negative ints.
histogram_bin_edges(a[, bins, range, weights])	Function to calculate only the edges of the bins used by the histogram function.
digitize(x, bins[, right])	Return the indices of the bins to which each value in input array belongs.
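A short sketch of the histogram functions above, using made-up integer data. The bin range (0.5, 4.5) is an assumption chosen so that each integer from 1 to 4 falls into its own bin.

```python
import numpy as np

# Made-up data for illustration.
data = np.array([1, 2, 2, 3, 3, 3, 4])

# histogram returns the counts per bin and the bin edges.
counts, edges = np.histogram(data, bins=4, range=(0.5, 4.5))
# counts -> [1 2 3 1], edges -> [0.5 1.5 2.5 3.5 4.5]

# histogram_bin_edges computes only the edges, without counting.
edges_only = np.histogram_bin_edges(data, bins=4, range=(0.5, 4.5))

# bincount counts occurrences of each non-negative integer, starting at 0.
occurrences = np.bincount(data)    # [0 1 2 3 1]

# digitize returns, for each value, the index i with bins[i-1] <= x < bins[i].
bin_idx = np.digitize([0.2, 1.7, 3.9], bins=[0.0, 1.0, 2.0, 3.0, 4.0])

print(counts, edges, occurrences, bin_idx)
```

For integer data bincount is usually the simplest choice; histogram is the general tool for continuous values.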

NumPy's source code is open source and can be found on GitHub.