Statistics
The goal
There are other (more extensive) statistics packages like
Questions to David Rotermund
Fisher Exact Test
The Fisher Exact Test is not part of the numpy package. But we need it in machine learning.
scipy.stats.fisher_exact(table, alternative='two-sided')
Perform a Fisher exact test on a 2x2 contingency table.
Order statistics
ptp(a[, axis, out, keepdims]) | Range of values (maximum - minimum) along an axis. |
percentile(a, q[, axis, out, …]) | Compute the q-th percentile of the data along the specified axis. |
nanpercentile(a, q[, axis, out, …]) | Compute the qth percentile of the data along the specified axis, while ignoring nan values. |
quantile(a, q[, axis, out, overwrite_input, …]) | Compute the q-th quantile of the data along the specified axis. |
nanquantile(a, q[, axis, out, …]) | Compute the qth quantile of the data along the specified axis, while ignoring nan values. |
Averages and variances
median(a[, axis, out, overwrite_input, keepdims]) | Compute the median along the specified axis. |
average(a[, axis, weights, returned, keepdims]) | Compute the weighted average along the specified axis. |
mean(a[, axis, dtype, out, keepdims, where]) | Compute the arithmetic mean along the specified axis. |
std(a[, axis, dtype, out, ddof, keepdims, where]) | Compute the standard deviation along the specified axis. |
var(a[, axis, dtype, out, ddof, keepdims, where]) | Compute the variance along the specified axis. |
nanmedian(a[, axis, out, overwrite_input, …]) | Compute the median along the specified axis, while ignoring NaNs. |
nanmean(a[, axis, dtype, out, keepdims, where]) | Compute the arithmetic mean along the specified axis, ignoring NaNs. |
nanstd(a[, axis, dtype, out, ddof, …]) | Compute the standard deviation along the specified axis, while ignoring NaNs. |
nanvar(a[, axis, dtype, out, ddof, …]) | Compute the variance along the specified axis, while ignoring NaNs. |
Correlating
corrcoef(x[, y, rowvar, bias, ddof, dtype]) | Return Pearson product-moment correlation coefficients. |
correlate(a, v[, mode]) | Cross-correlation of two 1-dimensional sequences. |
cov(m[, y, rowvar, bias, ddof, fweights, …]) | Estimate a covariance matrix, given data and weights. |
Histograms
histogram(a[, bins, range, density, weights]) | Compute the histogram of a dataset. |
histogram2d(x, y[, bins, range, density, …]) | Compute the bi-dimensional histogram of two data samples. |
histogramdd(sample[, bins, range, density, …]) | Compute the multidimensional histogram of some data. |
bincount(x, /[, weights, minlength]) | Count number of occurrences of each value in array of non-negative ints. |
histogram_bin_edges(a[, bins, range, weights]) | Function to calculate only the edges of the bins used by the histogram function. |
digitize(x, bins[, right]) | Return the indices of the bins to which each value in input array belongs. |
The source code is Open Source and can be found on GitHub.