bolster.stats.distributions

Statistical distribution fitting utilities.

Provides best_fit_distribution(), which fits a large set of scipy.stats continuous distributions to observed data and returns the one with the lowest sum of squared errors (SSE, the default discriminator) against the empirical histogram.

Intended for exploratory data analysis where you want a quick sanity-check on which parametric family best describes your data before committing to a more rigorous approach.
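
The histogram-based selection described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual implementation: the helper names sse_against_histogram and pick_best are hypothetical, the three-candidate list is an assumption chosen to keep the run short, and only documented numpy and scipy.stats calls are used.

```python
import warnings

import numpy as np
from scipy import stats


def sse_against_histogram(data, dist, bins=200):
    """Fit `dist` to `data`; return (sse, params). Hypothetical helper."""
    # Empirical density: histogram normalised so it integrates to 1.
    density, edges = np.histogram(data, bins=bins, density=True)
    centres = (edges[:-1] + edges[1:]) / 2.0

    # Maximum-likelihood fit; some families warn on awkward data.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        params = dist.fit(data)

    # Compare the fitted PDF with the empirical density at bin centres.
    pdf = dist.pdf(centres, *params)
    return float(np.sum((density - pdf) ** 2)), params


def pick_best(data, candidates, bins=200):
    """Return (sse, name, params) for the lowest-SSE candidate."""
    results = []
    for dist in candidates:
        try:
            sse, params = sse_against_histogram(data, dist, bins=bins)
        except Exception:
            # A family that cannot be fitted is skipped, not fatal.
            continue
        if np.isfinite(sse):
            results.append((sse, dist.name, params))
    # Raises ValueError if no candidate could be fitted.
    return min(results, key=lambda r: r[0])


# Synthetic data drawn from a normal distribution (illustrative only).
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=5000)

# Assumed short candidate list; the real function tests many more families.
best_sse, best_name, best_params = pick_best(
    sample, [stats.norm, stats.expon, stats.uniform], bins=50
)
print(best_name, best_params, best_sse)
```

Because SSE is computed on binned densities, the ranking depends on the bins setting, so it is worth re-running with a different bin count before trusting a close result.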

Note

This module depends on scipy and numpy. The fitting loop is marked # pragma: no cover because it is compute-intensive and not suitable for CI; validate results manually or in a notebook.

Functions

best_fit_distribution(data[, bins, ax, include_slow, ...])

Model data by finding the best-fit distribution.

Module Contents

bolster.stats.distributions.best_fit_distribution(data, bins=200, ax=None, include_slow=False, discriminator='sse')

Model data by finding the best-fit distribution.