rr.accum

Simple, generic accumulator library for online computations

This simple library is heavily inspired by the boost.accumulators library.

The library builds on two basic concepts:

An accumulator represents an online computation of some sort, that is assumed to be fairly expensive or require large amounts of memory. An accumulator observes a stream of data and knows how to update itself incrementally (Accumulator.observe() method).
Accumulator sets tie things together and make it easier to use a set of related accumulators. Accumulator sets manage the dependencies of accumulators on other accumulators, and provide an attribute-like interface to access the current values of its constituent accumulators.

I know I know, this all sounds vague and confusing... To make things clearer, an example is due. The typical example in this kind of scenario is to use basic descriptive statistics, e.g. mean, sum, count, variance. The rr.accum library includes these and other simple statistics ready to use.

Suppose we're interested in observing the mean and variance of a stream of numbers. We declare an accumulator set with those two statistics:

from rr.accum import AccumulatorSet, stats

accums = AccumulatorSet(stats.Mean, stats.Variance, name="foo")
accums.value  # check initial values

We can immediately inspect the initial values, and be surprised to find out that the set includes two statistics that we didn't request: sum and count. These were added automatically by the accumulator set because the mean depends on the sum and count.

Warning

Always list the desired statistics explicitly when creating accumulator sets.

This not only makes your code clearer, but also guarantees that those statistics will be available. Relying on implicit dependencies may suddenly stop working if a statistic's implementation changes and it stops depending on others.

Okay, we are now ready to start pumping numbers into the accumulator set:

import random

for _ in range(10000):
    accums.observe(random.random())
    print("Online mean={} and variance={}".format(accums.mean, accums["variance"]))

The current value of individual statistics can be obtained either through attribute access or subscripting, by passing the statistic's name or one of its aliases. Some accumulators, like WeightedStandardDeviation, have shorter aliases (wstddev) for convenience.

Compatibility

Developed and tested in Python 3.6+. The code may or may not work under earlier versions of Python 3 (perhaps back to 3.3).

Installation

From the github repo:

pip install git+https://github.com/2xR/rr.accum.git

License

This library is released as open source under the MIT License.

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
src/rr/accum		src/rr/accum
tests		tests
.gitignore		.gitignore
LICENSE.txt		LICENSE.txt
MANIFEST.in		MANIFEST.in
README.rst		README.rst
setup.py		setup.py
tox.ini		tox.ini

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

rr.accum

Simple, generic accumulator library for online computations

Compatibility

Installation

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

rr.accum

Simple, generic accumulator library for online computations

Compatibility

Installation

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages