Open Questions on the Bernoulli Factory Problem

Background

Suppose there is a coin that shows heads with an unknown probability, $\lambda$. The goal is to use that coin (and possibly also a fair coin) to build a "new" coin that shows heads with a probability that depends on $\lambda$, call it $f(\lambda)$. This is the Bernoulli factory problem, and it can be solved for a function $f(\lambda)$ only if $f$ is continuous. (For example, flipping the coin twice and taking heads only if exactly one of the two flips shows heads simulates the probability $2\lambda(1-\lambda)$.)
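The two-flip example can be sketched in Python as follows. This is a minimal illustration, not code from this site: `new_coin` and `coin` are hypothetical names, and the value of `lam` is used only to build a stand-in input coin for the demonstration (the new coin itself never reads it).

```python
import random

def new_coin(coin):
    """Return 1 if exactly one of two flips of `coin` shows heads, else 0."""
    first = coin()
    second = coin()
    return 1 if first != second else 0

# Hypothetical stand-in for the input coin (assumption for this demo only).
rng = random.Random(12345)
lam = 0.3
coin = lambda: 1 if rng.random() < lam else 0

trials = 200_000
freq = sum(new_coin(coin) for _ in range(trials)) / trials
# The observed frequency should be close to 2*lam*(1-lam).
print(freq, 2 * lam * (1 - lam))
```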

This page contains several questions about the Bernoulli factory problem. Answers to them will greatly improve my pages on this site about Bernoulli factories. If you can answer any of them, post an issue in the GitHub issues page.

Note: The Bernoulli factory problem is a special case of a more general mathematical problem that I call "The Sampling Problem".

Contents

- Background
- Polynomials that approach a Bernoulli factory function "fast"
- Other Questions
- Notes

Polynomials that approach a Bernoulli factory function "fast"

This question involves solving the Bernoulli factory problem with polynomials.¹

In this question, a polynomial $P(x)$ is written in Bernstein form of degree $n$ if it is written as—

$$P(x)=\sum_{k=0}^n a_k {n \choose k} x^k (1-x)^{n-k},$$

where the real numbers $a_0, ..., a_n$ are the polynomial's Bernstein coefficients.

The degree-$n$ Bernstein polynomial of an arbitrary function $f(x)$ has Bernstein coefficients $a_k = f(k/n)$. In general, this Bernstein polynomial differs from $f$ even if $f$ is a polynomial.
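As an illustration of these definitions, the following sketch evaluates a polynomial in Bernstein form and builds the Bernstein polynomial of a function, using exact rational arithmetic. The function names are hypothetical. It shows that the degree-4 Bernstein polynomial of $f(x)=x^2$ takes the value 5/16 at $x=1/2$, whereas $f(1/2)=1/4$.

```python
from fractions import Fraction
from math import comb

def bernstein_eval(coeffs, x):
    """Evaluate the polynomial with Bernstein coefficients `coeffs` at x."""
    n = len(coeffs) - 1
    return sum(a * comb(n, k) * x**k * (1 - x)**(n - k)
               for k, a in enumerate(coeffs))

def bernstein_coeffs(f, n):
    """Bernstein coefficients a_k = f(k/n) of the degree-n Bernstein polynomial of f."""
    return [f(Fraction(k, n)) for k in range(n + 1)]

f = lambda x: x * x
coeffs = bernstein_coeffs(f, 4)                  # [0, 1/16, 1/4, 9/16, 1]
value = bernstein_eval(coeffs, Fraction(1, 2))   # 5/16, not f(1/2) = 1/4
print(coeffs, value)
```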

Main Question

Suppose $f:[0,1]\to [0,1]$ is continuous and belongs to a large class of functions (for example, the $k$-th derivative, $k\ge 0$, is continuous, Lipschitz continuous, concave, strictly increasing, or of bounded variation, or $f$ is real analytic).

(Exact Bernoulli factory): Compute the Bernstein coefficients of a sequence of polynomials ($g_n$) of degree 2, 4, 8, ..., $2^i$, ... that converge to $f$ from below and satisfy: $(g_{2n}-g_{n})$ is a polynomial with nonnegative Bernstein coefficients once it's rewritten to a polynomial in Bernstein form of degree exactly $2n$. Assume $0\lt f(\lambda)\lt 1$ or a piecewise polynomial can come between $f$ or $1-f$ and the x-axis.
(Approximate Bernoulli factory): Given $\epsilon > 0$, compute the Bernstein coefficients of a polynomial or rational function (of some degree $n$) that is within $\epsilon$ of $f$.²
(Series expansion of simple functions): Find a nonnegative random variable $X$ and a series $f(\lambda)=\sum_{a\ge 0}\gamma_a(\lambda)$ such that $\gamma_a(\lambda)/\mathbb{P}(X=a)$ (letting 0/0 equal 0) is a polynomial or rational function with rational Bernstein coefficients lying in $[0, 1]$.³ Do the same for a function that is within $\epsilon$ of $f$, rather than $f$.

The convergence rate must be $O(1/n^{r/2})$ if the class has only functions with Lipschitz-continuous $(r-1)$-th derivative. (Emphasis is given to cases where $r\ge 3$.) The method may not introduce transcendental or trigonometric functions (as with Chebyshev interpolants).
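For the approximate case, the simplest baseline is the ordinary Bernstein polynomial, which converges only at the rate $O(1/n)$ and so falls short of the rate asked for above. The sketch below assumes $f$ has a continuous second derivative with a known bound `max_abs_f2` on its absolute value, and relies on the classical bound $\text{abs}(f-B_n(f))\le \max\text{abs}(f'')/(8n)$. All function names and the example function are hypothetical. It also shows the well-known way to simulate a polynomial whose Bernstein coefficients lie in $[0, 1]$: flip the coin $n$ times, count the heads $k$, and return 1 with probability $a_k$.

```python
import math
import random

def approx_bernstein_coeffs(f, max_abs_f2, eps):
    """Coefficients f(k/n), with n chosen so that max|f''|/(8n) <= eps."""
    n = max(1, math.ceil(max_abs_f2 / (8 * eps)))
    return [f(k / n) for k in range(n + 1)]

def bernstein_eval(coeffs, x):
    n = len(coeffs) - 1
    return sum(a * math.comb(n, k) * x**k * (1 - x)**(n - k)
               for k, a in enumerate(coeffs))

def simulate_polynomial(coeffs, coin, rng):
    """Flip `coin` n times; with k heads, return 1 with probability coeffs[k]."""
    n = len(coeffs) - 1
    heads = sum(coin() for _ in range(n))
    return 1 if rng.random() < coeffs[heads] else 0

# Example function (assumption): f'' = -1 everywhere, so max|f''| = 1.
f = lambda x: 0.25 + 0.5 * x * (1 - x)
eps = 0.01
coeffs = approx_bernstein_coeffs(f, 1.0, eps)
max_err = max(abs(f(i / 100) - bernstein_eval(coeffs, i / 100))
              for i in range(101))

rng = random.Random(2024)
lam = 0.3  # used only to build a stand-in input coin
coin = lambda: 1 if rng.random() < lam else 0
trials = 50_000
freq = sum(simulate_polynomial(coeffs, coin, rng) for _ in range(trials)) / trials
print(len(coeffs) - 1, max_err, freq, f(lam))
```

The degree grows like $1/\epsilon$ here, which is why faster-converging alternatives with explicit error bounds are of interest.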

Solving the Bernoulli factory problem with polynomials

An algorithm (Łatuszyński et al. 2009/2011)⁴ simulates a function $f(\lambda)$ that admits a Bernoulli factory via two sequences of polynomials that converge from above and below to that function. Roughly speaking, the algorithm works as follows:

1. Generate U, a uniform random variate in $[0, 1]$.
2. Flip the input coin (with a probability of heads of $\lambda$), then build an upper and lower bound for $f(\lambda)$, based on the outcomes of the flips so far. In this case, these bounds come from two degree-$n$ polynomials that approach $f$ as $n$ gets large, where $n$ is the number of coin flips so far in the algorithm.
3. If U is less than or equal to the lower bound, return 1. If U is greater than the upper bound, return 0. Otherwise, go to step 2.

The result of the algorithm is 1 with probability exactly equal to $f(\lambda)$, or 0 otherwise.

However, the algorithm requires the polynomial sequences to meet certain requirements, one of which is:

For $f(\lambda)$ there must be a sequence of polynomials ($g_n$) in Bernstein form of degree 1, 2, 3, ... that converge to $f$ from below and satisfy: $(g_{n+1}-g_{n})$ is a polynomial with nonnegative Bernstein coefficients once it's rewritten to a polynomial in Bernstein form of degree exactly $n+1$ (Nacu and Peres (2005)⁵; Holtz et al. (2011)⁶).⁷ For $1-f(\lambda)$ there must likewise be a sequence of this kind.
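The requirement above can be checked mechanically for a given pair of polynomials. The following sketch (hypothetical function names) rewrites a degree-$n$ polynomial in Bernstein form of degree $n+1$ using the standard degree-elevation formula, then tests whether the difference has nonnegative Bernstein coefficients. As a demonstration, it uses the ordinary Bernstein polynomials of the concave function $x(1-x)$, which happen to satisfy the requirement; this is an illustrative choice, not a construction from the cited papers.

```python
from fractions import Fraction

def elevate(coeffs):
    """Rewrite degree-n Bernstein coefficients as degree-(n+1) coefficients."""
    n = len(coeffs) - 1
    m = n + 1
    out = []
    for k in range(m + 1):
        left = coeffs[k - 1] if k >= 1 else 0
        right = coeffs[k] if k <= n else 0
        out.append(Fraction(k, m) * left + Fraction(m - k, m) * right)
    return out

def increment_is_nonnegative(g_n, g_next):
    """True if g_next - g_n has nonnegative Bernstein coefficients
    in Bernstein form of degree len(g_next) - 1."""
    raised = elevate(g_n)
    return all(b - a >= 0 for a, b in zip(raised, g_next))

def bernstein_coeffs(f, n):
    return [f(Fraction(k, n)) for k in range(n + 1)]

f = lambda x: x * (1 - x)   # concave example function (assumption)
ok = all(increment_is_nonnegative(bernstein_coeffs(f, n),
                                  bernstein_coeffs(f, n + 1))
         for n in range(1, 20))
print(ok)
```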

A Matter of Efficiency

However, the error of ordinary Bernstein polynomials shrinks no faster than the rate $\Omega(1/n)$ unless the function is linear, a result known since Voronovskaya (1932)⁸. This rate will lead to an infinite expected number of coin flips in general. (See also my supplemental notes.)
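As a worked instance of this rate (a standard identity, given here as an illustration rather than taken from the references), the degree-$n$ Bernstein polynomial of $f(\lambda)=\lambda^2$ is

$$B_n(f)(\lambda) = \lambda^2 + \frac{\lambda(1-\lambda)}{n},$$

so the error at $\lambda=1/2$ is exactly $1/(4n)$, no matter how smooth $f$ is.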

But Lorentz (1966)⁹ showed that if the function is positive and has a continuous $k$-th derivative, there are polynomials with nonnegative Bernstein coefficients that converge at the rate $O(1/n^{k/2})$ (and thus can enable a finite expected number of coin flips if the function is "smooth" enough; for example, if the function's second or higher-order derivative is Lipschitz continuous).

Thus, researchers have studied alternatives to Bernstein polynomials that improve the convergence rate for "smoother" functions. See Holtz et al. (2011)⁶, Sevy (1991)¹⁰, Waldron (2009)¹¹, Costabile et al. (1996)¹², Han (2003)¹³, Khosravian-Arab et al. (2018)¹⁴, and references therein; see also Micchelli (1973)¹⁵, Güntürk and Li (2021a)¹⁶, (2021b)¹⁷, Draganov (2024)¹⁸, and Tachev (2022)¹⁹.

These alternative polynomials usually come with results where the error bound is the desired $O(1/n^{k/2})$, but most of those results (with the notable exception of Sevy) have hidden constants with no upper bounds given, making them unimplementable (that is, it can't be known beforehand whether a given polynomial will come close to the target function within a user-specified error tolerance).

A Conjecture on Polynomial Approximation

The following is a conjecture that could help reduce this problem to the problem of finding explicit error bounds when approximating a function by polynomials.

Let $f(\lambda):[0,1]\to(0,1)$ have a continuous $r$-th derivative, where $r\ge 1$, let $M$ be the maximum of the absolute value of $f$ and its derivatives up to the $r$-th derivative, and denote the Bernstein polynomial of degree $n$ of a function $g$ as $B_n(g)$. Let $W_{2^0}(g; \lambda), W_{2^1}(g; \lambda), ..., W_{2^i}(g; \lambda),...$ be a sequence of operators that map a continuous function $g$ on $[0, 1]$ to a bounded function on $[0, 1]$ and converge uniformly to $f$.

Prove or disprove the following statement. For each integer $n\ge 1$ that's a power of 2, suppose that there is $D>0$ such that—

$$\text{abs}(f(\lambda)-B_n(W_n(f; \lambda); \lambda)) \le DM/n^{r/2},$$

whenever $0\le \lambda\le 1$. Then there is $C_0\ge D$ such that the polynomials $(g_n)$ in Bernstein form of degree 2, 4, 8, ..., $2^i$, ..., defined as $g_n=B_n(W_n(f; \lambda); \lambda) - C_0 M/n^{r/2}$, converge from below to $f$ and satisfy: $(g_{2n}-g_{n})$ is a polynomial with nonnegative Bernstein coefficients once it's rewritten to a polynomial in Bernstein form of degree exactly $2n$. Equivalently (see also Nacu and Peres (2005)⁵), there is $C_1>0$ such that the inequality—

$$W_{2n}\left(f; \frac{k}{2n}\right) - \sum_{i=0}^n W_n\left(f; \frac{i}{n}\right)\sigma_{n,k,i}\ge -C_1 M/n^{r/2},\tag{PB}$$

holds true for each integer $n\ge 1$ that's a power of 2 and whenever $0\le k\le 2n$, where $\sigma_{n,k,i} = {n\choose i}{n\choose {k-i}}/{2n \choose k}=\mathbb{P}(X_k=i)$ and $X_k$ is a hypergeometric($2n$, $k$, $n$) random variable. $C_0$ or $C_1$ may depend on $r$ and the sequence $(W_n)$, but not on $f$, $\lambda$, or $n$. When $C_0$ or $C_1$ exists, find a good upper bound for it.
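For small $n$, the left-hand side of $(PB)$ can be evaluated exactly. The sketch below is only an illustration: the function names are hypothetical, and it assumes the simplest case $W_n(f)=f$ for every $n$ (so that $B_n(W_n(f))$ is the ordinary Bernstein polynomial) together with the example function $f(\lambda)=1/4+\lambda^2/2$. In this case the smallest left-hand side over $k$ works out to $-1/(8(2n-1))$, which is consistent with a bound of the form $-C_1 M/n^{r/2}$ with $r=2$.

```python
from fractions import Fraction
from math import comb

def sigma(n, k, i):
    """P(X_k = i) for a hypergeometric(2n, k, n) random variable X_k."""
    if i < 0 or i > n or k - i < 0 or k - i > n:
        return Fraction(0)
    return Fraction(comb(n, i) * comb(n, k - i), comb(2 * n, k))

def pb_lhs(w_n, w_2n, n, k):
    """Left-hand side of (PB): W_2n(f; k/(2n)) - sum_i W_n(f; i/n) sigma(n,k,i)."""
    expected = sum(w_n(Fraction(i, n)) * sigma(n, k, i) for i in range(n + 1))
    return w_2n(Fraction(k, 2 * n)) - expected

def min_pb_lhs(w_n, w_2n, n):
    """Smallest left-hand side of (PB) over 0 <= k <= 2n."""
    return min(pb_lhs(w_n, w_2n, n, k) for k in range(2 * n + 1))

f = lambda x: Fraction(1, 4) + x * x / 2   # example function (assumption)
gaps = {n: min_pb_lhs(f, f, n) for n in (1, 2, 4, 8)}
for n, gap in gaps.items():
    print(n, gap, gap * n)   # gap * n stays bounded as n grows
```

For another choice of $(W_n)$, pass the functions $W_n(f;\cdot)$ and $W_{2n}(f;\cdot)$ as `w_n` and `w_2n`.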

Note: This conjecture may be easy to prove if $W_n$ reproduces polynomials of degree $(r-1)$ or less. But there are $B_n(W_n)$ (notably the iterated Boolean sum of Bernstein polynomials) that don't do so and yet converge at the rate $O(n^{-r/2})$ for some $r\gt 2$. Also, see notes 3 and 4 in "End Notes".

Note: I believe there is a counterexample to this conjecture, namely the sequence $B_n(W_n(f; \lambda); \lambda)=\frac{(T_n(1-2\lambda)+1)\varphi_n}{2 \mu_n} + 1/2$, where $\varphi_n$ is a decreasing sequence of positive numbers that tends slowly enough to 0, $\mu_n$ is the maximum Bernstein coefficient (in absolute value) of the degree-$n$ polynomial $(T_n(1-2\lambda)+1)/2$, and $T_n(x)$ is the Chebyshev polynomial of the first kind of degree $n$. $W_n$ then maps to a piecewise linear function that connects the Bernstein coefficients of $B_n(W_n(\lambda))$, so that $(W_n)$ is a sequence of operators that converges at an arbitrarily slow rate (depending on $\varphi_n$) to the constant 1/2. $B_n(W_n(\lambda))$ converges uniformly, at an exponential rate, to $f(\lambda)=1/2$, so that $M = 1/2$. If this counterexample is valid, the conjecture may still be true with an additional assumption on the convergence rate of $W_n$, say, $O(1/n)$ or $O(1/n^{r/2})$ or $O(1/n^{(r-1)/2})$.

Note: If $W_n(f)$ is a linear operator, the left-hand side of $(PB)$ can be treated as a linear operator, too, after a change of variables from $k$ to $2n\lambda$, with $0\le\lambda\le 1$. Call the new operator $L_n(f)$. The goal is then to find an upper bound for $L_n$ that is $O(1/n^{r/2})$. Based on the proof techniques in several academic works ((Acu et al. 2018, theorem 2.7)²⁰, (Khosravian-Arab et al. 2018, theorem 15)²¹, (Ditzian and Totik 1987)²²), the following can be shown. Let $L(g)$ be a linear operator that maps a continuous function $g(x)$ on $[0, 1]$ to a continuous function on that interval. Let $f(x)$ be a function with a continuous $r$-th derivative on $[0, 1]$, where $r$ is a positive integer. Let $t$ and $x$ be numbers in that interval. Then $L(f)$ can be written as:

$$L(f)(x) = L(1)(x)\cdot f(x) + \left(\sum_{i=1}^r L((t-x)^i)(x)\frac{f^{(i)}(x)}{i!}\right)$$

$$+ \frac{1}{r!} L((t-x)^r(f^{(r)}(x_1)-f^{(r)}(x)))(x),$$

for some number $x_1$ between $t$ and $x$. In this equation, the last term is the result of applying $L$ to the Taylor remainder of $f$ at $t$. Thus, to find upper bounds for $L(f)$, it's enough to find upper bounds for—

- $L((t-x)^i)$ (the so-called central moments) for $i$ from 0 through $r$, and
- $L((t-x)^r(f^{(r)}(x_1)-f^{(r)}(x)))$, which is harder to find, especially if $L$ is not a positive linear operator. (An example of a nonpositive linear operator is $L(f)=2f-B_n(f)$, where $B_n(f)$ is the degree-$n$ Bernstein polynomial of $f$.)
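To make the first item concrete, the following sketch computes the central moments exactly for the case where $L$ is the Bernstein operator $B_n$ (an assumption chosen for illustration; the function name is hypothetical). For this operator the moments of order 0, 1, and 2 are 1, 0, and $x(1-x)/n$, respectively.

```python
from fractions import Fraction
from math import comb

def bernstein_central_moment(n, i, x):
    """B_n((t - x)^i)(x): apply the degree-n Bernstein operator to the
    function t -> (t - x)^i and evaluate the result at x."""
    return sum((Fraction(k, n) - x)**i * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

n = 5
x = Fraction(1, 3)
moments = [bernstein_central_moment(n, i, x) for i in range(3)]
print(moments)   # [1, 0, x*(1-x)/n]
```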

Strategies

The following are some strategies for answering these questions:

- Verify my proofs for the results on error bounds for certain polynomials in "Results Used in Approximations By Polynomials", including:
  - Iterated Boolean sums (linear combinations of iterates) of Bernstein polynomials ($B_n(W_n) = f-(f-B_n(f))^k$):²³ Propositions B10C and B10D.
  - Linear combinations of Bernstein polynomials (see Costabile et al. (1996)¹²): Proposition B10.
  - The Lorentz operator (Holtz et al. 2011)⁶.
- Find the hidden constants $\theta_\alpha$, $s$, and $D$ as well as those in Lemmas 15, 17 to 22, 24, and 25 in Holtz et al. (2011)⁶.
- Find operators or functions of the following kinds and find explicit bounds, with no hidden constants, on the approximation error:
  - Operators that produce a degree-$n$ polynomial in Bernstein form, or a ratio of two such polynomials, such that:
    - the operator preserves polynomials at a higher degree than linear functions, or
    - $O(n^2)$ sample points are required.
  - Operators that produce polynomials from samples at rational values of a function $f$ that cluster at a quadratic rate toward the endpoints (Adcock et al. 2019)²⁴ (for example, values that converge to Chebyshev points $\cos(j\pi/n)$ with increasing $n$, or to Legendre points). See also chapters 7, 8, and 12 of Trefethen, Approximation Theory and Approximation Practice, 2013.
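As an illustration of the iterated Boolean sum mentioned in the strategies above, the following sketch builds the case $k=2$, where $W_n(f)=2f-B_n(f)$, and compares it with the ordinary Bernstein polynomial for the example function $f(x)=x^2$. The function names are hypothetical and the example carries no explicit error bound of the kind these questions ask for; it only shows the construction. For this $f$, the error at $x=1/2$ drops from $1/(4n)$ to $1/(4n^2)$.

```python
from fractions import Fraction
from math import comb

def bernstein_eval(coeffs, x):
    n = len(coeffs) - 1
    return sum(a * comb(n, k) * x**k * (1 - x)**(n - k)
               for k, a in enumerate(coeffs))

def bernstein(f, n):
    """Return the degree-n Bernstein polynomial of f as a function."""
    coeffs = [f(Fraction(k, n)) for k in range(n + 1)]
    return lambda x: bernstein_eval(coeffs, x)

def boolean_sum_2(f, n):
    """B_n(W_n(f)) with W_n(f) = 2f - B_n(f), i.e. f - (f - B_n(f))^2
    where ^2 means two-fold nesting."""
    b = bernstein(f, n)
    w = lambda x: 2 * f(x) - b(x)
    return bernstein(w, n)

f = lambda x: x * x
n = 10
x = Fraction(1, 2)
plain_error = bernstein(f, n)(x) - f(x)        # 1/40
boolean_error = boolean_sum_2(f, n)(x) - f(x)  # 1/400
print(plain_error, boolean_error)
```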

Notes

1. See also the following questions on Mathematics Stack Exchange and MathOverflow: Converging polynomials, Error bounds, A conjecture, Lorentz operators, Series representations.
2. As discussed later, the Bernstein polynomials solve this question for functions no "smoother" than arbitrary functions in $C^2[0, 1]$, but their convergence rate is $O(1/n)$ in general. For these functions, there may be rational-function approximations or more practical methods with a faster convergence rate. By contrast, results for functions "smoother" than $C^2$ are hard to find.
3. An example of $X$ is $\mathbb{P}(X=a) = p (1-p)^a$ where $0 < p < 1$ is a known rational. This question's requirements imply that $\sum_{a\ge 0}\max_\lambda \text{abs}(\gamma_a(\lambda)) \le 1$. The proof of Keane and O'Brien (1994) produces a convex combination of polynomials with 0 and 1 as Bernstein coefficients, but the combination is difficult to construct (it requires finding maximums, for example) and so this proof does not appropriately answer this question.
4. Łatuszyński, K., Kosmidis, I., Papaspiliopoulos, O., Roberts, G.O., "Simulating events of unknown probabilities via reverse time martingales", arXiv:0907.4018v2 [stat.CO], 2009/2011.
5. Nacu, Şerban, and Yuval Peres. "Fast simulation of new coins from old", The Annals of Applied Probability 15, no. 1A (2005): 93-115.
6. Holtz, O., Nazarov, F., Peres, Y., "New Coins from Old, Smoothly", Constructive Approximation 33 (2011).
7. The condition on nonnegative Bernstein coefficients ensures that not only the polynomials "increase" to $f(\lambda)$, but also their Bernstein coefficients. This condition is equivalent in practice to the following statement (Nacu & Peres 2005). For every integer $n\ge 1$ that's a power of 2, $a(2n, k)\ge\mathbb{E}[a(n, X_{n,k})]= \left(\sum_{i=0}^k a(n,i) {n\choose i}{n\choose {k-i}}/{2n\choose k}\right)$, where $a(n,k)$ is the degree-$n$ polynomial's $k$-th Bernstein coefficient, where $0\le k\le 2n$ is an integer, and where $X_{n,k}$ is a hypergeometric($2n$, $k$, $n$) random variable. A hypergeometric($2n$, $k$, $n$) random variable is the number of "good" balls out of $k$ balls taken uniformly at random, all at once, from a bag containing $2n$ balls, $n$ of which are "good".
8. E. Voronovskaya, "Détermination de la forme asymptotique d'approximation des fonctions par les polynômes de M. Bernstein", 1932.
9. G.G. Lorentz, "The degree of approximation by polynomials with positive coefficients", 1966.
10. Sevy, J., "Acceleration of convergence of sequences of simultaneous approximants", dissertation, Drexel University, 1991.
11. Waldron, S., "Increasing the polynomial reproduction of a quasi-interpolation operator", Journal of Approximation Theory 161 (2009).
12. Costabile, F., Gualtieri, M.I., Serra, S., "Asymptotic expansion and extrapolation for Bernstein polynomials with applications", BIT 36 (1996).
13. Han, Xuli. "Multi-node higher order expansions of a function." Journal of Approximation Theory 124.2 (2003): 242-253. https://doi.org/10.1016/j.jat.2003.08.001
14. Khosravian-Arab, Hassan, Mehdi Dehghan, and M. R. Eslahchi. "A new approach to improve the order of approximation of the Bernstein operators: theory and applications." Numerical Algorithms 77 (2018): 111-150.
15. Micchelli, Charles. "The saturation class and iterates of the Bernstein polynomials", Journal of Approximation Theory 8, no. 1 (1973): 1-18.
16. Güntürk, C. Sinan, and Weilin Li. "Approximation with one-bit polynomials in Bernstein form", arXiv:2112.09183 (2021); Constructive Approximation, pp.1-30 (2022).
17. Güntürk, C. Sinan, and Weilin Li. "Approximation of functions with one-bit neural networks", arXiv:2112.09181 (2021).
18. Draganov, B.R., "Simultaneous approximation by the Bernstein operator", dissertation, Sofia University "St. Kliment Ohridski", 2024.
19. Tachev, Gancho. "Linear combinations of two Bernstein polynomials", Mathematical Foundations of Computing, 2022.
20. Acu, A.-M., Gupta, V., et al., "Better numerical approximation by Durrmeyer-type operators", arXiv:1810.06829 [math.NA]
21. Khosravian-Arab, H., Dehghan, M. & Eslahchi, M.R. A new approach to improve the order of approximation of the Bernstein operators: theory and applications. Numerical Algorithms 77, 111–150 (2018). https://doi.org/10.1007/s11075-017-0307-z
22. Ditzian, Z., Totik, V., Moduli of Smoothness, Springer, 1987.
23. If $W_n(f; 0)=f(0)$ and $W_n(f; 1)=f(1)$ for every $n$, then the inequality $(PB)$ is automatically true when $k=0$ and $k=2n$, so that the statement has to be checked only for $0\lt k\lt 2n$. If, in addition, $W_n$ is symmetric about 1/2, so that $W_n(f; \lambda)=W_n(f; 1-\lambda)$ whenever $0\le \lambda\le 1$, then the statement has to be checked only for $0\lt k\le n$ (since the values $\sigma_{n,k,i} = {n\choose i}{n\choose {k-i}}/{2n \choose k}$ are symmetric in that they satisfy $\sigma_{n,k,i}=\sigma_{n,k,k-i}$).

    Special cases for this question are if $W_n = 2 f - B_n(f)$ and $r$ is 3 or 4, or $W_n = B_n(B_n(f))+3(f-B_n(f))$ and $r$ is 5 or 6; these cases correspond to the iterated Boolean sum of Bernstein polynomials: $B_n(W_n)=f-(f-B_n(f))^k$ (where the $^k$ indicates $k$-fold nesting), which don't reproduce polynomials of higher degree than linear functions, making it hard to find a bound better than $O(1/n)$ that satisfies the conjecture when $r\ge 3$.
24. Adcock, B., Platte, R.B., Shadrin, A., "Optimal sampling rates for approximating analytic functions from pointwise samples", IMA Journal of Numerical Analysis 39(3), July 2019.
25. Keane, M. S., and O'Brien, G. L., "A Bernoulli factory", ACM Transactions on Modeling and Computer Simulation 4(2), 1994.
26. Mossel, Elchanan, and Yuval Peres. New coins from old: computing with unknown bias. Combinatorica, 25(6), pp.707-724, 2005.
27. On pushdown automata: Etessami and Yannakakis ("Recursive Markov chains, stochastic grammars, and monotone systems of nonlinear equations", Journal of the ACM 56(1), pp.1-66, 2009) showed that pushdown automata with rational probabilities are equivalent to recursive Markov chains (with rational transition probabilities), and that for every recursive Markov chain, the system of polynomial equations has nonnegative coefficients. But this paper doesn't deal with the case of recursive Markov chains where the transition probabilities cannot just be rational, but can also be $\lambda$ and $1-\lambda$ where $\lambda$ is an unknown rational or irrational probability of heads. Also, Banderier and Drmota ("Formulae and asymptotics for coefficients of algebraic functions", Combinatorics, Probability and Computing 24(1), pp.1-53., 2014) showed the asymptotic behavior of power series solutions $f(\lambda)$ of a polynomial system, where both the series and the system have nonnegative real coefficients. Notably, functions of the form $\lambda^{1/p}$ where $p\ge 3$ is not a power of 2, are not possible solutions, because their so-called "critical exponent" is not dyadic. But the result seems not to apply to piecewise power series such as $\min(\lambda,1-\lambda)$, which are likewise algebraic functions.
28. Wästlund, J., "Functions arising by coin flipping", 1999.
