Error Models for Boolean Function Analysis¶
Reference: O'Donnell, Chapters 7-8 (Learning, Noise Stability)
Real-world analysis of Boolean functions involves uncertainty. Fourier coefficients estimated from samples have finite-sample error, noise corrupts evaluations, and PAC learning introduces controlled approximation.
BooFun provides four error models that make these trade-offs explicit:
| Model | Purpose |
|---|---|
| ExactErrorModel | No uncertainty: exact computation (default) |
| PACErrorModel | Probably Approximately Correct bounds ($\varepsilon, \delta$) |
| NoiseErrorModel | Bit-flip noise at a given rate |
| LinearErrorModel | Automatic error propagation via the uncertainties library |
This notebook demonstrates the first three models in code and shows how noise affects Fourier-analytic quantities.
# Setup
!pip install --upgrade boofun -q
import numpy as np
import matplotlib.pyplot as plt
import boofun as bf
from boofun.core.errormodels import (
ExactErrorModel,
PACErrorModel,
NoiseErrorModel,
create_error_model,
)
print(f"boofun version: {bf.__version__}")
1. Exact Model — the Baseline¶
By default BooFun uses exact computation. The ExactErrorModel passes
results through unchanged and always reports full confidence.
exact = ExactErrorModel()
f = bf.majority(5)
fourier = f.fourier()
# The exact model is a no-op wrapper
result = exact.apply_error(0.42)
print(f"apply_error(0.42) = {result}")
print(f"confidence = {exact.get_confidence(result)}")
print(f"reliable = {exact.is_reliable(result)}")
# Show majority(5) Fourier spectrum as ground truth
print(f"\nMajority(5) top Fourier coefficients:")
for s in range(len(fourier)):
if abs(fourier[s]) > 0.01:
bits = format(s, f'05b')
print(f" f_hat({bits}) = {fourier[s]:.4f}")
2. PAC Error Model¶
The PAC model wraps results with $(\varepsilon, \delta)$ guarantees: with probability $\geq 1 - \delta$, the reported value is within $\varepsilon$ of the true answer.
This is the natural model for learning algorithms (Goldreich-Levin, junta learning, etc.) where you estimate Fourier coefficients from random samples.
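How many samples back up a given $(\varepsilon, \delta)$ guarantee? By Hoeffding's inequality, $m \geq \ln(2/\delta)/(2\varepsilon^2)$ samples suffice to estimate the mean of a $[0, 1]$-valued quantity. The helper below is hypothetical (not part of BooFun) but makes the trade-off concrete:

```python
import math

def pac_sample_size(epsilon: float, delta: float) -> int:
    """Samples sufficient (by Hoeffding) to estimate the mean of a
    [0, 1]-valued quantity to within epsilon, except with probability delta."""
    return math.ceil(math.log(2 / delta) / (2 * epsilon ** 2))

# Tighter guarantees cost quadratically in 1/epsilon, only logarithmically in 1/delta
print(pac_sample_size(0.1, 0.2))    # 116
print(pac_sample_size(0.01, 0.05))  # 18445
```

Note the asymmetry: shrinking $\varepsilon$ tenfold multiplies the sample count by a hundred, while shrinking $\delta$ tenfold adds only a modest logarithmic factor.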
pac_tight = PACErrorModel(epsilon=0.01, delta=0.05)
pac_loose = PACErrorModel(epsilon=0.1, delta=0.2)
true_value = 0.5  # E[majority(5)] over uniform {0,1} inputs (exactly 1/2 by symmetry)
for name, model in [("tight (eps=0.01, delta=0.05)", pac_tight),
("loose (eps=0.1, delta=0.2)", pac_loose)]:
result = model.apply_error(true_value)
print(f"{name}:")
print(f" value = {result['value']}")
print(f" bounds = [{result['lower_bound']:.3f}, {result['upper_bound']:.3f}]")
print(f" confidence = {result['confidence']:.0%}")
print(f" reliable? = {model.is_reliable(true_value)}")
print()
Combining PAC bounds¶
When two PAC estimates are combined (e.g. adding influence estimates), the union bound controls the combined failure probability.
m1 = PACErrorModel(epsilon=0.05, delta=0.05)
m2 = PACErrorModel(epsilon=0.05, delta=0.05)
combined = m1.combine_pac_bounds(m1, m2, "addition")
print(f"Individual: eps={m1.epsilon}, delta={m1.delta}")
print(f"Combined: eps={combined.epsilon}, delta={combined.delta}")
print(f"Confidence: {combined.confidence:.0%}")
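The arithmetic behind this combination is just the union bound: worst-case errors add, and $\Pr[\text{either estimate fails}] \leq \delta_1 + \delta_2$. A stand-alone sketch of what combine_pac_bounds presumably computes for addition (an assumption about its internals):

```python
def union_bound_combine(eps1, delta1, eps2, delta2):
    # Worst-case errors add; by the union bound, so do failure probabilities:
    # P(|sum - true sum| > eps1 + eps2) <= delta1 + delta2.
    return eps1 + eps2, delta1 + delta2

eps, delta = union_bound_combine(0.05, 0.05, 0.05, 0.05)
print(eps, delta)  # 0.1 0.1
```

Combining many estimates this way inflates both parameters, which is why learning algorithms that estimate many coefficients at once must budget $\delta$ across all of them up front.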
3. Noise Model — Bit-Flip Errors¶
The noise model simulates evaluation noise at a specified bit-flip rate $\eta$.
This connects directly to noise stability (O'Donnell Ch. 5):
$$\text{Stab}_\rho[f] = \sum_S \rho^{|S|} \hat{f}(S)^2$$
where $\rho = 1 - 2\eta$. Functions with high noise stability are robust; functions like parity are fragile.
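The stability formula can be checked from scratch with plain numpy, independent of BooFun (the helper names here are illustrative):

```python
import numpy as np

def fourier_coeffs(f_vals, n):
    # f_vals: a +/-1-valued truth table indexed by bitmask x.
    # f_hat(S) = E_x[f(x) * chi_S(x)], with chi_S(x) = (-1)^{|S & x|}.
    return np.array([
        np.mean(f_vals * np.array([(-1) ** bin(S & x).count('1')
                                   for x in range(2 ** n)]))
        for S in range(2 ** n)
    ])

def noise_stability(f_vals, n, rho):
    # Stab_rho[f] = sum_S rho^{|S|} * f_hat(S)^2
    coeffs = fourier_coeffs(f_vals, n)
    return sum(rho ** bin(S).count('1') * coeffs[S] ** 2
               for S in range(2 ** n))

# Parity of 3 bits: all Fourier weight sits at |S| = 3, so Stab_rho = rho^3
parity3 = np.array([(-1) ** bin(x).count('1') for x in range(8)], dtype=float)
print(round(noise_stability(parity3, 3, 0.9), 6))  # 0.729
```

For a $\pm 1$-valued $f$, $\text{Stab}_\rho[f] = \mathbf{E}[f(x)f(y)]$ over $\rho$-correlated pairs, so the probability that $f$ agrees on a correlated pair is $\tfrac{1}{2} + \tfrac{1}{2}\text{Stab}_\rho[f]$.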
# Compare noise impact on majority vs parity
f_maj = bf.majority(5)
f_par = bf.parity(5)
n = 5
N = 2 ** n
noise_rates = [0.0, 0.01, 0.05, 0.1, 0.2, 0.3]
results = {"majority": [], "parity": []}
np.random.seed(42)
n_trials = 500
for eta in noise_rates:
    noisy = NoiseErrorModel(noise_rate=eta, random_seed=42)
    for name, func in [("majority", f_maj), ("parity", f_par)]:
        correct = 0
        for _ in range(n_trials):
            x = np.random.randint(0, N)
            # Flip each *input* bit with probability eta via the noise model
            # (apply_error flips a Boolean with probability noise_rate).
            # Flipping only the output bit would degrade every function
            # equally; input noise is what noise stability measures.
            bits = [bool((x >> i) & 1) for i in range(n)]
            noisy_x = sum(int(bool(noisy.apply_error(bits[i]))) << i
                          for i in range(n))
            if bool(func.evaluate(noisy_x)) == bool(func.evaluate(x)):
                correct += 1
        results[name].append(correct / n_trials)
plt.figure(figsize=(8, 4))
plt.plot(noise_rates, results["majority"], 'bo-', label="Majority(5) — robust")
plt.plot(noise_rates, results["parity"], 'rs--', label="Parity(5) — fragile")
plt.xlabel("Noise rate (η)")
plt.ylabel("Fraction correct")
plt.title("Noise Robustness: Majority vs Parity")
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Compare with analytical noise stability
print("\nAnalytical noise stability (rho = 1 - 2*eta):")
for eta in [0.01, 0.1, 0.3]:
rho = 1 - 2 * eta
stab_maj = f_maj.noise_stability(rho)
stab_par = f_par.noise_stability(rho)
print(f" eta={eta:.2f}: Stab_maj={stab_maj:.4f}, Stab_par={stab_par:.4f}")
4. How Noise Affects Fourier Coefficients¶
Under the noise operator $T_\rho$, Fourier coefficients are damped:
$$\widehat{T_\rho f}(S) = \rho^{|S|} \hat{f}(S)$$
High-degree coefficients vanish exponentially. Let's visualise this for majority(5).
f = bf.majority(5)
fourier = f.fourier()
n = 5
rho_values = [1.0, 0.9, 0.7, 0.5, 0.3]
fig, axes = plt.subplots(1, len(rho_values), figsize=(16, 3), sharey=True)
for ax, rho in zip(axes, rho_values):
# Apply noise operator: dampen each coefficient by rho^|S|
damped = np.array([
fourier[s] * rho ** bin(s).count('1')
for s in range(len(fourier))
])
ax.bar(range(len(damped)), damped, color='steelblue', alpha=0.7)
ax.set_title(f"ρ = {rho}")
ax.set_xlabel("Subset S")
if ax == axes[0]:
ax.set_ylabel("Damped coeff")
fig.suptitle("Noise Operator T_ρ on Majority(5)", fontsize=13)
plt.tight_layout()
plt.show()
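The damping identity itself can be verified numerically without BooFun: compute $T_\rho f$ pointwise as an exact expectation over $\rho$-correlated inputs, then compare its spectrum to the damped coefficients. A self-contained numpy sketch:

```python
import numpy as np

n = 3
rho = 0.7

# Majority(3) as a +/-1 truth table (any Boolean function would do)
f_vals = np.array([1.0 if bin(x).count('1') >= 2 else -1.0 for x in range(2 ** n)])

def fourier(vals):
    # vals -> f_hat(S) = E_x[vals[x] * (-1)^{|S & x|}]
    return np.array([
        np.mean(vals * np.array([(-1) ** bin(S & x).count('1')
                                 for x in range(2 ** n)]))
        for S in range(2 ** n)
    ])

# Route 1: damp each Fourier coefficient by rho^|S|
damped = np.array([rho ** bin(S).count('1') for S in range(2 ** n)]) * fourier(f_vals)

# Route 2: apply T_rho pointwise as an exact expectation over rho-correlated
# inputs (each bit of y agrees with x independently with probability (1 + rho)/2),
# then take the Fourier transform of the smoothed function
T_f = np.zeros(2 ** n)
for x in range(2 ** n):
    for y in range(2 ** n):
        agree = n - bin(x ^ y).count('1')
        T_f[x] += ((1 + rho) / 2) ** agree * ((1 - rho) / 2) ** (n - agree) * f_vals[y]

print(np.allclose(fourier(T_f), damped))  # True
```

Both routes land on the same spectrum, which is exactly the statement $\widehat{T_\rho f}(S) = \rho^{|S|}\hat{f}(S)$.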
5. Choosing an Error Model¶
| Scenario | Recommended Model |
|---|---|
| Exact analysis of small functions (n ≤ 15) | ExactErrorModel (default) |
| Learning from random samples | PACErrorModel |
| Noisy oracle / measurement errors | NoiseErrorModel |
| Propagating uncertainty through computations | LinearErrorModel |
Use the factory to create models by name:
for model_type in ["exact", "pac", "noise"]:
if model_type == "pac":
model = create_error_model(model_type, epsilon=0.05, delta=0.05)
elif model_type == "noise":
model = create_error_model(model_type, noise_rate=0.05)
else:
model = create_error_model(model_type)
print(f"{model}")
print(f" confidence = {model.get_confidence(0.5):.2f}")
print(f" reliable = {model.is_reliable(0.5)}")
print()
Summary¶
- Exact is the default — use it when you can enumerate the full truth table.
- PAC provides principled confidence intervals for learning/estimation.
- Noise simulates the physical reality of noisy evaluation oracles.
- Linear propagates uncertainty through chains of computations.
The noise model connects directly to noise stability theory (Chapter 5) and the invariance principle (Chapter 11), while the PAC model connects to the learning results in Chapters 7-8.