boofun.analysis.sampling
Boolean functions as random variables: sampling and estimation.
This module provides a probabilistic view of Boolean functions, enabling: - Sampling from the hypercube under uniform and p-biased distributions - Spectral sampling (sampling subsets proportional to Fourier weight) - Monte Carlo estimation of Fourier coefficients and influences - Statistical analysis (expectation, variance, covariance)
This aligns with O’Donnell’s “Analysis of Boolean Functions” Chapters 1-3, which treat Boolean functions as random variables on the discrete cube.
Key concepts: - f: {-1,+1}^n → R as a random variable with E[f] = f̂(∅) - Fourier weight W^k[f] = Σ_{|S|=k} f̂(S)² forms a probability distribution - Spectral sampling: draw S with Pr[S] ∝ f̂(S)² - Monte Carlo: estimate f̂(S) = E[f(x)χ_S(x)] via sampling
References: - O’Donnell Chapter 1: Fourier expansion and basic identities - O’Donnell Chapter 2: Influences and their estimation - O’Donnell Chapter 3: Learning and sampling algorithms - Goldreich-Levin algorithm for finding heavy Fourier coefficients
Functions
|
Estimate E[f] under the (p-biased) distribution. |
|
Adaptively estimate f_hat(S) until standard error is below target. |
|
Estimate Fourier coefficient f̂(S) via Monte Carlo sampling. |
|
Estimate influence of variable i via Monte Carlo sampling. |
|
Estimate total influence I[f] = Σ_i Inf_i[f] via sampling. |
|
Estimate Var[f] under the (p-biased) distribution. |
|
Sample from the p-biased distribution μ_p on {0,1}^n. |
|
Sample (input, output) pairs from a Boolean function. |
|
Sample subsets S with probability proportional to f̂(S)². |
|
Sample uniformly from {0,1}^n. |
|
Sample uniformly from {0,1}^n, returning bit arrays. |
Classes
|
View a Boolean function as a random variable on the hypercube. |
|
Represents the spectral (Fourier weight) distribution of a Boolean function. |
- boofun.analysis.sampling.sample_uniform(n: int, n_samples: int, rng: Generator | None = None) ndarray[source]
Sample uniformly from {0,1}^n.
- Parameters:
n – Number of variables
n_samples – Number of samples to draw
rng – Random number generator (default: numpy default)
- Returns:
Array of shape (n_samples,) with integer inputs in [0, 2^n)
- Raises:
ValueError – If n >= 64 (integer overflow limitation)
Note
For n >= 64, the range [0, 2^n) exceeds int64 capacity. For such large functions, consider: - Using sample_uniform_bits() which returns bit arrays - Oracle-based evaluation patterns - Streaming/lazy evaluation approaches
- boofun.analysis.sampling.sample_biased(n: int, p: float, n_samples: int, rng: Generator | None = None) ndarray[source]
Sample from the p-biased distribution μ_p on {0,1}^n.
Each coordinate is independently 1 with probability p, 0 with probability 1-p.
- Parameters:
n – Number of variables
p – Bias parameter (Pr[bit = 1])
n_samples – Number of samples
rng – Random number generator
- Returns:
Array of shape (n_samples,) with integer inputs
- boofun.analysis.sampling.sample_spectral(f: BooleanFunction, n_samples: int, rng: np.random.Generator | None = None) np.ndarray[source]
Sample subsets S with probability proportional to f̂(S)².
This is “spectral sampling” - drawing from the Fourier weight distribution. By Parseval’s identity, this is a valid probability distribution for non-constant functions (sum of weights = 1 for ±1-valued functions).
Useful for: - Finding heavy Fourier coefficients - Learning algorithms (Goldreich-Levin) - Analyzing spectral structure
- Parameters:
f – BooleanFunction to sample from
n_samples – Number of samples
rng – Random number generator
- Returns:
Array of shape (n_samples,) with subset masks (integers)
- boofun.analysis.sampling.sample_input_output_pairs(f: BooleanFunction, n_samples: int, p: float = 0.5, rng: np.random.Generator | None = None) Tuple[np.ndarray, np.ndarray][source]
Sample (input, output) pairs from a Boolean function.
Draws inputs from the p-biased distribution and evaluates f.
- Parameters:
f – BooleanFunction to sample from
n_samples – Number of samples
p – Bias parameter (0.5 = uniform)
rng – Random number generator
- Returns:
inputs: array of shape (n_samples,) with integer inputs
outputs: array of shape (n_samples,) with function values in {0,1}
- Return type:
Tuple of (inputs, outputs) where
- boofun.analysis.sampling.estimate_fourier_coefficient(f: BooleanFunction, S: int, n_samples: int, p: float = 0.5, rng: np.random.Generator | None = None, return_confidence: bool = False) float | Tuple[float, float][source]
Estimate Fourier coefficient f̂(S) via Monte Carlo sampling.
Uses the identity f̂(S) = E_x[f(x) χ_S(x)] where χ_S(x) = (-1)^{⟨x,S⟩}.
For p-biased sampling, uses the adjusted formula from O’Donnell Chapter 8.
- Parameters:
f – BooleanFunction to analyze
S – Subset mask (integer)
n_samples – Number of samples
p – Bias parameter (0.5 = uniform)
rng – Random number generator
return_confidence – If True, return (estimate, std_error)
- Returns:
Estimated f̂(S), or (estimate, std_error) if return_confidence=True
Note
Error scales as O(1/√n_samples). For 1% relative error, need ~10000 samples.
- boofun.analysis.sampling.estimate_influence(f: BooleanFunction, i: int, n_samples: int, rng: np.random.Generator | None = None, return_confidence: bool = False) float | Tuple[float, float][source]
Estimate influence of variable i via Monte Carlo sampling.
Uses the identity Inf_i[f] = Pr_x[f(x) != f(x^{(i)})].
- Parameters:
f – BooleanFunction to analyze
i – Variable index (0-indexed)
n_samples – Number of samples
rng – Random number generator
return_confidence – If True, return (estimate, std_error)
- Returns:
Estimated Inf_i[f], or (estimate, std_error) if return_confidence=True
- boofun.analysis.sampling.estimate_expectation(f: BooleanFunction, n_samples: int, p: float = 0.5, rng: np.random.Generator | None = None) float[source]
Estimate E[f] under the (p-biased) distribution.
For uniform distribution, E[f] = f̂(∅) = bias of f in ±1 representation.
- Parameters:
f – BooleanFunction (treated as ±1-valued)
n_samples – Number of samples
p – Bias parameter
rng – Random number generator
- Returns:
Estimated E[f]
- boofun.analysis.sampling.estimate_variance(f: BooleanFunction, n_samples: int, p: float = 0.5, rng: np.random.Generator | None = None) float[source]
Estimate Var[f] under the (p-biased) distribution.
For ±1-valued functions, Var[f] = 1 - E[f]² = Σ_{S≠∅} f̂(S)².
- Parameters:
f – BooleanFunction
n_samples – Number of samples
p – Bias parameter
rng – Random number generator
- Returns:
Estimated Var[f]
- boofun.analysis.sampling.estimate_total_influence(f: BooleanFunction, n_samples: int, rng: np.random.Generator | None = None) float[source]
Estimate total influence I[f] = Σ_i Inf_i[f] via sampling.
Uses the identity I[f] = E[s(f,x)] where s(f,x) is the sensitivity at x.
- Parameters:
f – BooleanFunction
n_samples – Number of samples
rng – Random number generator
- Returns:
Estimated total influence
- class boofun.analysis.sampling.RandomVariableView(f: BooleanFunction, p: float = 0.5)[source]
View a Boolean function as a random variable on the hypercube.
This class provides a probabilistic interface to Boolean functions, supporting operations like expectation, variance, and sampling.
In O’Donnell’s framework: - f: {-1,+1}^n → R is a random variable - E[f] = f̂(∅) - Var[f] = Σ_{S≠∅} f̂(S)² - f = Σ_S f̂(S) χ_S is an orthonormal decomposition
- function
The underlying BooleanFunction
- p
Bias parameter for sampling (default 0.5 = uniform)
- __init__(f: BooleanFunction, p: float = 0.5)[source]
Initialize random variable view.
- Parameters:
f – BooleanFunction to view as random variable
p – Bias parameter (default 0.5 = uniform distribution)
- seed(seed: int) RandomVariableView[source]
Set random seed for reproducibility.
- property spectral_distribution: SpectralDistribution
Get the spectral distribution (cached).
- sample(n_samples: int) Tuple[ndarray, ndarray][source]
Sample (input, output) pairs.
- Parameters:
n_samples – Number of samples
- Returns:
Tuple of (inputs, outputs) arrays
- sample_spectral(n_samples: int) ndarray[source]
Sample subsets from the Fourier weight distribution.
- estimate_fourier_coefficient(S: int, n_samples: int, return_confidence: bool = False) float | Tuple[float, float][source]
Estimate f̂(S) via sampling.
- estimate_influence(i: int, n_samples: int, return_confidence: bool = False) float | Tuple[float, float][source]
Estimate Inf_i[f] via sampling.
- class boofun.analysis.sampling.SpectralDistribution(weights: ndarray, probabilities: ndarray, n_vars: int)[source]
Represents the spectral (Fourier weight) distribution of a Boolean function.
The spectral distribution has: - Support: all subsets S ⊆ [n] - Probabilities: Pr[S] = f̂(S)² / Σ_T f̂(T)²
For ±1-valued functions, Σ_S f̂(S)² = 1 by Parseval.
- weights
Array of f̂(S)² values indexed by subset
- Type:
- probabilities
Normalized probabilities
- Type:
- classmethod from_function(f: BooleanFunction) SpectralDistribution[source]
Create spectral distribution from a Boolean function.
- sample(n_samples: int, rng: Generator | None = None) ndarray[source]
Sample subsets from the spectral distribution.