boofun.analysis.learning

Learning algorithms for Boolean functions.

This module implements algorithms for learning Boolean functions from queries or samples, including the famous Goldreich-Levin algorithm.

Key algorithms: - Goldreich-Levin: Find heavy Fourier coefficients with query access - Low-degree algorithm: Learn juntas and low-degree functions - LMN algorithm: Learn decision trees from uniform samples - Sparse Fourier learning: Learn functions with few Fourier coefficients

References: - O’Donnell Chapter 3.4: Goldreich-Levin - O’Donnell Chapter 4: Social choice and learning - Linial, Mansour, Nisan (1993): Learning decision trees

Functions

`estimate_fourier_coefficient`(f, S[, ...])	Estimate the Fourier coefficient f̂(S) using random sampling.
`find_heavy_coefficients`(f[, threshold, ...])	Find all Fourier coefficients with \|f̂(S)\| >= threshold.
`goldreich_levin`(f[, threshold, confidence, rng])	Find all heavy Fourier coefficients using the Goldreich-Levin algorithm.
`learn_sparse_fourier`(f, sparsity[, ...])	Learn a function assuming it has at most 'sparsity' non-zero Fourier coefficients.

Classes

GoldreichLevinLearner(f[, rng])

Interactive Goldreich-Levin learner with query counting.

boofun.analysis.learning.goldreich_levin(f: BooleanFunction, threshold: float = 0.1, confidence: float = 0.95, rng: np.random.Generator | None = None) → List[Tuple[int, float]][source]

Find all heavy Fourier coefficients using the Goldreich-Levin algorithm.

The algorithm finds all S such that |f̂(S)| >= threshold, with high probability.

This implementation uses the “bucketing” approach: 1. Hash coefficients into buckets 2. Find heavy buckets 3. Iteratively refine to find individual heavy coefficients

Parameters:

f – BooleanFunction to analyze
threshold – Minimum absolute coefficient value to report
confidence – Confidence level for the estimates
rng – Random number generator

Returns:

List of (subset_mask, coefficient_estimate) for heavy coefficients

Note

Query complexity: O(n / threshold^2 * log(1/confidence))

boofun.analysis.learning.estimate_fourier_coefficient(f: BooleanFunction, S: int, num_samples: int = 1000, rng: np.random.Generator | None = None) → Tuple[float, float][source]

Estimate the Fourier coefficient f̂(S) using random sampling.

Uses the identity: f̂(S) = E_x[f(x) * χ_S(x)] where χ_S(x) = (-1)^{<x,S>}

Parameters:

f – BooleanFunction to analyze (or query function)
S – Subset (as bitmask) for which to estimate coefficient
num_samples – Number of random samples
rng – Random number generator

Returns:

Tuple of (estimate, standard_error)

Raises:

ValueError – If n >= 64 (use sampling.estimate_fourier_coefficient instead)

boofun.analysis.learning.find_heavy_coefficients(f: BooleanFunction, threshold: float = 0.01, num_samples: int = 10000, rng: np.random.Generator | None = None) → Dict[int, float][source]

Find all Fourier coefficients with |f̂(S)| >= threshold.

This is a simpler alternative to Goldreich-Levin that directly estimates all coefficients (suitable for small n).

Parameters:

f – BooleanFunction to analyze
threshold – Minimum absolute coefficient value
num_samples – Samples per coefficient estimate
rng – Random number generator

Returns:

Dictionary mapping subset masks to coefficient estimates

boofun.analysis.learning.learn_sparse_fourier(f: BooleanFunction, sparsity: int, num_samples: int = 10000, rng: np.random.Generator | None = None) → Dict[int, float][source]

Learn a function assuming it has at most ‘sparsity’ non-zero Fourier coefficients.

Uses sampling to find the heavy coefficients, then refines estimates.

Parameters:

f – BooleanFunction to learn
sparsity – Maximum number of non-zero coefficients
num_samples – Number of samples for estimation
rng – Random number generator

Returns:

Dictionary of estimated non-zero Fourier coefficients

class boofun.analysis.learning.GoldreichLevinLearner(f: BooleanFunction, rng: np.random.Generator | None = None)[source]

Interactive Goldreich-Levin learner with query counting.

This class tracks the number of queries made and allows for interactive exploration of heavy Fourier coefficients.

__init__(f: BooleanFunction, rng: np.random.Generator | None = None)[source]

Initialize the learner.

Parameters:

f – BooleanFunction to learn
rng – Random number generator

query(x: int) → int[source]: Query f(x), with caching.

estimate_coefficient(S: int, num_samples: int = 1000) → float[source]: Estimate f̂(S) using queries.

find_heavy(threshold: float = 0.1) → List[Tuple[int, float]][source]: Find heavy coefficients using Goldreich-Levin.

reset_queries()[source]: Reset the query counter and cache.

summary() → str[source]: Get summary of learning progress.