Flexible Inputs and Oracle Access¶
This notebook demonstrates:
- All input formats bf.create() accepts
- All evaluation formats f.evaluate() accepts
- The Oracle Pattern - analyzing huge functions without materializing the truth table
In [1]:
# Install/upgrade boofun (required for Colab)
# This ensures you have the latest version with all features
!pip install --upgrade boofun -q
import boofun as bf
print(f"BooFun version: {bf.__version__}")
BooFun version: 1.1.1
In [2]:
import numpy as np
import boofun as bf
import warnings
warnings.filterwarnings('ignore')
print("boofun loaded successfully")
boofun loaded successfully
1. Flexible Function Creation¶
bf.create() auto-detects input type. No need to specify format.
In [3]:
# === TRUTH TABLES ===
# From Python list
f1 = bf.create([0, 1, 1, 0]) # XOR
print(f"List: f([0,1]) = {f1.evaluate([0, 1])}")
# From NumPy array (bool or int)
f2 = bf.create(np.array([False, True, True, False]))
print(f"NumPy bool: f([1,0]) = {f2.evaluate([1, 0])}")
# From NumPy with explicit n_vars
f3 = bf.create(np.array([0, 0, 0, 1]), n=2) # AND
print(f"NumPy int: f([1,1]) = {f3.evaluate([1, 1])}")
List: f([0,1]) = True NumPy bool: f([1,0]) = True NumPy int: f([1,1]) = True
In [4]:
# === BUILT-IN CONSTRUCTORS ===
# These are the most reliable way to create Boolean functions
# Majority function
f4 = bf.majority(5)
print(f"Majority-5: f([1,1,1,0,0]) = {f4.evaluate([1,1,1,0,0])}")
# Parity function
f5 = bf.parity(3)
print(f"Parity-3: f([1,1,0]) = {f5.evaluate([1,1,0])}")
# Threshold function
f6 = bf.threshold(5, 3) # At least 3 of 5
print(f"Threshold(5,3): f([1,1,1,0,0]) = {f6.evaluate([1,1,1,0,0])}")
Majority-5: f([1,1,1,0,0]) = True Parity-3: f([1,1,0]) = False Threshold(5,3): f([1,1,1,0,0]) = True
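For intuition, the three constructors above correspond to these one-line plain-Python definitions (a sketch of the semantics, not BooFun's implementation; note the strict `>` in majority, which breaks ties toward False on even n):

```python
# Plain-Python reference semantics for the built-in constructors (not BooFun code)
def majority(x):           # True iff more than half the bits are 1
    return sum(x) > len(x) // 2

def parity(x):             # True iff an odd number of bits are 1
    return sum(x) % 2 == 1

def threshold(x, k):       # True iff at least k bits are 1
    return sum(x) >= k

print(majority([1, 1, 1, 0, 0]))       # True
print(parity([1, 1, 0]))               # False
print(threshold([1, 1, 1, 0, 0], 3))   # True
```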
In [5]:
# === BUILT-IN FUNCTIONS ===
# These provide the clearest API for common Boolean functions
f7 = bf.AND(2) # AND on 2 variables
print(f"Built-in AND: f([1,1]) = {f7.evaluate([1,1])}")
f8 = bf.parity(3) # XOR/Parity on 3 variables
print(f"Built-in XOR: f([1,1,1]) = {f8.evaluate([1,1,1])}")
f9 = bf.OR(2) # OR on 2 variables
print(f"Built-in OR: f([1,0]) = {f9.evaluate([1,0])}")
Built-in AND: f([1,1]) = True Built-in XOR: f([1,1,1]) = True Built-in OR: f([1,0]) = True
In [6]:
# === POLYNOMIAL COEFFICIENTS ===
# Dict maps monomial → coefficient
# frozenset() = constant term, frozenset({0}) = x₀, frozenset({0,1}) = x₀x₁
coeffs = {
frozenset(): 0, # constant
frozenset({0}): 1, # x₀
frozenset({1}): 1, # x₁
frozenset({0, 1}): -2 # -2x₀x₁ (makes XOR over reals!)
}
f10 = bf.create(coeffs, n=2)
print(f"Polynomial: coefficients define XOR")
Polynomial: coefficients define XOR
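To see why those coefficients give XOR, evaluate the multilinear polynomial by hand on all four 0/1 inputs (plain Python, independent of BooFun):

```python
# p(x0, x1) = 0 + x0 + x1 - 2*x0*x1, the polynomial defined by the dict above
def p(x0, x1):
    return x0 + x1 - 2 * x0 * x1

# On {0,1} inputs the polynomial reproduces XOR exactly
print([p(x0, x1) for x0 in (0, 1) for x1 in (0, 1)])  # [0, 1, 1, 0]
```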
In [7]:
# === SET OF TRUE INPUTS ===
# Specify which input VECTORS evaluate to True
# As tuples: (0,1) and (1,0) are True → XOR!
true_inputs = {(0, 1), (1, 0)}
f11 = bf.create(true_inputs)
print(f"Set of True inputs: {[f11.evaluate(i) for i in range(4)]}")
Set of True inputs: [False, True, True, False]
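A true-input set pins down the whole truth table: each index's bit vector is tested for membership. A plain-Python sketch (the most-significant-bit-first index convention is an assumption that matches the output above):

```python
# Sketch: converting a set of True input vectors into a truth table
true_inputs = {(0, 1), (1, 0)}
n = 2
table = []
for i in range(2 ** n):
    bits = tuple((i >> (n - 1 - j)) & 1 for j in range(n))  # index -> bit vector
    table.append(bits in true_inputs)
print(table)  # [False, True, True, False] - XOR
```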
In [8]:
# === FILE LOADING ===
import tempfile
import os
# Create a temp file
with tempfile.NamedTemporaryFile(suffix='.json', delete=False) as tmp:
    path = tmp.name
# Save and reload
maj = bf.majority(3)
bf.save(maj, path)
# Load via bf.load or bf.create
f12 = bf.load(path)
f13 = bf.create(path) # Also works!
print(f"From file: n_vars = {f12.n_vars}")
os.unlink(path)
From file: n_vars = 3
2. Flexible Evaluation¶
f.evaluate() accepts many input formats.
In [9]:
f = bf.majority(3)
# Integer index (binary representation)
# 5 = 101 in binary → [1, 0, 1]
print(f"Integer: f.evaluate(5) = {f.evaluate(5)} # 5 = 101")
print(f"Integer: f.evaluate(7) = {f.evaluate(7)} # 7 = 111")
# List of bits
print(f"List: f.evaluate([1, 0, 1]) = {f.evaluate([1, 0, 1])}")
# Tuple
print(f"Tuple: f.evaluate((1, 1, 0)) = {f.evaluate((1, 1, 0))}")
# NumPy array
print(f"NumPy: f.evaluate(np.array([0, 1, 1])) = {f.evaluate(np.array([0, 1, 1]))}")
Integer: f.evaluate(5) = True # 5 = 101 Integer: f.evaluate(7) = True # 7 = 111 List: f.evaluate([1, 0, 1]) = True Tuple: f.evaluate((1, 1, 0)) = True NumPy: f.evaluate(np.array([0, 1, 1])) = True
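The integer form is shorthand for the input's binary expansion. A plain-Python sketch of the mapping (most-significant bit first, consistent with 5 → [1, 0, 1] above; not BooFun internals):

```python
def int_to_bits(x, n):
    """Expand integer x into its n-bit vector, most-significant bit first."""
    return [(x >> (n - 1 - i)) & 1 for i in range(n)]

print(int_to_bits(5, 3))  # [1, 0, 1]
print(int_to_bits(7, 3))  # [1, 1, 1]
```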
In [10]:
# === BATCH EVALUATION ===
# 2D array: each row is one input
inputs = np.array([
[0, 0, 0],
[0, 0, 1],
[0, 1, 1],
[1, 1, 1]
])
results = f.evaluate(inputs)
print(f"Batch evaluation:")
for inp, res in zip(inputs, results):
    print(f"  {list(inp)} → {res}")
Batch evaluation: [0, 0, 0] → False [0, 0, 1] → False [0, 1, 1] → True [1, 1, 1] → True
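Row-wise batch evaluation of majority has a one-line vectorized counterpart in plain NumPy (a sketch for intuition, not BooFun's implementation):

```python
import numpy as np

inputs = np.array([[0, 0, 0],
                   [0, 0, 1],
                   [0, 1, 1],
                   [1, 1, 1]])
# Majority-3: a row is True iff more than half of its bits are 1
results = inputs.sum(axis=1) > inputs.shape[1] // 2
print(results)  # [False False  True  True]
```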
3. The Oracle Pattern: Analyzing Huge Functions¶
Key insight: Many analyses don't need the full truth table!
If you provide a callable, BooFun can:
- Run BLR linearity testing via random queries
- Estimate Fourier coefficients via sampling
- Compute query complexity bounds
- Perform PAC learning with sample access
This works even for functions too large to materialize (e.g., n=30 variables has 2^30 inputs).
In [11]:
# Simulate an "oracle" - a function you can only query
# This could be an API call, database query, or expensive computation
query_count = 0
def oracle_majority(x):
    """A majority function accessed only as an oracle.

    The full truth table would have 2^30 ≈ 10^9 entries,
    but we can still analyze the function via queries.
    """
    global query_count
    query_count += 1
    return sum(x) > 15
# Create function from oracle
# Using n=30 to avoid integer overflow in sampling algorithms
large_f = bf.create(oracle_majority, n=30)
print(f"Created 30-variable function (2^30 possible inputs)")
print(f"Queries so far: {query_count}")
Created 30-variable function (2^30 possible inputs) Queries so far: 0
In [12]:
# === LINEARITY TESTING ===
# BLR test uses O(1/ε) queries, NOT 2^n
from boofun.analysis import PropertyTester
query_count = 0
tester = PropertyTester(large_f)
# This runs BLR with ~100 random trials (3 queries each)
is_linear = tester.blr_linearity_test(num_queries=100)
print(f"Is majority linear? {is_linear}")
print(f"Queries used: {query_count}")
print("Truth table size: 2^30 ≈ 1.07 × 10^9")
Is majority linear? False Queries used: 300 Truth table size: 2^30 ≈ 1.07 × 10^9
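The BLR test itself is short: draw random x and y, and check that f(x) XOR f(y) equals f(x XOR y). Parities always pass, while functions far from linear fail some trial with overwhelming probability. A self-contained plain-Python sketch on bitmask inputs (not BooFun's PropertyTester):

```python
import random

def blr_test(f, n, num_trials=100, seed=0):
    """Return True iff f passes every BLR trial f(x) ^ f(y) == f(x ^ y).
    Uses 3 queries per trial - no truth table needed."""
    rng = random.Random(seed)
    for _ in range(num_trials):
        x = rng.getrandbits(n)
        y = rng.getrandbits(n)
        if f(x) ^ f(y) != f(x ^ y):
            return False
    return True

parity = lambda x: bin(x).count("1") % 2         # linear: XOR of all bits
majority = lambda x: bin(x).count("1") > 15      # far from linear
print(blr_test(parity, 30))    # True
print(blr_test(majority, 30))  # False
```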
In [13]:
# === FOURIER COEFFICIENT ESTIMATION ===
# Individual coefficients can be estimated via sampling;
# Goldreich-Levin builds on this to find all heavy coefficients
from boofun.analysis.learning import estimate_fourier_coefficient
query_count = 0
# Estimate f̂(∅) - the bias
# Majority on an even n is slightly biased (ties at sum = 15 map to False),
# so expect a small positive value rather than exactly 0
est_empty, stderr = estimate_fourier_coefficient(large_f, S=0, num_samples=1000)
print(f"Estimated f̂(∅) = {est_empty:.4f} ± {stderr:.4f}")
print(f"Queries used: {query_count}")
Estimated f̂(∅) = 0.1740 ± 0.0312 Queries used: 1000
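Estimating f̂(∅) is plain Monte Carlo: sample uniform inputs, average (-1)^f(x), and report the standard error of the mean. A self-contained sketch on a stand-in majority oracle (the ±1 sign convention is an assumption chosen to match the positive estimate above):

```python
import random, math

def estimate_bias(f, n, num_samples=1000, seed=0):
    """Estimate f-hat(empty set) = E[(-1)^f(x)] over uniform x, with std error."""
    rng = random.Random(seed)
    vals = [1 - 2 * f(rng.getrandbits(n)) for _ in range(num_samples)]
    mean = sum(vals) / num_samples
    var = sum((v - mean) ** 2 for v in vals) / (num_samples - 1)
    return mean, math.sqrt(var / num_samples)

maj30 = lambda x: int(bin(x).count("1") > 15)  # same rule as the oracle above
est, stderr = estimate_bias(maj30, 30)
print(f"{est:.3f} ± {stderr:.3f}")  # close to the exact bias P(sum<=15) - P(sum>15) ≈ 0.14
```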
In [14]:
# === PAC LEARNING: FINDING RELEVANT VARIABLES ===
# A first step in learning a junta is detecting which variables matter.
# Here: XOR(x0, x1) embedded in 10 variables - a 2-junta
f_junta = bf.create([0, 1, 1, 0] * 256, n=10)
# Find influential variables using the direct method
influences = f_junta.influences()
threshold = 0.01
influential = [i for i, inf in enumerate(influences) if inf > threshold]
print(f"Influential variables (Inf > {threshold}): {influential}")
print(f"Influences: {[f'{inf:.4f}' for inf in influences[:4]]}... (first 4)")
print(f"(Function only uses x0 and x1 - the XOR junta pattern)")
Influential variables (Inf > 0.01): [0, 1] Influences: ['1.0000', '1.0000', '0.0000', '0.0000']... (first 4) (Function only uses x0 and x1 - the XOR junta pattern)
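Influences can also be estimated from queries alone, since Inf_i(f) = Pr[f(x) ≠ f(x with bit i flipped)] over uniform x. A self-contained sketch on a plain-Python XOR junta, 2 queries per sample and no truth table (not BooFun's `influences()` method):

```python
import random

def estimate_influence(f, n, i, num_samples=2000, seed=0):
    """Estimate Inf_i(f) = Pr[f(x) != f(x ^ e_i)] by uniform sampling."""
    rng = random.Random(seed)
    flips = 0
    for _ in range(num_samples):
        x = rng.getrandbits(n)
        if f(x) != f(x ^ (1 << i)):
            flips += 1
    return flips / num_samples

xor01 = lambda x: (x & 1) ^ ((x >> 1) & 1)   # depends only on bits 0 and 1
print([round(estimate_influence(xor01, 10, i), 2) for i in range(4)])
# [1.0, 1.0, 0.0, 0.0] - exactly the junta pattern
```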
4. Real-World Oracle Examples¶
The oracle pattern is useful when your function is:
- An API call (e.g., ML model prediction)
- A database query (e.g., does user match criteria?)
- An expensive simulation
- A hardware test
In [15]:
# Example: Analyzing an ML classifier as a Boolean function
class FakeMLClassifier:
    """Simulates an ML model that takes binary features."""

    def __init__(self):
        self.weights = np.array([0.3, 0.5, 0.2, -0.4, 0.1])

    def predict(self, features):
        """Returns 1 if weighted sum > 0."""
        return int(np.dot(features, self.weights) > 0)
# Wrap the classifier
model = FakeMLClassifier()
classifier_fn = bf.create(model.predict, n=5)
# Now analyze it!
print("Analyzing ML classifier:")
print(f" Is it monotone? {classifier_fn.is_monotone()}")
print(f" Influences: {np.round(classifier_fn.influences(), 3)}")
print(f" (Compare to weights: {model.weights})")
Analyzing ML classifier: Is it monotone? False Influences: [0.188 0.438 0.188 0.312 0.062] (Compare to weights: [ 0.3 0.5 0.2 -0.4 0.1])
In [16]:
# Example: External service (simulated)
class ExternalService:
    """Simulates querying an external API."""

    def __init__(self):
        self.call_count = 0

    def check_eligibility(self, criteria):
        """Checks if user is eligible based on 4 binary criteria:
        - Has account (criteria[0])
        - Age verified (criteria[1])
        - Payment method (criteria[2])
        - Passed review (criteria[3])

        Rule: Need account AND (age_verified OR passed_review) AND payment
        """
        self.call_count += 1
        has_account, age_verified, payment, passed_review = criteria
        return has_account and (age_verified or passed_review) and payment
service = ExternalService()
eligibility = bf.create(service.check_eligibility, n=4)
# Analyze to understand the business logic
inf = eligibility.influences()
print("Eligibility function analysis:")
print(f" Influence of has_account: {inf[0]:.3f}")
print(f" Influence of age_verified: {inf[1]:.3f}")
print(f" Influence of payment: {inf[2]:.3f}")
print(f" Influence of passed_review: {inf[3]:.3f}")
print(f"\n API calls made: {service.call_count}")
Eligibility function analysis: Influence of has_account: 0.375 Influence of age_verified: 0.125 Influence of payment: 0.375 Influence of passed_review: 0.125 API calls made: 128
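The 128 API calls are exactly the cost of exact influence computation at n = 4: for each of the 4 variables, f is evaluated on all 2^4 inputs and on each input with that bit flipped, i.e. 4 × 16 × 2 = 128 calls. A plain-Python sketch reproducing both the numbers and the count (the eligibility rule is copied from the cell above):

```python
def check_eligibility(criteria):
    has_account, age_verified, payment, passed_review = criteria
    return has_account and (age_verified or passed_review) and payment

calls = 0
def f(bits):
    global calls
    calls += 1
    return check_eligibility(bits)

n = 4
influences = []
for i in range(n):
    flips = 0
    for idx in range(2 ** n):
        bits = [(idx >> (n - 1 - j)) & 1 for j in range(n)]
        flipped = bits[:]
        flipped[i] ^= 1            # flip variable i
        if f(bits) != f(flipped):
            flips += 1
    influences.append(flips / 2 ** n)

print(influences)  # [0.375, 0.125, 0.375, 0.125]
print(calls)       # 128 = 4 variables x 16 inputs x 2 queries
```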
5. What Works Without Full Truth Table¶
| Analysis | Full TT needed? | Notes |
|---|---|---|
| BLR linearity test | No | O(1/ε) queries |
| Fourier coefficient estimation | No | O(1/ε²) samples |
| PAC learning | No | Polynomial in 1/ε |
| Goldreich-Levin | No | Finds heavy coefficients |
| Influence estimation | No | O(n/ε²) samples |
| p-biased analysis | No | Monte Carlo |
| Noise stability | No | Monte Carlo |
| Single evaluation | No | 1 query |
| Exact Fourier transform | Yes | Needs all 2^n values |
| Full truth table | Yes | By definition |
| Is junta (exact) | Yes | Needs to check all |
When you create a function from a callable, BooFun is lazy - it only computes the truth table if you explicitly request it or call an analysis that requires it.
In [17]:
# Demonstration: lazy evaluation
query_count = 0
def tracked_fn(x):
    global query_count
    query_count += 1
    return x[0] and x[1]
f = bf.create(tracked_fn, n=3)
print(f"After creation: {query_count} queries")
# Single evaluation - 1 query
f.evaluate([1, 1, 0])
print(f"After 1 evaluation: {query_count} queries")
# Requesting the truth table materializes it - 2^n more queries
tt = f.get_representation('truth_table')
print(f"After truth table: {query_count} queries (1 + 2^3 = 9)")
After creation: 0 queries After 1 evaluation: 1 queries After truth table: 9 queries (1 + 2^3 = 9)
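When the backend is expensive, the same laziness can be layered on your side with memoization, so repeated queries never leave the process. A sketch using functools.lru_cache around a stand-in oracle (a hypothetical wrapper, not part of BooFun):

```python
from functools import lru_cache

backend_calls = 0

@lru_cache(maxsize=None)
def cached_oracle(x):
    """Memoized oracle: each distinct input hits the backend only once."""
    global backend_calls
    backend_calls += 1
    return bin(x).count("1") % 2 == 1   # stand-in for an expensive query

cached_oracle(5)
cached_oracle(5)       # served from cache, no backend call
cached_oracle(7)
print(backend_calls)   # 2
```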
Summary¶
Flexible creation - pass data in any form:
- Lists, NumPy arrays → truth table
- Callables → query oracle
- Strings → symbolic
- Dicts → polynomial
- Files → load from disk
Flexible evaluation - query with any format:
- Integer index, lists, tuples, NumPy arrays, batches
Oracle pattern - for huge functions:
- Create from callable
- Use sampling-based analyses
- Avoid materializing 2^n truth table