Flexible Inputs and Oracle Access¶

This notebook demonstrates:

  1. All input formats bf.create() accepts
  2. All evaluation formats f.evaluate() accepts
  3. The Oracle Pattern - analyzing huge functions without materializing the truth table

In [1]:
# Install/upgrade boofun (required for Colab)
# This ensures you have the latest version with all features
!pip install --upgrade boofun -q

import boofun as bf
print(f"BooFun version: {bf.__version__}")
[notice] A new release of pip is available: 25.2 -> 26.0
[notice] To update, run: /Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip
/Users/gabrieltaboada/dev/Boofun/boofun/src/boofun/core/errormodels.py:21: UserWarning: uncertainties library not available - some error models disabled
  warnings.warn("uncertainties library not available - some error models disabled")
/Users/gabrieltaboada/dev/Boofun/boofun/src/boofun/quantum/__init__.py:22: UserWarning: Qiskit not available - quantum features limited
  warnings.warn("Qiskit not available - quantum features limited")
BooFun version: 1.1.1
In [2]:
import numpy as np
import boofun as bf

import warnings
warnings.filterwarnings('ignore')
print("boofun loaded successfully")
boofun loaded successfully

1. Flexible Function Creation¶

bf.create() auto-detects input type. No need to specify format.

In [3]:
# === TRUTH TABLES ===

# From Python list
f1 = bf.create([0, 1, 1, 0])  # XOR
print(f"List: f([0,1]) = {f1.evaluate([0, 1])}")

# From NumPy array (bool or int)
f2 = bf.create(np.array([False, True, True, False]))
print(f"NumPy bool: f([1,0]) = {f2.evaluate([1, 0])}")

# From NumPy with explicit n_vars
f3 = bf.create(np.array([0, 0, 0, 1]), n=2)  # AND
print(f"NumPy int: f([1,1]) = {f3.evaluate([1, 1])}")
List: f([0,1]) = True
NumPy bool: f([1,0]) = True
NumPy int: f([1,1]) = True
In [4]:
# === BUILT-IN CONSTRUCTORS ===
# These are the most reliable way to create Boolean functions

# Majority function
f4 = bf.majority(5)
print(f"Majority-5: f([1,1,1,0,0]) = {f4.evaluate([1,1,1,0,0])}")

# Parity function
f5 = bf.parity(3)
print(f"Parity-3: f([1,1,0]) = {f5.evaluate([1,1,0])}")

# Threshold function
f6 = bf.threshold(5, 3)  # At least 3 of 5
print(f"Threshold(5,3): f([1,1,1,0,0]) = {f6.evaluate([1,1,1,0,0])}")
Majority-5: f([1,1,1,0,0]) = True
Parity-3: f([1,1,0]) = False
Threshold(5,3): f([1,1,1,0,0]) = True
In [ ]:
# === BUILT-IN FUNCTIONS ===
# These provide the clearest API for common Boolean functions

f7 = bf.AND(2)  # AND on 2 variables
print(f"Built-in AND: f([1,1]) = {f7.evaluate([1,1])}")

f8 = bf.parity(3)  # XOR/Parity on 3 variables
print(f"Built-in XOR: f([1,1,1]) = {f8.evaluate([1,1,1])}")

f9 = bf.OR(2)  # OR on 2 variables
print(f"Built-in OR: f([1,0]) = {f9.evaluate([1,0])}")
Built-in AND: f([1,1]) = True
Built-in XOR: f([1,1,1]) = True
Built-in OR: f([1,0]) = True
In [6]:
# === POLYNOMIAL COEFFICIENTS ===

# Dict maps monomial → coefficient
# frozenset({}) = constant term, frozenset({0}) = x₀, frozenset({0,1}) = x₀x₁
coeffs = {
    frozenset(): 0,           # constant
    frozenset({0}): 1,        # x₀
    frozenset({1}): 1,        # x₁  
    frozenset({0, 1}): -2     # -2x₀x₁ (makes XOR over reals!)
}
f10 = bf.create(coeffs, n=2)
print(f"Polynomial: coefficients define XOR")
Polynomial: coefficients define XOR
In [7]:
# === SET OF TRUE INPUTS ===

# Specify which input VECTORS evaluate to True
# As tuples: (0,1) and (1,0) are True → XOR!
true_inputs = {(0, 1), (1, 0)}  
f11 = bf.create(true_inputs)
print(f"Set of True inputs: {[f11.evaluate(i) for i in range(4)]}")
Set of True inputs: [False, True, True, False]
In [8]:
# === FILE LOADING ===
import tempfile
import os

# Create a temp file
with tempfile.NamedTemporaryFile(suffix='.json', delete=False) as tmp:
    path = tmp.name

# Save and reload
maj = bf.majority(3)
bf.save(maj, path)

# Load via bf.load or bf.create
f12 = bf.load(path)
f13 = bf.create(path)  # Also works!

print(f"From file: n_vars = {f12.n_vars}")
os.unlink(path)
From file: n_vars = 3

2. Flexible Evaluation¶

f.evaluate() accepts many input formats.

In [9]:
f = bf.majority(3)

# Integer index (binary representation)
# 5 = 101 in binary → [1, 0, 1]
print(f"Integer: f.evaluate(5) = {f.evaluate(5)}  # 5 = 101")
print(f"Integer: f.evaluate(7) = {f.evaluate(7)}  # 7 = 111")

# List of bits
print(f"List: f.evaluate([1, 0, 1]) = {f.evaluate([1, 0, 1])}")

# Tuple
print(f"Tuple: f.evaluate((1, 1, 0)) = {f.evaluate((1, 1, 0))}")

# NumPy array
print(f"NumPy: f.evaluate(np.array([0, 1, 1])) = {f.evaluate(np.array([0, 1, 1]))}")
Integer: f.evaluate(5) = True  # 5 = 101
Integer: f.evaluate(7) = True  # 7 = 111
List: f.evaluate([1, 0, 1]) = True
Tuple: f.evaluate((1, 1, 0)) = True
NumPy: f.evaluate(np.array([0, 1, 1])) = True
In [10]:
# === BATCH EVALUATION ===

# 2D array: each row is one input
inputs = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1]
])

results = f.evaluate(inputs)
print(f"Batch evaluation:")
for inp, res in zip(inputs, results):
    print(f"  {list(inp)} → {res}")
Batch evaluation:
  [0, 0, 0] → False
  [0, 0, 1] → False
  [0, 1, 1] → True
  [1, 1, 1] → True

3. The Oracle Pattern: Analyzing Huge Functions¶

Key insight: Many analyses don't need the full truth table!

If you provide a callable, BooFun can:

  • Run BLR linearity testing via random queries
  • Estimate Fourier coefficients via sampling
  • Compute query complexity bounds
  • Perform PAC learning with sample access

This works even for functions too large to materialize (e.g., n=30 variables has 2^30 inputs).

In [11]:
# Simulate an "oracle" - a function you can only query
# This could be an API call, database query, or expensive computation

query_count = 0

def oracle_majority(x):
    """A majority function as an oracle.
    
    Truth table would have 2^100 ≈ 10^30 entries!
    But we can still analyze it via queries.
    """
    global query_count
    query_count += 1
    return sum(x) > 15

# Create function from oracle
# Using n=30 to avoid integer overflow in sampling algorithms
large_f = bf.create(oracle_majority, n=30)
print(f"Created 30-variable function (2^30 possible inputs)")
print(f"Queries so far: {query_count}")
Created 30-variable function (2^30 possible inputs)
Queries so far: 0
In [12]:
# === LINEARITY TESTING ===
# BLR test uses O(1/ε) queries, NOT 2^n

from boofun.analysis import PropertyTester

query_count = 0
tester = PropertyTester(large_f)

# This runs BLR with ~100 random queries
is_linear = tester.blr_linearity_test(num_queries=100)

print(f"Is majority linear? {is_linear}")
print(f"Queries used: {query_count}")
print(f"Truth table size: 2^100 ≈ 1.27 × 10^30")
Is majority linear? False
Queries used: 300
Truth table size: 2^100 ≈ 1.27 × 10^30
In [13]:
# === FOURIER COEFFICIENT ESTIMATION ===
# Goldreich-Levin algorithm estimates coefficients via sampling

from boofun.analysis.learning import estimate_fourier_coefficient

query_count = 0

# Estimate f̂(∅) - the bias
# For balanced functions like majority, this should be ≈ 0
est_empty, stderr = estimate_fourier_coefficient(large_f, S=0, num_samples=1000)

print(f"Estimated f̂(∅) = {est_empty:.4f} ± {stderr:.4f}")
print(f"Queries used: {query_count}")
Estimated f̂(∅) = 0.1740 ± 0.0312
Queries used: 1000
In [14]:
# === PAC LEARNING ===
# Learn approximation from samples only

# For smaller function to demo influence detection
# XOR on first 2 variables - a 2-junta on 10 vars
f_small = bf.parity(2)  # Clean: just use built-in parity

# Extend to 10 variables by embedding (could use padding, but for demo just show influences)
# Actually, let's create a proper 10-variable junta
f_junta = bf.create([0,1,1,0] * 256, n=10)  # XOR(x0,x1) extended to 10 vars

# Find influential variables using the direct method
influences = f_junta.influences()
threshold = 0.01
influential = [i for i, inf in enumerate(influences) if inf > threshold]

print(f"Influential variables (Inf > {threshold}): {influential}")
print(f"Influences: {[f'{inf:.4f}' for inf in influences[:4]]}... (first 4)")
print(f"(Function only uses x0 and x1 - the XOR junta pattern)")
Influential variables (Inf > 0.01): [0, 1]
Influences: ['1.0000', '1.0000', '0.0000', '0.0000']... (first 4)
(Function only uses x0 and x1 - the XOR junta pattern)

4. Real-World Oracle Examples¶

The oracle pattern is useful when your function is:

  • An API call (e.g., ML model prediction)
  • A database query (e.g., does user match criteria?)
  • An expensive simulation
  • A hardware test
In [15]:
# Example: Analyzing an ML classifier as a Boolean function

class FakeMLClassifier:
    """Simulates an ML model that takes binary features."""
    def __init__(self):
        self.weights = np.array([0.3, 0.5, 0.2, -0.4, 0.1])
        
    def predict(self, features):
        """Returns 1 if weighted sum > 0."""
        return int(np.dot(features, self.weights) > 0)

# Wrap the classifier
model = FakeMLClassifier()
classifier_fn = bf.create(model.predict, n=5)

# Now analyze it!
print("Analyzing ML classifier:")
print(f"  Is it monotone? {classifier_fn.is_monotone()}")
print(f"  Influences: {np.round(classifier_fn.influences(), 3)}")
print(f"  (Compare to weights: {model.weights})")
Analyzing ML classifier:
  Is it monotone? False
  Influences: [0.188 0.438 0.188 0.312 0.062]
  (Compare to weights: [ 0.3  0.5  0.2 -0.4  0.1])
In [16]:
# Example: External service (simulated)

class ExternalService:
    """Simulates querying an external API."""
    def __init__(self):
        self.call_count = 0
        
    def check_eligibility(self, criteria):
        """
        Checks if user is eligible based on 4 binary criteria:
        - Has account (criteria[0])
        - Age verified (criteria[1])
        - Payment method (criteria[2])
        - Passed review (criteria[3])
        
        Rule: Need account AND (age_verified OR passed_review) AND payment
        """
        self.call_count += 1
        has_account, age_verified, payment, passed_review = criteria
        return has_account and (age_verified or passed_review) and payment

service = ExternalService()
eligibility = bf.create(service.check_eligibility, n=4)

# Analyze to understand the business logic
inf = eligibility.influences()
print("Eligibility function analysis:")
print(f"  Influence of has_account: {inf[0]:.3f}")
print(f"  Influence of age_verified: {inf[1]:.3f}")
print(f"  Influence of payment: {inf[2]:.3f}")
print(f"  Influence of passed_review: {inf[3]:.3f}")
print(f"\n  API calls made: {service.call_count}")
Eligibility function analysis:
  Influence of has_account: 0.375
  Influence of age_verified: 0.125
  Influence of payment: 0.375
  Influence of passed_review: 0.125

  API calls made: 128

5. What Works Without Full Truth Table¶

Analysis Full TT needed? Notes
BLR linearity test No O(1/ε) queries
Fourier coefficient estimation No O(1/ε²) samples
PAC learning No Polynomial in 1/ε
Goldreich-Levin No Finds heavy coefficients
Influence estimation No O(n/ε²) samples
p-biased analysis No Monte Carlo
Noise stability No Monte Carlo
Single evaluation No 1 query
Exact Fourier transform Yes Needs all 2^n values
Full truth table Yes By definition
Is junta (exact) Yes Needs to check all

When you create a function from a callable, BooFun is lazy - it only computes the truth table if you explicitly request it or call an analysis that requires it.

In [17]:
# Demonstration: lazy evaluation

query_count = 0

def tracked_fn(x):
    global query_count
    query_count += 1
    return x[0] and x[1]

f = bf.create(tracked_fn, n=3)
print(f"After creation: {query_count} queries")

# Single evaluation - 1 query
f.evaluate([1, 1, 0])
print(f"After 1 evaluation: {query_count} queries")

# Requesting truth table materializes it - 2^n queries
tt = f.get_representation('truth_table')
print(f"After truth table: {query_count} queries (2^3 = 8)")
After creation: 0 queries
After 1 evaluation: 1 queries
After truth table: 9 queries (2^3 = 8)

Summary¶

Flexible creation - pass data in any form:

  • Lists, NumPy arrays → truth table
  • Callables → query oracle
  • Strings → symbolic
  • Dicts → polynomial
  • Files → load from disk

Flexible evaluation - query with any format:

  • Integer index, lists, tuples, NumPy arrays, batches

Oracle pattern - for huge functions:

  • Create from callable
  • Use sampling-based analyses
  • Avoid materializing 2^n truth table