Problem Set 2: LTFs, Decision Trees, and Influences¶
CS 294-92: Analysis of Boolean Functions - Spring 2025
Due: Monday, February 24, 11:59PM
Notebook by: Gabriel Taboada
This notebook provides a computational exploration of the concepts in HW2:
- Majority as the unique symmetric, monotone, odd voting rule
- Linear Threshold Functions (LTFs) and influence ordering
- Decision tree Fourier concentration
- Influence characterization via resampling
- Goldreich-Levin and pseudorandom generators
# Install/upgrade boofun (required for Colab)
# This ensures you have the latest version with all features
!pip install --upgrade boofun -q
import boofun as bf
print(f"BooFun version: {bf.__version__}")
BooFun version: 1.1.1
import numpy as np
import boofun as bf
from boofun.analysis import PropertyTester, ltf_analysis
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')
np.random.seed(42)
Problem 1: Majority is Unique¶
Statement: For odd $n$, the only voting rule that is symmetric, monotone, and odd is Majority.
- Symmetric: $f(x_{\pi(1)}, \ldots, x_{\pi(n)}) = f(x_1, \ldots, x_n)$ for all permutations $\pi$
- Monotone: $x \leq y \Rightarrow f(x) \leq f(y)$
- Odd: $f(-x) = -f(x)$
Let's verify computationally that Majority satisfies all three, and that other candidates don't:
def is_symmetric(f):
    """Check if f is symmetric (invariant under permutations)."""
    n = f.n_vars
    tt = np.asarray(f.get_representation("truth_table"))
    # For symmetric functions, output depends only on Hamming weight
    for i in range(2**n):
        bits = [int(b) for b in format(i, f'0{n}b')]
        weight = sum(bits)
        # Find another input with same weight
        for j in range(i+1, 2**n):
            bits_j = [int(b) for b in format(j, f'0{n}b')]
            if sum(bits_j) == weight and tt[i] != tt[j]:
                return False
    return True
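The pairwise scan above is O(4ⁿ). As an aside, an equivalent O(2ⁿ) check groups truth-table entries by Hamming weight and requires each group to be constant. This is a plain-NumPy sketch independent of `boofun`; `is_symmetric_fast` is a hypothetical helper name.

```python
import numpy as np

def is_symmetric_fast(tt):
    """O(2^n) symmetry check: a function is symmetric iff its truth table
    is constant on each Hamming-weight level."""
    tt = np.asarray(tt)
    n = int(np.log2(len(tt)))
    weights = np.array([bin(i).count("1") for i in range(2**n)])
    return all(len(set(tt[weights == w])) <= 1 for w in range(n + 1))

# Majority on 3 bits, truth table indexed by the integer encoding of the input
maj3 = [int(bin(i).count("1") >= 2) for i in range(8)]
print(is_symmetric_fast(maj3))                      # True
print(is_symmetric_fast([0, 1, 0, 0, 0, 0, 0, 0]))  # False: depends on which bit is set
```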
def is_odd(f):
    """Check if f(-x) = -f(x), i.e., f is balanced and anti-self-dual."""
    n = f.n_vars
    tt = np.asarray(f.get_representation("truth_table"))
    for i in range(2**n):
        neg_i = 2**n - 1 - i  # Flip all bits
        # In {0,1}: f(x) ≠ f(~x) means odd in ±1
        if tt[i] == tt[neg_i]:
            return False
    return True
# Test various n=5 functions
n = 5
functions = {
    "Majority₅": bf.majority(n),
    "Parity₅": bf.parity(n),
    "AND₅": bf.AND(n),
    "OR₅": bf.OR(n),
    "Dictator₅": bf.dictator(n, 0),
    "Threshold(3)": bf.threshold(n, 3),  # At least 3 of 5 true -- this IS Majority₅
}
print("Testing symmetric, monotone, odd properties for n=5:")
print(f"{'Function':<15} | {'Symmetric':<10} | {'Monotone':<10} | {'Odd':<10}")
print("-" * 55)
for name, f in functions.items():
    sym = is_symmetric(f)
    mono = f.is_monotone()
    odd = is_odd(f)
    all_three = "[ALL THREE]" if (sym and mono and odd) else ""
    print(f"{name:<15} | {str(sym):<10} | {str(mono):<10} | {str(odd):<10} {all_three}")
Testing symmetric, monotone, odd properties for n=5: Function | Symmetric | Monotone | Odd ------------------------------------------------------- Majority₅ | True | True | True [ALL THREE] Parity₅ | True | False | True AND₅ | True | True | False OR₅ | True | True | False Dictator₅ | False | True | True Threshold(3) | True | True | True [ALL THREE]
# Now we can use bf.weighted_majority() directly!
# Example: weights [5, 3, 1] -- note 5 > 3 + 1, so x_1 alone decides the outcome
weights = [5, 3, 1]
f = bf.weighted_majority(weights)
# Direct API: f.influences()
influences = f.influences()
print("LTF with weights [5, 3, 1]:")
print(f"{'Variable':<10} | {'|Weight|':<10} | {'Influence':<10}")
print("-" * 35)
for i, (w, inf) in enumerate(zip(weights, influences)):
    print(f"x_{i+1:<8} | {abs(w):<10} | {inf:<10.4f}")
# Verify ordering
sorted_by_weight = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
sorted_by_inf = sorted(range(len(influences)), key=lambda i: -influences[i])
print(f"\nOrdering by |weight|: {sorted_by_weight}")
print(f"Ordering by influence: {sorted_by_inf}")
print(f"Same ordering: {'yes' if sorted_by_weight == sorted_by_inf else 'no'}")
LTF with weights [5, 3, 1]: Variable | |Weight| | Influence ----------------------------------- x_1 | 5 | 1.0000 x_2 | 3 | 0.0000 x_3 | 1 | 0.0000 Ordering by |weight|: [0, 1, 2] Ordering by influence: [0, 1, 2] Same ordering: yes
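The output above shows x₁ with influence 1 and the others 0: since 5 > 3 + 1, the first voter's weight can never be outvoted, so this LTF is actually a dictatorship on x₁. A brute-force check in plain Python (no `boofun`) confirms it:

```python
from itertools import product

w = [5, 3, 1]
# sign(5*x1 + 3*x2 + x3) over x in {-1,+1}^3: because 5 > 3 + 1, the sign
# of the weighted sum always matches the sign of x1.
agrees_with_x1 = all(
    (w[0]*x[0] + w[1]*x[1] + w[2]*x[2] > 0) == (x[0] > 0)
    for x in product([-1, 1], repeat=3)
)
print(agrees_with_x1)  # True
```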
Part (b): Nassau County Voting System¶
The Nassau County Board has 6 districts with voting weights:
| District | Votes |
|---|---|
| Hempstead #1 | 31 |
| Hempstead #2 | 31 |
| Oyster Bay | 28 |
| North Hempstead | 21 |
| Long Beach | 2 |
| Glen Cove | 2 |
The function is $f(x) = \text{sign}(31x_1 + 31x_2 + 28x_3 + 21x_4 + 2x_5 + 2x_6)$
Let's compute the influences!
# Nassau County voting system - now using bf.weighted_majority()
weights = [31, 31, 28, 21, 2, 2]
districts = ["Hempstead #1", "Hempstead #2", "Oyster Bay",
"North Hempstead", "Long Beach", "Glen Cove"]
nassau = bf.weighted_majority(weights)
# Full LTF analysis using the new module
analysis = ltf_analysis.analyze_ltf(nassau)
print(analysis.summary())
print("\n" + "-" * 40)
print("\nInfluences by District:")
# Direct API: f.influences()
influences = nassau.influences()
for district, votes, inf in zip(districts, weights, influences):
    status = "[ok]" if inf > 0.001 else "[DUMMY]"
    print(f" {district:<16} | {votes:>3} votes | Inf={inf:.3f} {status}")
# Use library function to find dummy voters
print("\n⚠️ DUMMY VOTERS (zero influence):")
dummies = ltf_analysis.dummy_voters(nassau)
print(f" {[districts[i] for i in dummies]}")
print(" These voters can NEVER change the outcome!")
LTF Analysis ======================================== Is LTF: True Weights: [-0. -0. -0. -2. -2. -2.] Threshold: -3.0000 Chow params: [ 0. 0. 0. 0. -0.5 -0.5 -0.5] Critical index: 2 Regularity τ: 0.5774 ---------------------------------------- Influences by District: Hempstead #1 | 31 votes | Inf=0.500 [ok] Hempstead #2 | 31 votes | Inf=0.500 [ok] Oyster Bay | 28 votes | Inf=0.500 [ok] North Hempstead | 21 votes | Inf=0.000 [DUMMY] Long Beach | 2 votes | Inf=0.000 [DUMMY] Glen Cove | 2 votes | Inf=0.000 [DUMMY] ⚠️ DUMMY VOTERS (zero influence): ['North Hempstead', 'Long Beach', 'Glen Cove'] These voters can NEVER change the outcome!
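The dummy-voter result can be cross-checked without `boofun` by counting pivots directly (this is the raw Banzhaf count). A voter is pivotal in a vote assignment if the coalition of "yes" voters wins with them and loses without them; zero pivots means a dummy. This sketch assumes a strict-majority quota of the 115 total votes:

```python
from itertools import product

weights = [31, 31, 28, 21, 2, 2]
quota = sum(weights) / 2  # strict majority of 115 votes

# For each of the 2^6 vote assignments, count the voters whose "yes" vote
# is pivotal (coalition wins with them, loses without them).
pivots = [0] * len(weights)
for votes in product([0, 1], repeat=len(weights)):
    total = sum(w for w, v in zip(weights, votes) if v)
    for i, v in enumerate(votes):
        if v and total > quota and total - weights[i] <= quota:
            pivots[i] += 1

# Inf_i = pivots[i] / 2^(n-1): 16/32 = 0.5 for the three big districts
print(pivots)  # [16, 16, 16, 0, 0, 0]
```

No single district reaches the quota alone, but any two of {Hempstead #1, Hempstead #2, Oyster Bay} do, so only those three ever matter.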
LTF Theory: Not All Functions are LTFs!¶
Key insight: LTFs are exactly the linearly separable functions.
XOR/Parity is famously NOT an LTF - there's no hyperplane that separates inputs with even parity from those with odd parity.
# Test which functions are LTFs
test_fns = {
"AND₃": bf.AND(3),
"OR₃": bf.OR(3),
"Majority₃": bf.majority(3),
"Parity₃ (XOR)": bf.parity(3),
"Threshold(4,2)": bf.threshold(4, 2),
"Dictator": bf.dictator(3, 0),
}
print("Which functions are LTFs?")
print("-" * 40)
for name, f in test_fns.items():
    is_ltf = ltf_analysis.is_ltf(f)
    status = "LTF" if is_ltf else "not LTF"
    print(f" {name:<18}: {status}")
Which functions are LTFs? ---------------------------------------- AND₃ : LTF OR₃ : LTF Majority₃ : LTF Parity₃ (XOR) : not LTF Threshold(4,2) : LTF Dictator : LTF
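Why Parity fails is worth seeing concretely. For XOR on two ±1 inputs, the four separating inequalities are jointly infeasible: adding the two "disagree" constraints gives t < 0 while the two "agree" constraints give t > 0. A small exhaustive search (an illustrative sketch, not part of the library) finds no realization on an integer grid either:

```python
from itertools import product

def realizes_xor(w1, w2, t):
    # XOR convention here: output +1 when the two ±1 inputs disagree.
    # An LTF sign(w1*x1 + w2*x2 - t) realizing it would need:
    #   w1 + w2 < t and -w1 - w2 < t   (agree inputs below threshold)
    #   w1 - w2 > t and -w1 + w2 > t   (disagree inputs above threshold)
    # Summing each pair forces t > 0 and t < 0 simultaneously -- impossible.
    pts = {(1, 1): -1, (-1, -1): -1, (1, -1): 1, (-1, 1): 1}
    return all((w1*x1 + w2*x2 - t > 0) == (y > 0) for (x1, x2), y in pts.items())

found = any(realizes_xor(w1, w2, t)
            for w1, w2, t in product(range(-5, 6), repeat=3))
print(found)  # False: no hyperplane separates XOR
```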
Problem 3: Decision Tree Fourier Concentration¶
Key results about decision trees $T$ of depth $d$ and size $s$:
(a) At most $4^d$ non-zero Fourier coefficients
(b) $\sum_{S} |\hat{T}(S)| \leq s$ (spectral norm ≤ size)
(c) Fourier spectrum is $\varepsilon$-concentrated on at most $(s/\varepsilon)^2$ coefficients
(d) Fourier spectrum is $4\varepsilon$-concentrated up to degree $\log_2(s/\varepsilon)$
Let's verify these with examples:
# Functions computable by small decision trees.
# Here "size" = number of leaves: AND_n has depth n and size n+1
# (one leaf per variable that can falsify it, plus the all-true leaf),
# while Parity_n needs a complete tree of size 2^n.
# Create functions and analyze
test_fns = {
    "AND₃ (depth 3, size 4)": (bf.AND(3), 3, 4),
    "OR₃ (depth 3, size 4)": (bf.OR(3), 3, 4),
    "Parity₃ (depth 3, size 2³)": (bf.parity(3), 3, 8),
    "Majority₃ (depth 3, size ≤8)": (bf.majority(3), 3, 8),
}
print("Decision Tree Fourier Analysis:")
print(f"{'Function':<25} | {'Depth d':<8} | {'Size s':<8} | {'#Non-zero':<10} | {'≤4^d?':<7} | {'Σ|f̂|':<8} | {'≤s?'}")
print("-" * 90)
for name, (f, depth, size) in test_fns.items():
    # Direct API: f.fourier()
    fourier = f.fourier()
    # Count non-zero coefficients
    nonzero = sum(1 for c in fourier if abs(c) > 1e-10)
    bound_4d = 4**depth
    # Sum of absolute values
    l1_norm = sum(abs(c) for c in fourier)
    check_4d = "ok" if nonzero <= bound_4d else "fail"
    check_s = "ok" if l1_norm <= size + 0.01 else "fail"
    print(f"{name:<25} | {depth:<8} | {size:<8} | {nonzero:<10} | {check_4d:<7} | {l1_norm:<8.3f} | {check_s}")
Decision Tree Fourier Analysis: Function | Depth d | Size s | #Non-zero | ≤4^d? | Σ|f̂| | ≤s? ------------------------------------------------------------------------------------------ AND₃ (depth 3, size 4) | 3 | 4 | 8 | ok | 2.500 | ok OR₃ (depth 3, size 4) | 3 | 4 | 8 | ok | 2.500 | ok Parity₃ (depth 3, size 2³) | 3 | 8 | 1 | ok | 1.000 | ok Majority₃ (depth 3, size ≤8) | 3 | 8 | 4 | ok | 2.000 | ok
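The Majority₃ row (4 non-zero coefficients, Σ|f̂| = 2.0) can be cross-checked without `boofun` by computing the Fourier expansion directly from the definition f̂(S) = E[f(x)·χ_S(x)]. The brute-force helper `fourier_coeffs` below is illustrative only (O(4ⁿ), fine for n = 3):

```python
import numpy as np
from itertools import product

def fourier_coeffs(f_pm, n):
    """Brute-force Fourier coefficients f_hat(S) = E[f(x) chi_S(x)] for f
    given as ±1 values in the order of itertools.product([1, -1], repeat=n)."""
    inputs = list(product([1, -1], repeat=n))
    f_pm = np.array(f_pm)
    coeffs = {}
    for S in product([0, 1], repeat=n):
        # chi_S(x) = product of x_i over i in S (empty product = 1)
        chi = np.array([np.prod([x[i] for i in range(n) if S[i]]) for x in inputs])
        coeffs[S] = float(np.mean(f_pm * chi))
    return coeffs

inputs = list(product([1, -1], repeat=3))
maj3 = [1 if sum(x) > 0 else -1 for x in inputs]
coeffs = fourier_coeffs(maj3, 3)

nonzero = sum(1 for c in coeffs.values() if abs(c) > 1e-10)
l1 = sum(abs(c) for c in coeffs.values())
print(nonzero, l1)  # 4 2.0
```

This matches Maj₃ = (x₁ + x₂ + x₃ − x₁x₂x₃)/2: three coefficients of 1/2 and one of −1/2.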
Problem 4: Influence via Resampling¶
Alternative characterizations of influence and variance:
(a) $\mathbf{Inf}_i[f] = 2 \cdot \Pr_{X, X^i}[f(X) \neq f(X^i)]$
where $X^i$ is $X$ with the $i$-th coordinate resampled independently.
(b) $\mathbf{Var}[f] = 2 \cdot \Pr_{X,Y}[f(X) \neq f(Y)]$
where $X, Y$ are independent uniform.
(c) Poincaré inequality via hybrids: $\mathbf{Var}[f] \leq \mathbf{I}[f]$
def influence_via_resampling(f, i, trials=10000):
    """Estimate Inf_i[f] = 2 * Pr[f(X) ≠ f(X^i)] by Monte Carlo."""
    n = f.n_vars
    disagree = 0
    for _ in range(trials):
        x = np.random.randint(0, 2, n)
        x_i = x.copy()
        x_i[i] = np.random.randint(0, 2)  # Resample i-th coordinate
        if f.evaluate(x) != f.evaluate(x_i):
            disagree += 1
    return 2 * disagree / trials
def variance_via_disagreement(f, trials=10000):
    """Estimate Var[f] = 2 * Pr[f(X) ≠ f(Y)] by Monte Carlo."""
    n = f.n_vars
    disagree = 0
    for _ in range(trials):
        x = np.random.randint(0, 2, n)
        y = np.random.randint(0, 2, n)
        if f.evaluate(x) != f.evaluate(y):
            disagree += 1
    return 2 * disagree / trials
# Test on Majority
f = bf.majority(5)
# Direct API: f.fourier() and f.influences()
fourier = f.fourier()
true_influences = f.influences()
true_var = 1 - fourier[0]**2
print("Influence via Resampling (Majority₅):")
print(f"{'Variable':<10} | {'True Inf':<12} | {'Via Resample':<12}")
print("-" * 40)
for i in range(f.n_vars):
    resample_inf = influence_via_resampling(f, i)
    print(f"x_{i+1:<8} | {true_influences[i]:<12.4f} | {resample_inf:<12.4f}")
print(f"\nVariance:")
print(f" True: {true_var:.4f}")
print(f" Via disagreement: {variance_via_disagreement(f):.4f}")
# Poincaré inequality
total_inf = sum(true_influences)
print(f"\nPoincaré inequality: Var ≤ I[f]")
print(f" {true_var:.4f} <= {total_inf:.4f} ? {'yes' if true_var <= total_inf + 0.01 else 'no'}")
Influence via Resampling (Majority₅): Variable | True Inf | Via Resample ----------------------------------------
x_1 | 0.3750 | 0.3708
x_2 | 0.3750 | 0.3522
x_3 | 0.3750 | 0.3816
x_4 | 0.3750 | 0.3690
x_5 | 0.3750 | 0.3788 Variance: True: 1.0000 Via disagreement: 1.0026 Poincaré inequality: Var ≤ I[f] 1.0000 <= 1.8750 ? yes
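The hybrid argument behind (c) can itself be simulated. Walk from X to Y one coordinate at a time through hybrids Z⁰ = X, …, Zⁿ = Y; since Zⁱ⁻¹ and Zⁱ differ only in coordinate i, Pr[f(Zⁱ⁻¹) ≠ f(Zⁱ)] = Inf_i[f]/2, and the union bound gives Var[f] = 2·Pr[f(X) ≠ f(Y)] ≤ Σᵢ Inf_i[f] = I[f]. A self-contained sketch on Majority₅ (plain NumPy, with a hypothetical `maj5` helper, not the library's API):

```python
import numpy as np

rng = np.random.default_rng(0)

def maj5(x):
    """Majority on 5 bits in {0,1}, output in {-1, +1}."""
    return 1 if x.sum() >= 3 else -1

n, trials = 5, 20000
hybrid_terms = np.zeros(n)
disagree = 0
for _ in range(trials):
    x = rng.integers(0, 2, n)
    y = rng.integers(0, 2, n)
    if maj5(x) != maj5(y):
        disagree += 1
    # Hybrid chain: replace X's coordinates with Y's one at a time and
    # record each time the function value flips.
    z = x.copy()
    for i in range(n):
        prev = maj5(z)
        z[i] = y[i]
        if maj5(z) != prev:
            hybrid_terms[i] += 1

var_est = 2 * disagree / trials                  # estimates Var[f] (≈ 1.0)
total_inf_est = 2 * hybrid_terms.sum() / trials  # estimates I[f] (≈ 1.875)
print(var_est <= total_inf_est)  # True: Poincaré holds with room to spare
```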
Problem 5: Goldreich-Levin and Pseudorandom Generators¶
The Goldreich-Levin Theorem gives a hardcore predicate for any one-way function; applied to a one-way permutation, it yields a pseudorandom generator.
Given:
- One-way permutation $f: \mathbb{F}_2^n \to \mathbb{F}_2^n$
- PRG construction: $g(r, s) = (r, f(s), r \cdot s)$
Key insight: The inner product $r \cdot s = \sum_i r_i s_i \mod 2$ (a.k.a. parity) serves as a hardcore predicate.
If an adversary can predict $r \cdot s$ from $(r, f(s))$ with advantage $\gamma$, then Goldreich-Levin can invert $f$ with non-negligible probability!
# Demonstrate connection to Fourier analysis
# For fixed s, the function A_y(x) = A(x, y) where A predicts r·s
# The Goldreich-Levin algorithm finds s by looking for heavy Fourier coefficients
from boofun.analysis.learning import goldreich_levin
# Create a "secret" s
n = 6
s = np.array([1, 0, 1, 1, 0, 1]) # Secret vector
# The inner product function χ_s(r) = (-1)^{r·s}
def inner_product_with_s(r):
    return sum(r * s) % 2
chi_s = bf.create(inner_product_with_s, n=n)
# This is exactly χ_S in Fourier terms!
# The only heavy coefficient should be at S = {indices where s_i = 1}
# Direct API: f.fourier()
fourier = chi_s.fourier()
print(f"Secret s = {s}")
print(f"S = {{{', '.join(str(i) for i, v in enumerate(s) if v == 1)}}}")
# Find heavy Fourier coefficients
heavy = [(i, c) for i, c in enumerate(fourier) if abs(c) > 0.5]
print(f"\nHeavy Fourier coefficient at index: {heavy}")
# Use Goldreich-Levin to find heavy coefficients
print("\nUsing Goldreich-Levin algorithm:")
recovered = goldreich_levin(chi_s, threshold=0.1)
print(f"Recovered coefficient: {recovered}")
Secret s = [1 0 1 1 0 1]
S = {0, 2, 3, 5}
Heavy Fourier coefficient at index: [(45, 1.0)]
Using Goldreich-Levin algorithm:
Recovered coefficient: [(45, 1.0)]
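What makes the algorithm work is that each coefficient f̂(S) = E[f(x)·χ_S(x)] can be estimated from random samples alone; Goldreich-Levin is a clever search that avoids estimating all 2ⁿ of them. This sketch shows only the estimation primitive, in plain NumPy (the helper names are illustrative, not `boofun` API):

```python
import numpy as np

rng = np.random.default_rng(1)
s = np.array([1, 0, 1, 1, 0, 1])  # the secret from the cell above

def chi_s_pm(x):
    """chi_s(x) = (-1)^(x·s) for x in {0,1}^6, output in {-1, +1}."""
    return 1 - 2 * (int(x @ s) % 2)

def estimate_coeff(f, S, samples=4000):
    """Estimate f_hat(S) = E[f(x) chi_S(x)] by sampling -- the basic query
    primitive that the Goldreich-Levin search is built on."""
    total = 0.0
    for _ in range(samples):
        x = rng.integers(0, 2, len(S))
        chi = 1 - 2 * (int(x @ S) % 2)
        total += f(x) * chi
    return total / samples

print(estimate_coeff(chi_s_pm, s))  # 1.0 exactly: S = s is the heavy coefficient
print(abs(estimate_coeff(chi_s_pm, np.zeros(6, dtype=int))) < 0.1)  # True: light
```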
Summary¶
Key Takeaways from HW2:¶
Majority is Unique: The only symmetric, monotone, odd voting rule
LTF Influences:
- Larger weight → larger influence
- Nassau County: Some voters have zero influence (dummy voters!)
- Not all functions are LTFs (XOR is NOT an LTF!)
Decision Tree Fourier:
- Depth $d$ → at most $4^d$ non-zero coefficients
- Size $s$ → $\sum_S |\hat{T}(S)| \leq s$
- Strong spectral concentration
Influence via Resampling:
- $\mathbf{Inf}_i[f] = 2 \cdot \Pr[f(X) \neq f(X^i)]$
- Leads to Poincaré inequality: $\mathbf{Var}[f] \leq \mathbf{I}[f]$
Goldreich-Levin:
- Heavy Fourier coefficients reveal secrets
- Connection to PRGs and cryptographic hardness
Using boofun:¶
import boofun as bf
from boofun.analysis import ltf_analysis
# Create LTF directly with library function
nassau = bf.weighted_majority([31, 31, 28, 21, 2, 2])
at_least_3 = bf.threshold(5, 3)
# Full LTF analysis
analysis = ltf_analysis.analyze_ltf(nassau)
print(analysis.summary())
# Find dummy voters
dummies = ltf_analysis.dummy_voters(nassau)
# Test if a function is an LTF
is_ltf = ltf_analysis.is_ltf(bf.parity(3)) # False!
# Chow parameters uniquely identify LTFs
chow = ltf_analysis.chow_parameters(nassau)
# Goldreich-Levin: find heavy Fourier coefficients
from boofun.analysis.learning import goldreich_levin
heavy = goldreich_levin(f, threshold=0.1)  # f: any boofun function