BooFun Style Guide
Formatting is handled by CI. This is about how we write code.
Principles
Principle |
Meaning |
|---|---|
KISS |
Keep It Simple. No cleverness. |
DRY |
Don’t Repeat Yourself. One source of truth. |
Fail Loud |
Errors should scream, not whisper. |
Functional |
Pure functions. No mutation. |
Domain/Codomain |
Always document what goes in and out. |
Deterministic |
Same inputs → same outputs. Always. |
When Principles Conflict
Use this precedence:
Correctness → Determinism → Clarity (KISS) → DRY → Performance
Example: a cache improves performance and avoids recomputation (DRY), but if it makes behavior harder to reason about (KISS) or introduces nondeterminism, skip it.
KISS - Keep It Simple
# Good: Direct and obvious
def is_balanced(f):
return f.fourier()[0] == 0
# Bad: Over-engineered
class BalanceCheckerFactory:
def create_checker(self, strategy): ...
Rule: If a junior dev can’t understand it in 30 seconds, simplify.
DRY - Don’t Repeat Yourself
# Bad: Logic duplicated
def test_majority_3():
f = bf.majority(3)
assert sum(f.fourier()**2) == 1
def test_majority_5():
f = bf.majority(5)
assert sum(f.fourier()**2) == 1 # Same check!
# Good: One function, parameterized
@pytest.mark.parametrize("n", [3, 5, 7])
def test_parseval(n):
f = bf.majority(n)
assert np.isclose(sum(f.fourier()**2), 1)
Rule: If you copy-paste, you’re doing it wrong.
Fail Loud
# Good: Immediate, clear error
if f.n_vars != g.n_vars:
raise ValueError(f"Mismatched vars: {f.n_vars} vs {g.n_vars}")
# Bad: Silent "fix"
n = min(f.n_vars, g.n_vars) # Hides bug!
# Bad: Silent data loss
result = (values < 0).astype(bool) # Destroys magnitude!
Rule: Never silently coerce, truncate, or threshold.
assert vs raise
In library code, use explicit raise ValueError(...) for validation. Python’s -O flag strips assert statements, so assert is not a reliable guard in production.
In test code, use plain assert freely — pytest rewrites assertions for helpful error output.
# Good: library validation
if n <= 0:
raise ValueError(f"n must be positive, got {n}")
# Good: test assertion
assert np.isclose(result, expected), f"Got {result}, expected {expected}"
# Bad: library validation via assert (stripped by -O)
assert n > 0 # Disappears in optimized mode!
Exception chaining
When translating low-level exceptions into domain exceptions, always chain with from:
# Good: preserves causal context
try:
truth_table = f.get_representation("truth_table")
except KeyError as e:
raise ConversionError(f"Cannot get truth table for {f}") from e
# Bad: loses the original traceback
except KeyError:
raise ConversionError("Cannot get truth table")
Functional
# Good: Returns new object
def negate(f):
return bf.create([1-v for v in f.truth_table])
# Bad: Mutates input
def negate(f):
f.truth_table = [1-v for v in f.truth_table] # Side effect!
Rule: Same input → same output. No surprises.
Domain/Codomain
Every function documents its contract:
def convolution(f: BooleanFunction, g: BooleanFunction) -> np.ndarray:
"""
Fourier coefficients of f*g.
Domain: f, g with same n_vars
Codomain: array of reals (NOT a BooleanFunction!)
Raises: ValueError if n_vars mismatch
"""
Rule: Reader should know types without reading the code.
Rule: Public functions MUST have type hints on all parameters and return values.
# Good: types are part of the interface
def noise_stability(f: BooleanFunction, rho: float) -> float: ...
# Bad: reader has to guess
def noise_stability(f, rho): ...
Determinism
Monte Carlo results must be reproducible. Randomness is a controlled input, not an accident.
# Good: caller controls the seed
def estimate_influence(f, i, n_samples, rng=None):
rng = rng or np.random.default_rng()
...
# Good: notebook sets seed once at the top
np.random.seed(42)
# Bad: hidden nondeterminism
def estimate_influence(f, i, n_samples):
samples = np.random.randint(...) # Which seed? Unknown!
Rule: Any function that uses randomness SHOULD accept an rng parameter (or seed). Notebooks MUST set a seed at the top.
Bit-Ordering Convention
BooFun uses LSB = x₀ everywhere. This must be consistent across all code.
# Index i maps to bits via: x_j = (i >> j) & 1
# Index 5 = 0b101 means x₀=1, x₁=0, x₂=1
# Good: LSB-first
x = [(i >> j) & 1 for j in range(n)]
# Bad: MSB-first (common bug!)
x = [int(b) for b in format(i, f"0{n}b")] # WRONG ORDER
Rule: When building truth tables from index → bit vectors, always use (i >> j) & 1.
See CONTRIBUTING.md for the full specification.
Quick Test
Before committing, ask:
KISS: Can I explain this to a rubber duck?
DRY: Did I copy-paste anything?
Fail Loud: What happens with bad input?
Functional: Does this mutate anything?
Domain/Codomain: Are types obvious from the signature?
If any answer is wrong, fix it.
Documentation & Notebooks
Tone: Clear, mathematical, concise. Let examples speak for themselves.
# Bad: Self-promotion
"The POWER of the library: test ANY function!"
"This incredibly useful feature..."
# Good: Just show it
# Testing user-defined functions
def my_hash(x): ...
f = bf.create(my_hash, n=5)
tester.blr_linearity_test(f) # Works on any function
Rules:
No emojis in technical content
No “power of” / “incredibly” / “amazing” language
Quality is self-evident from examples, not claims
Prefer equations and code over prose
Keep notebooks short — one concept per section