perf: lower Newton balanced band ratio from 2/1 to 4/3 #88
Merged
The wraparound-Newton PRs (#85-#87) cut Newton's constant ~35-45%, which moved the generic BZ/Newton crossover in the near-balanced band from ratio ~2 down to ~1.3 (measured at nb 100k-200k limbs: Newton wins from 1.3-1.35 up; at ratio 1.5 BZ is 1.7-1.9x slower, at 1.95 up to 3.4x; 2^k+1-family divisor sizes are 68x worse on BZ at ratio 1.5).

Lower NEWTON_BALANCED from 2/1 to 4/3. Exact-power-of-two divisors (BZ's best case) regress <= ~25% in the narrow (4/3, ~1.4) sliver - same accepted tradeoff as PR #79.

Measured after the change (M1 Max, min of 3):
- nb=160000 limbs ratio 1.5: 244 -> 165 ms
- nb=100000 limbs ratio 1.4: 201 -> 126 ms
- nb=131073 (2^17+1) ratio 1.5: 10.7 s -> 157 ms

Known residual: ratio in (1, 4/3) still routes to BZ. Generic sizes are genuinely faster there, but 2^k+1-family sizes still blow up (1.1-5.4 s at nb=131073, ratios 1.05-1.25). The follow-up fix is quotient-sized division, which scales with the quotient instead of the divisor.

246 unit tests + div_correctness pass. Docs updated (DIVISION.md, CLAUDE.md).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
What
Re-measure of the near-balanced division band after #85–#87 cut Newton's constant: the generic BZ/Newton crossover moved from ratio ~2 down to ~1.3. Lower NEWTON_BALANCED from 2/1 → 4/3 (same b ≥ 98304 limbs floor).
Measurements (M1 Max, min of 3, paired)
Power-of-two divisors (BZ's best case) regress ≤ ~25% in the narrow (4/3, ~1.4) sliver — same accepted tradeoff as PR #79.
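The routing rule this PR adjusts can be sketched roughly as below. This is an illustration only, not the project's code: the function name, the `NEWTON_MIN_LIMBS` name, and the return strings are hypothetical, "ratio" is assumed to mean dividend limbs over divisor limbs, and the boundary at exactly 4/3 is assumed to be inclusive. The comparison is done in integers so that the 4/3 boundary is not subject to floating-point rounding.

```python
from fractions import Fraction

# Values taken from the PR text; the names are hypothetical.
NEWTON_BALANCED = Fraction(4, 3)  # was Fraction(2, 1) before this change
NEWTON_MIN_LIMBS = 98304          # divisor-size floor, unchanged by this PR


def pick_balanced_algorithm(na: int, nb: int) -> str:
    """Pick "newton" or "bz" for a near-balanced division.

    na: dividend size in limbs, nb: divisor size in limbs (assumed na >= nb > 0).
    Only the near-balanced band is modelled; routing for larger ratios
    is outside the scope of this sketch.
    """
    if nb <= 0 or na < nb:
        raise ValueError("expected na >= nb > 0")
    # na / nb >= p / q  <=>  na * q >= nb * p  (all values positive)
    p, q = NEWTON_BALANCED.numerator, NEWTON_BALANCED.denominator
    ratio_ok = na * q >= nb * p
    if nb >= NEWTON_MIN_LIMBS and ratio_ok:
        return "newton"
    return "bz"
```

Under these assumptions, `pick_balanced_algorithm(240000, 160000)` (ratio 1.5) returns `"newton"`, while a ratio of 1.25 at the same divisor size still returns `"bz"`, matching the residual described in this PR.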
Known residual (documented)
ratio ∈ (1, 4/3) still routes to BZ: generic sizes are genuinely faster there (BZ wins below the crossover), but 2^k+1-family divisor sizes still hit the transform-doubling blowup (1.1–5.4 s at nb=131073, ratios 1.05–1.25). Follow-up: quotient-sized division (next PR).
Testing
246/246 unit tests, div_correctness (cross-checks + q·b + r == a, r < b).
🤖 Generated with Claude Code
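For readers unfamiliar with the invariant that div_correctness is described as checking, here is a minimal sketch using Python's built-in integers as a stand-in for the library under test. The helper names, the 64-bit limb width, and the operand sizes are assumptions made for illustration; the actual test harness is not shown in this PR.

```python
import random

LIMB_BITS = 64  # assumed limb width, for illustration only


def check_division(a: int, b: int, q: int, r: int) -> bool:
    """Return True iff (q, r) is the quotient and remainder of a / b.

    Both conditions are needed: q*b + r == a alone also accepts
    (q - 1, r + b), which the bound r < b rules out.
    """
    return q * b + r == a and 0 <= r < b


def random_operand(n_limbs: int, rng: random.Random) -> int:
    """Random integer that occupies exactly n_limbs limbs (top bit forced on)."""
    bits = n_limbs * LIMB_BITS
    return rng.getrandbits(bits) | (1 << (bits - 1))


def count_failures(trials: int, na: int, nb: int, seed: int = 0) -> int:
    """Run `trials` random divisions and count invariant violations."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        a, b = random_operand(na, rng), random_operand(nb, rng)
        q, r = divmod(a, b)  # stand-in for the division routine under test
        if not check_division(a, b, q, r):
            failures += 1
    return failures
```

In a real harness, `divmod` would be replaced by the routine being tested, and the sizes would be chosen to straddle the BZ/Newton boundary.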