Skip to content

[spherical] Compression tweak and fix threadpool in benchmark#1137

Open
arkrishn94 wants to merge 1 commit into
mainfrom
u/adkrishnan/spherical-compress-perf
Open

[spherical] Compression tweak and fix threadpool in benchmark#1137
arkrishn94 wants to merge 1 commit into
mainfrom
u/adkrishnan/spherical-compress-perf

Conversation

@arkrishn94
Copy link
Copy Markdown
Contributor

@arkrishn94 arkrishn94 commented Jun 5, 2026

Some performance changes to maximize_cosine_similarity subroutine for spherical quantization (plus fix to the threadpool setup for exhasutive benchmark).

  • maximize_cosine_similarity: replace the min-heap with a flat buffer of all critical values that is sorted once and walked linearly. The visit order is identical to before. Also drop the sqrt computation using some simple algebra.
    The recall is identical to main for real datasets - 384 and 896 dimensional dense text embeddings.
  • Wrap Store::new in a rayon::ThreadPoolBuilder sized by the configured compression_threads, mirroring the exhaustive-product harness. Without this the benchmark always silently used the global pool.

Benchmarks

query_layouts = ["same_as_data"], transform_kind = "null", pre_scale = "none", recall@10/10.

(D = 384, N = 1M, 1,000 queries)

| NBITS | recall (%) | compress main | compress this PR | speedup |
| ---: | ---: | ---: | ---: | ---: |
| 1 | 46.33 |   0.14 s |   0.13 s | 1.08× |
| 2 | 72.22 |   2.27 s |   1.10 s | 2.06× |
| 4 | 91.20 |  12.53 s |   7.54 s | 1.66× |
| 8 | 99.32 | 230.67 s | 151.82 s | 1.52× |

(D = 896, N = 1M, 6,380 queries)

| NBITS | recall (%) | compress main | compress this PR | speedup |
| ---: | ---: | ---: | ---: | ---: |
| 1 | 50.45 |  0.25 s |  0.24 s | 1.04× |
| 2 | 72.19 |  4.48 s |  2.42 s | 1.85× |
| 4 | 89.32 | 27.52 s | 16.74 s | 1.64× |

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.39%. Comparing base (5f63807) to head (4306478).
⚠️ Report is 8 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1137      +/-   ##
==========================================
+ Coverage   89.36%   89.39%   +0.02%     
==========================================
  Files         484      485       +1     
  Lines       91384    92102     +718     
==========================================
+ Hits        81669    82335     +666     
- Misses       9715     9767      +52     
Flag Coverage Δ
miri 89.39% <100.00%> (+0.02%) ⬆️
unittests 89.04% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...kann-benchmark/src/backend/exhaustive/spherical.rs 100.00% <ø> (ø)
diskann-quantization/src/spherical/quantizer.rs 94.97% <100.00%> (-0.63%) ⬇️

... and 27 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@arkrishn94 arkrishn94 changed the title [spherical] Improve spherical compression performance and fix threadpool init in benchmark [spherical] Compression tweak and fix threadpool in benchmark Jun 5, 2026
@arkrishn94 arkrishn94 marked this pull request as ready for review June 5, 2026 21:10
@arkrishn94 arkrishn94 requested review from a team and Copilot June 5, 2026 21:10
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves spherical quantization compression performance by optimizing the maximize_cosine_similarity sweep (precomputing and sorting critical values once, and avoiding per-step sqrt) and fixes the exhaustive spherical benchmark so compression uses a rayon threadpool sized by compression_threads instead of silently using the global pool.

Changes:

  • Reworks maximize_cosine_similarity to precompute all critical values into a flat buffer, sort once, then iterate linearly; also compares cosine similarity using squared quantities to avoid sqrt.
  • Updates exhaustive spherical benchmark compression to run Store::new inside a dedicated Rayon threadpool sized by compression_threads.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
diskann-quantization/src/spherical/quantizer.rs Performance-focused rewrite of the critical-value sweep in maximize_cosine_similarity, including removing heap usage and avoiding sqrt.
diskann-benchmark/src/backend/exhaustive/spherical.rs Ensures compression uses a Rayon threadpool sized by compression_threads (mirrors the search harness pattern).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1015 to +1017
let stop: usize = 1usize << (num_bits.get() - 1);
let visits_per_dim: usize = stop.max(2) - 1;
let total = v.len() * visits_per_dim;
Comment on lines +1045 to +1052
// Sort critical values in ascending order so that walking the slice corresponds to
// sweeping `s` from `0` to `+inf`. `Pair`'s `Ord` impl is reversed so it's
// in ascending order.
crits.sort_unstable_by(|a, b| {
a.value
.partial_cmp(&b.value)
.unwrap_or(std::cmp::Ordering::Equal)
});
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants