TurboQuant encoding for Vectors#7269

Open
connortsui20 wants to merge 11 commits into develop from ct/turboquant

Conversation

@connortsui20 (Contributor)

Continuation of #7167, authored by @lwwmanning

Summary

Lossy quantization for vector data (e.g., embeddings) based on TurboQuant (https://arxiv.org/abs/2504.19874). Supports both MSE-optimal and inner-product-optimal (Prod with QJL correction) variants at 1-8 bits per coordinate.
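The 1-8 bits per coordinate range can be pictured with a minimal uniform scalar quantizer. This is a simplification for intuition only: TurboQuant's actual codebooks are Max-Lloyd-optimal rather than a uniform grid, and the function names here are illustrative, not the PR's API.

```rust
// Hypothetical sketch: uniform b-bit scalar quantization of a single
// coordinate clipped to [-1.0, 1.0]. TurboQuant itself uses
// distribution-optimal (Max-Lloyd) codebooks, not a uniform grid.
fn quantize(x: f32, bits: u32) -> u8 {
    let levels = (1u32 << bits) as f32;
    let clamped = x.clamp(-1.0, 1.0);
    // Map [-1, 1] onto the integer codes [0, levels - 1].
    (((clamped + 1.0) / 2.0) * (levels - 1.0)).round() as u8
}

fn dequantize(code: u8, bits: u32) -> f32 {
    let levels = (1u32 << bits) as f32;
    code as f32 / (levels - 1.0) * 2.0 - 1.0
}

fn main() {
    for bits in 1..=8 {
        let x = 0.3_f32;
        let err = (dequantize(quantize(x, bits), bits) - x).abs();
        // Reconstruction error shrinks as the bit budget grows.
        println!("{bits} bits -> |error| = {err:.4}");
    }
}
```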

Key components:

  • Single TurboQuant array encoding with optional QJL correction fields, storing quantized codes, norms, centroids, and rotation signs as children.
  • Structured Random Hadamard Transform (SRHT) for O(d log d) rotation, fully self-contained with no external linear algebra library.
  • Max-Lloyd centroid computation on the Beta(d/2, d/2) distribution.
  • Approximate cosine similarity and dot product computed directly on quantized arrays without full decompression.
  • Pluggable TurboQuantScheme for BtrBlocks, exposed via WriteStrategyBuilder::with_vector_quantization().
  • Benchmarks covering common embedding dimensions (128, 768, 1024, 1536).
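The SRHT bullet above can be sketched with an in-place fast Walsh-Hadamard transform, which is what makes the O(d log d) rotation possible with no external linear algebra library. A minimal version, assuming a power-of-two dimension and pre-drawn random signs (names are illustrative, not the PR's actual API):

```rust
// Illustrative Structured Random Hadamard Transform: flip each
// coordinate's sign with a stored random sign, then apply the in-place
// fast Walsh-Hadamard transform in O(d log d). `d` must be a power of
// two here; real implementations pad or block otherwise.
fn srht(v: &mut [f32], signs: &[i8]) {
    let d = v.len();
    assert!(d.is_power_of_two());
    for (x, &s) in v.iter_mut().zip(signs) {
        *x *= s as f32;
    }
    let mut h = 1;
    while h < d {
        for chunk in v.chunks_mut(2 * h) {
            for i in 0..h {
                let (a, b) = (chunk[i], chunk[i + h]);
                chunk[i] = a + b;
                chunk[i + h] = a - b;
            }
        }
        h *= 2;
    }
    // Scale by 1/sqrt(d) so the transform is orthonormal (norm-preserving).
    let scale = 1.0 / (d as f32).sqrt();
    for x in v.iter_mut() {
        *x *= scale;
    }
}

fn main() {
    let mut v = [1.0_f32, -2.0, 0.5, 3.0];
    let norm_before: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    srht(&mut v, &[1, -1, 1, -1]);
    let norm_after: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    // An orthonormal rotation preserves the Euclidean norm.
    assert!((norm_before - norm_after).abs() < 1e-5);
    println!("norm preserved: {norm_before:.4} vs {norm_after:.4}");
}
```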

Also refactors CompressingStrategy to a single constructor, and adds vortex_tensor::initialize() for session registration of tensor types, encodings, and scalar functions.
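The "compute directly on quantized arrays" idea can be illustrated with a toy version: each code indexes a small centroid table, so a dot product is accumulated code-by-code without ever materializing a decompressed float array. The 2-bit codebook values below are invented for illustration, not TurboQuant's Max-Lloyd codebook.

```rust
// Toy sketch of an approximate dot product computed straight from
// quantized codes via a centroid lookup table, with no full
// decompression step. Centroid values here are illustrative only.
fn approx_dot(codes_a: &[u8], codes_b: &[u8], centroids: &[f32]) -> f32 {
    codes_a
        .iter()
        .zip(codes_b)
        .map(|(&a, &b)| centroids[a as usize] * centroids[b as usize])
        .sum()
}

fn main() {
    // A 2-bit codebook: four reconstruction levels.
    let centroids = [-1.5_f32, -0.5, 0.5, 1.5];
    let a = [0u8, 3, 2, 1];
    let b = [3u8, 3, 0, 1];
    let dot = approx_dot(&a, &b, &centroids);
    // (-1.5)(1.5) + (1.5)(1.5) + (0.5)(-1.5) + (-0.5)(-0.5) = -0.5
    assert!((dot - (-0.5)).abs() < 1e-6);
    println!("approx dot = {dot}");
}
```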

API Changes

Adds a new TurboQuant encoding, among other changes. TODO

Testing

TODO

@connortsui20 connortsui20 added the changelog/feature A new feature label Apr 2, 2026
@connortsui20 connortsui20 force-pushed the ct/turboquant branch 11 times, most recently from 6d390d6 to 35ddb9f Compare April 3, 2026 18:13
lwwmanning and others added 9 commits April 3, 2026 17:20

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Will Manning <will@willmanning.io>
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
We are going to implement this later as a separate encoding, if we decide to implement it at all: word on the street is that MSE + QJL is not actually better than MSE on its own.

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
It doesn't really make a lot of sense for us to define this as an
encoding for `FixedSizeList`.

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
- Use ExecutionCtx in TurboQuant compress path and import ExecutionCtx
- Extend dtype imports with Nullability and PType to support extension
  types
- Wire in extension utilities: extension_element_ptype and
  extension_list_size for vector extensions
- Remove dimension and bit_width from slice/take compute calls to rely
  on metadata
- Update TurboQuant mod docs to mention VortexSessionExecute
- Change scheme.compress to use the provided compressor argument (not
  _compressor)
- Add an extensive TurboQuant test suite (roundtrip, MSE bounds, edge
  cases, f64 input, serde roundtrip, and dtype checks)
- Align vtable imports to new metadata handling (remove unused
  DeserializeMetadata/SerializeMetadata references)

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
@codspeed-hq

codspeed-hq bot commented Apr 3, 2026

Merging this PR will degrade performance by 29.05%

❌ 2 regressed benchmarks
✅ 1120 untouched benchmarks
⏩ 1530 skipped benchmarks [1]

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- | --- |
| Simulation | take_map[(0.1, 1.0)] | 1.6 ms | 2.3 ms | -29.05% |
| Simulation | take_map[(0.1, 0.5)] | 977.4 µs | 1,209.5 µs | -19.2% |

Comparing ct/turboquant (77a4288) with develop (e3c7401)

Footnotes

  1. 1530 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them on CodSpeed to remove them from the performance reports.

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
@connortsui20 connortsui20 marked this pull request as ready for review April 3, 2026 23:03
@connortsui20 (Contributor, Author)

connortsui20 commented Apr 3, 2026

I would say this is ready for review now, but only with respect to the structure. I have yet to go through the implementation and make sure everything makes sense, but the structure is sound, and we correctly handle the different floating-point input types as well as null vectors.

I think it would be good to get a review now, and maybe we should just merge this and iterate later.

@gatesn gatesn left a comment


Let's do it

@connortsui20 (Contributor, Author)

I'll rebase and fix the errors tomorrow

