From 992f9c1fe18ae66c9cc52040a34778e2f57972f3 Mon Sep 17 00:00:00 2001 From: yaohongdeng Date: Fri, 22 May 2026 14:48:41 +0800 Subject: [PATCH 1/7] RFC: Search Plan API for Disk Search Adds an RFC proposing a `SearchPlan` enum to replace the disk search API's `(vector_filter, is_flat_search)` parameter pair, and introducing beta-biased graph search as a new capability. The hierarchical `SearchPlan { FlatScan, Graph(GraphMode) }` shape makes invalid combinations unrepresentable and provides a single extension point for future graph algorithms. Co-Authored-By: Claude Opus 4.7 (1M context) --- rfcs/00000-disk-beta-filter.md | 311 +++++++++++++++++++++++++++++++++ 1 file changed, 311 insertions(+) create mode 100644 rfcs/00000-disk-beta-filter.md diff --git a/rfcs/00000-disk-beta-filter.md b/rfcs/00000-disk-beta-filter.md new file mode 100644 index 000000000..fe65d7177 --- /dev/null +++ b/rfcs/00000-disk-beta-filter.md @@ -0,0 +1,311 @@ +# Search Plan API for Disk Search (with Beta-Biased Filtering) + +| | | +|------------------|--------------------------------| +| **Authors** | yaohongdeng | +| **Contributors** | | +| **Created** | 2026-05-21 | +| **Updated** | 2026-05-21 | + +## Summary + +Replace the disk search API's `(vector_filter: Option>, is_flat_search: bool)` parameter pair with a single `plan: SearchPlan` enum. The new enum is hierarchical (`SearchPlan { FlatScan, Graph(GraphMode) }`), makes invalid combinations unrepresentable, and introduces a new capability — *beta-biased graph search* — as one of its variants. The change also closes the design's only extension point for future graph algorithms (e.g. `MultihopSearch`) without further growing the public signature of `searcher.search()`. + +## Motivation + +### Background + +The disk search API today exposes filtering as a raw closure type alias paired with a separate boolean for flat vs. graph dispatch: + +```rust +// diskann-disk/src/build/configuration/filter_parameter.rs +pub type VectorFilter<'a, Data> = + Box::VectorIdType) -> bool + Send + Sync + 'a>; + +// searcher.search(..., vector_filter: Option, is_flat_search: bool) +``` + +The benchmark layer exposes both as independent inputs ([diskann-benchmark/src/inputs/disk.rs:83-85](../diskann-benchmark/src/inputs/disk.rs)). Today this is orthogonal — all four `(vector_filter, is_flat_search)` combinations correspond to valid disk-search configurations: + +| `vector_filter` | `is_flat_search` | Meaning | +|---|---|---| +| `None` | `false` | Graph search, no filter | +| `Some(p)` | `false` | Graph search + post-filter | +| `None` | `true` | Flat scan baseline (brute-force recall floor) | +| `Some(p)` | `true` | Flat scan + hard filter | + +Meanwhile, the in-memory side has gained a `BetaFilter` strategy ([diskann-providers/src/model/graph/provider/layers/betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs)) that biases beam traversal toward labelled vectors by multiplying their distances by a factor `β ∈ (0, 1]`. The disk path has no equivalent today: a query that wants beta-biased filtering on a disk index has no way to express it. + +### Problem Statement + +Three concrete problems with the current shape: + +1. **A raw closure can't carry `beta`.** Adding beta-biased graph search requires threading another parameter (`beta: Option`) through `search()` → `search_internal()` → `DiskAccessor`. The closure has no field to attach metadata to. + +2. **Beta breaks the existing orthogonality.** Beta is only meaningful in graph search (it biases beam expansion; flat scan has no beam). With three independent inputs `(vector_filter, is_flat_search, beta)`, 2 of the 2³ = 8 combinations are meaningless (`is_flat_search=true` with `beta=Some(_)`; `vector_filter=None` with `beta=Some(_)`). A flat `bool` + `Option` cannot reject them at compile time — validation has to run at runtime, and every call site has to know which combinations are valid. + +3. **No integration point for future graph algorithms.** `MultihopSearch` already exists in [diskann/src/graph/search/multihop_search.rs](../diskann/src/graph/search/multihop_search.rs) but the disk API has no way to select it. Each new algorithm would mean another boolean flag and another runtime validation rule. + +### Goals + +1. Express all five valid configurations — flat-no-filter, flat-with-filter, graph-no-filter, graph-with-post-filter, graph-with-beta — as one named value each, with invalid combinations unrepresentable. +2. Add beta-biased disk graph search as a first-class capability without further growing `searcher.search()`'s parameter list. +3. Provide a single extension point for future graph algorithms that doesn't require changing the public `search()` signature. +4. Preserve the zero-allocation, zero-overhead property for the most common case (graph search, no filter). +5. Keep the disk filter API insulated from upstream changes to `diskann::graph::index::QueryLabelProvider`. + +## Proposal + +### Core types + +Two new types replace the existing `VectorFilter` alias: + +```rust +// diskann-disk/src/search/filter_parameter.rs +// (Moved from build/configuration/ — this is a search-time concept, not build-time.) + +pub type Predicate = Box bool + Send + Sync>; + +/// Top-level search plan: graph traversal vs. linear scan. +pub enum SearchPlan { + /// Brute-force linear scan. `Some(p)` applies `p` inline; `None` + /// scans every vector (recall baseline). + FlatScan { filter: Option }, + + /// Graph traversal; `GraphMode` picks the algorithm and any modifier. + Graph(GraphMode), +} + +/// Graph-search variant. Invalid combinations (e.g. beta without a predicate) +/// are unrepresentable by construction. +pub enum GraphMode { + /// Plain greedy beam. + Unfiltered, + + /// Greedy beam + hard post-filter (applied in `RerankAndFilter`). + /// Traversal identical to `Unfiltered`. + PostFilter(Predicate), + + /// Beta-biased beam: matching vectors' PQ distances multiplied by + /// `beta` ∈ (0, 1] in `pq_distances`. Predicate also post-filters. + BetaFilter { predicate: Predicate, beta: f32 }, + + // Future graph algorithms slot in here as new variants. e.g.: + // Multihop { predicate: Predicate }, +} + +impl GraphMode { + pub fn post_filter(predicate: F) -> Self + where F: Fn(u32) -> bool + Send + Sync + 'static { + Self::PostFilter(Box::new(predicate)) + } + + pub fn beta_filter(predicate: F, beta: f32) -> Self + where F: Fn(u32) -> bool + Send + Sync + 'static { + assert!(beta > 0.0 && beta <= 1.0, "beta must be in (0, 1]"); + Self::BetaFilter { predicate: Box::new(predicate), beta } + } +} + +impl SearchPlan { + pub fn flat() -> Self { Self::FlatScan { filter: None } } + + pub fn flat_filtered(predicate: F) -> Self + where F: Fn(u32) -> bool + Send + Sync + 'static { + Self::FlatScan { filter: Some(Box::new(predicate)) } + } + + pub fn graph() -> Self { Self::Graph(GraphMode::Unfiltered) } + pub fn graph_with(mode: GraphMode) -> Self { Self::Graph(mode) } +} +``` + +The five cases: + +| # | Case | Call | +|---|------|------| +| 1 | Flat scan, no filter | `SearchPlan::flat()` | +| 2 | Flat scan + filter | `SearchPlan::flat_filtered(\|id\| bm.contains(id))` | +| 3 | Graph, no filter | `SearchPlan::graph()` | +| 4 | Graph + post-filter | `SearchPlan::graph_with(GraphMode::post_filter(\|id\| bm.contains(id)))` | +| 5 | Graph + beta + post-filter (**new**) | `SearchPlan::graph_with(GraphMode::beta_filter(\|id\| bm.contains(id), 0.5))` | + +### Public `search()` signature change + +```rust +// Before +pub fn search( + query: &[Data::VectorDataType], + return_list_size: u32, + search_list_size: u32, + beam_width: Option, + vector_filter: Option>, + is_flat_search: bool, +) -> ANNResult> + +// After +pub fn search( + query: &[Data::VectorDataType], + return_list_size: u32, + search_list_size: u32, + beam_width: Option, + plan: SearchPlan, +) -> ANNResult> +``` + +### Internal plumbing + +| Type | Change | +|------|--------| +| `filter_parameter.rs` | Replace `VectorFilter` + `default_vector_filter()` with `Predicate`, `GraphMode`, `SearchPlan`. Move file from `build/configuration/` to `search/`. | +| `DiskSearchStrategy` | Old: `vector_filter: &'a dyn Fn`. New: `predicate: Option<&'a Predicate>` + `beta: Option`, projected from `plan` at construction. | +| `DiskAccessor` | Carries `predicate: Option<&'a Predicate>` + `beta: Option`. | +| `DiskAccessor::pq_distances` | ~4 added lines: apply `beta` when both fields are `Some`. | +| `RerankAndFilter` | Old: `filter: &'a dyn Fn`. New: `filter: Option<&'a Predicate>`. | +| `search_internal` | `(vector_filter, is_flat_search)` → `plan: &SearchPlan`; dispatches on top-level variant only. | +| `search_strategy` | The **only** site that introspects `GraphMode` (one exhaustive match producing `(predicate, beta)`). | +| `flat_search` path | Drop redundant `vector_filter` arg; read predicate from `plan`. | +| `expand_beam`, `distances_unordered` | **No change.** | + +The "project at the boundary" decision matters: every match on `GraphMode` happens in `search_strategy`. Downstream code (`DiskAccessor`, `RerankAndFilter`, `flat_search`) sees only the projected `(Option<&Predicate>, Option)` pair. Adding a new `GraphMode` variant later means touching one match arm, not three or four. + +### `pq_distances` — where beta applies + +```rust +// Before (disk_provider.rs:581-586) +for (i, id) in ids.iter().enumerate() { + let distance = self.scratch.pq_scratch.aligned_dist_scratch[i]; + f(distance, *id); +} + +// After +for (i, id) in ids.iter().enumerate() { + let mut distance = self.scratch.pq_scratch.aligned_dist_scratch[i]; + if let (Some(beta), Some(predicate)) = (self.beta, self.predicate) { + if predicate(*id) { + distance *= beta; + } + } + f(distance, *id); +} +``` + +When either field is `None`, the `if let` short-circuits — zero overhead on the common cases. + +### Caller migration + +| Today `(vector_filter, is_flat_search)` | New `plan` | +|---|---| +| `(None, false)` | `SearchPlan::graph()` | +| `(None, true)` | `SearchPlan::flat()` | +| `(Some(p), false)` | `SearchPlan::graph_with(GraphMode::post_filter(p))` | +| `(Some(p), true)` | `SearchPlan::flat_filtered(p)` | +| (not expressible) | `SearchPlan::graph_with(GraphMode::beta_filter(p, β))` | + +The benchmark layer constructs `SearchPlan` once at the boundary; the `--is_flat_search` flag is removed from the disk-index input schema. JSON input schemas that still carry `is_flat_search` are migrated as a separate config-only step. + +### Where beta does and doesn't apply + +| Location | Beta applied? | Rationale | +|----------|--------------|-----------| +| `pq_distances` | **Yes**, only on `GraphMode::BetaFilter` | Biases beam toward matching vectors during graph traversal | +| `ensure_loaded` (full-precision cache) | No | Cache must hold true distances for honest reranking | +| `RerankAndFilter::post_process` | No | Uses true distances; predicate is hard-filter only | +| `flat_search` path | No | Type system forbids it — `FlatScan` has no `GraphMode` field | + +### Extensibility — adding a new graph algorithm + +Two extension axes, both compiler-enforced: + +| Add… | Where it lives | `match` sites to update | +|------|-----|---| +| **New top-level search class** (e.g. range search) | New `SearchPlan` variant | `search_internal` dispatch | +| **New graph algorithm or beam modifier** | New `GraphMode` variant | `search_strategy` projection match (Rust exhaustiveness check forces the update) | + +Worked example — adding `Multihop` (which already exists in `diskann::graph::search::multihop_search` and takes `&dyn QueryLabelProvider`): + +```rust +// In GraphMode: +Multihop { predicate: Predicate }, +``` + +Three localized edits: + +1. Add the `GraphMode::Multihop { predicate }` arm to the projection match in `search_strategy`: `(Some(predicate), None)`. `RerankAndFilter` post-filters automatically; no beta. +2. Add a `ClosureAsLabelProvider` wrapper local to the disk crate, since `MultihopSearch::new` expects `&dyn QueryLabelProvider`: + + ```rust + struct ClosureAsLabelProvider<'a>(&'a Predicate); + + impl std::fmt::Debug for ClosureAsLabelProvider<'_> { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + f.debug_struct("ClosureAsLabelProvider").finish_non_exhaustive() + } + } + + impl QueryLabelProvider for ClosureAsLabelProvider<'_> { + fn is_match(&self, id: u32) -> bool { (self.0)(id) } + } + ``` + +3. Add a `match` arm in `search_internal` (or refine the `Graph(_)` arm) that calls `MultihopSearch::new(Knn, &adapter)` instead of plain `Knn`. + +The compiler refuses to build until every `match` on `GraphMode` handles `Multihop`. + +## Trade-offs + +### Closure (`Box bool>`) vs. `Arc>` trait + +**Chosen: closure.** Ergonomic at call sites (`|id| bm.contains(id)` vs. `Arc::new(BitmapProvider::new(bm))`). No coupling to a trait we don't own — upstream changes to `QueryLabelProvider` don't reach the disk path. No allocation on the no-filter path. The downside is that future algorithms requiring `&dyn QueryLabelProvider` need a thin adapter (~15 lines, scoped to the disk crate) — see the `Multihop` example above. + +**Alternative considered: `Arc + Send + Sync>`.** Would make `BetaFilter` *type-valid* immediately — same `Arc` shape the in-memory `BetaFilter::new()` already takes. Rejected because the underlying composability also requires moving beta application from `DiskAccessor::pq_distances` into the `QueryComputer` path, which is a separate (large) refactor. Type-valid composability is not behavioural composability, and the in-memory `QueryLabelProvider` will soon change anyway. Disk closures stay decoupled. + +### Hierarchical `SearchPlan { FlatScan, Graph(GraphMode) }` vs. flat `SearchPlan { FlatScanNoFilter, FlatScanFiltered, GraphNoFilter, GraphPostFilter, GraphBeta }` + +**Chosen: hierarchical.** Separates the graph-vs-linear-scan categorical break (different code paths) from the choice of graph algorithm/modifier. New graph algorithms (e.g. `Multihop`) slot into `GraphMode`, not into the top-level enum — they all traverse the graph and share dispatch. + +**Alternative considered: a flat enum.** One variant per case (5 today, more as algorithms are added). Slightly less nesting at call sites but no clear extension axis: a new graph algorithm has to pick a name pattern (`GraphMultihop`? `GraphMultihopFiltered`?). The hierarchy makes the "this is a graph algorithm" attribute explicit. + +### `Box` vs. `Arc` for `Predicate` + +**Chosen: `Box`.** Predicates are owned by `SearchPlan` for the duration of one search call; not shared across queries. `Box` is sufficient and avoids atomic refcount traffic. + +### `Fn(u32)` vs. `Fn(&u32)` + +**Chosen: `Fn(u32)`.** `u32` is 4 bytes and `Copy`; by-value is cheaper than an 8-byte reference, eliminates `*` derefs at every call site, and matches the codebase convention of `QueryLabelProvider::is_match(id: V)` ([diskann/src/graph/index.rs:82](../diskann/src/graph/index.rs)). + +### Project at strategy construction vs. defer projection + +**Chosen: project at strategy construction.** `search_strategy` is the single match site for `GraphMode`; the strategy carries `(predicate, beta)` as fields, and every downstream consumer reads those fields directly. Adding a `GraphMode` variant requires updating one `match`, not three. + +**Alternative considered: keep `&SearchPlan` in the strategy, project later.** Downstream consumers (`DiskAccessor`, `RerankAndFilter`, `flat_search` path) each re-match. More flexible but each new `GraphMode` variant touches more sites. + +### `assert!` in `GraphMode::beta_filter` vs. `Result` + +**Chosen: `assert!`.** Caller-side bug, not a runtime condition. Returning `Result` would force every caller to handle an error that, if it ever fires, indicates a programming mistake (passing literal `0.0` or `2.0` as `beta`). The constructor panicking on the calling thread is acceptable. + +## Benchmark Results + +Not yet collected. The primary risk is whether beta-biased graph search delivers a measurable recall improvement over post-filter on filter-selective workloads, and at what `β` the cost crossover sits. Planned experiments: + +1. **Recall vs. effort sweep** — case 4 (`PostFilter`) vs. case 5 (`BetaFilter`) across `β ∈ {0.3, 0.5, 0.7, 0.9}` on workloads with selectivity ∈ {1%, 10%, 50%}. Establishes when beta is worth its traversal cost. +2. **No-filter zero-cost regression** — counter assertion that case 3 (`SearchPlan::graph()`) invokes no closure in `pq_distances` or `RerankAndFilter`. +3. **`is_flat_search=true, vector_filter=None` baseline parity** — confirm `SearchPlan::flat()` matches today's `(None, true)` performance and recall exactly. + +Results will be added to this RFC before merge. + +## Future Work + +- [ ] **`BetaFilter` composability.** Move beta application from `DiskAccessor::pq_distances` into the `QueryComputer` path, then add an `IdFilter → QueryLabelProvider` adapter so the disk strategy can be wrapped by the in-memory `BetaFilter` strategy. Out of scope here. +- [ ] **`Multihop` integration on the disk path.** Add as a new `GraphMode` variant per the worked example in §Extensibility. +- [ ] **`u64` `VectorIdType` support.** `Predicate` pins `u32` today, matching the disk path's existing `Data::VectorIdType = u32` constraint at [disk_provider.rs:548-557](../diskann-disk/src/search/provider/disk_provider.rs). A future `u64` ID type would require generalizing `Predicate` and the projection. +- [ ] **Migration of benchmark JSON input schemas.** Remove `is_flat_search` from the disk-index schema and from the example/perf-test JSON files. Config-only, no code dependency after the backend update. + +## References + +1. [docs/disk-beta-filter-with-query-label-provider.md](../docs/disk-beta-filter-with-query-label-provider.md) — full design doc this RFC summarizes. +2. [docs/disk-beta-filter.md](../docs/disk-beta-filter.md) — earlier closure-based beta filter design (superseded by this RFC). +3. [diskann-providers/src/model/graph/provider/layers/betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs) — in-memory `BetaFilter` strategy; the disk path implements the algorithm independently. +4. [diskann/src/graph/search/multihop_search.rs](../diskann/src/graph/search/multihop_search.rs) — `MultihopSearch`; the worked extensibility example. +5. [diskann-disk/src/search/provider/disk_provider.rs](../diskann-disk/src/search/provider/disk_provider.rs) — current `pq_distances`, `RerankAndFilter`, `DiskSearchStrategy`, `DiskAccessor`. +6. [diskann-benchmark/src/backend/disk_index/search.rs](../diskann-benchmark/src/backend/disk_index/search.rs) — current benchmark call site (lines 267-281). From 123a59203626e5653f196917c3dede0b776b5afd Mon Sep 17 00:00:00 2001 From: yaohongdeng Date: Fri, 22 May 2026 15:19:05 +0800 Subject: [PATCH 2/7] RFC: restructure per template; add behavioral-divergence future work - Reframe Motivation around supporting BetaFilter on the disk path and designing an extension point for future filter algorithms. - Move the full design doc content under the Proposal section. - Set Benchmark Results, Future Work, References to N/A initially, then add one Future Work item: align disk and in-memory BetaFilter semantics (disk applies bias + post-filter; in-memory only biases). - Drop the Multihop worked example and its back-references. Co-Authored-By: Claude Opus 4.7 (1M context) --- rfcs/00000-disk-beta-filter.md | 427 +++++++++++++++++++++++++-------- 1 file changed, 327 insertions(+), 100 deletions(-) diff --git a/rfcs/00000-disk-beta-filter.md b/rfcs/00000-disk-beta-filter.md index fe65d7177..ce0e7434f 100644 --- a/rfcs/00000-disk-beta-filter.md +++ b/rfcs/00000-disk-beta-filter.md @@ -9,13 +9,15 @@ ## Summary -Replace the disk search API's `(vector_filter: Option>, is_flat_search: bool)` parameter pair with a single `plan: SearchPlan` enum. The new enum is hierarchical (`SearchPlan { FlatScan, Graph(GraphMode) }`), makes invalid combinations unrepresentable, and introduces a new capability — *beta-biased graph search* — as one of its variants. The change also closes the design's only extension point for future graph algorithms (e.g. `MultihopSearch`) without further growing the public signature of `searcher.search()`. +Adds support for **beta-biased filtering on the disk search path** by introducing a `SearchPlan` enum that replaces the existing `(vector_filter, is_flat_search)` parameter pair on `searcher.search()`. The new enum is hierarchical (`SearchPlan { FlatScan, Graph(GraphMode) }`), encodes today's four search configurations plus the new beta-biased variant as one named value each, makes invalid combinations unrepresentable, and gives future filter algorithms a single extension point that doesn't require further growing the public `search()` signature. ## Motivation ### Background -The disk search API today exposes filtering as a raw closure type alias paired with a separate boolean for flat vs. graph dispatch: +The in-memory side of DiskANN has a `BetaFilter` strategy ([diskann-providers/src/model/graph/provider/layers/betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs)) that biases beam traversal toward vectors matching a label predicate, by multiplying their distances by a factor `β ∈ (0, 1]`. It's documented and tested for the in-memory `FullPrecision` strategy. + +The disk search path has no equivalent. The disk API today exposes filtering as a raw closure type alias paired with a separate boolean for flat vs. graph dispatch: ```rust // diskann-disk/src/build/configuration/filter_parameter.rs @@ -25,47 +27,83 @@ pub type VectorFilter<'a, Data> = // searcher.search(..., vector_filter: Option, is_flat_search: bool) ``` -The benchmark layer exposes both as independent inputs ([diskann-benchmark/src/inputs/disk.rs:83-85](../diskann-benchmark/src/inputs/disk.rs)). Today this is orthogonal — all four `(vector_filter, is_flat_search)` combinations correspond to valid disk-search configurations: - -| `vector_filter` | `is_flat_search` | Meaning | -|---|---|---| -| `None` | `false` | Graph search, no filter | -| `Some(p)` | `false` | Graph search + post-filter | -| `None` | `true` | Flat scan baseline (brute-force recall floor) | -| `Some(p)` | `true` | Flat scan + hard filter | +A raw closure has no field to attach `β` to, and bolting beta on as a third `Option` parameter would create meaningless combinations (e.g. `is_flat_search=true` + `beta=Some(_)`) that the type system can't reject. -Meanwhile, the in-memory side has gained a `BetaFilter` strategy ([diskann-providers/src/model/graph/provider/layers/betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs)) that biases beam traversal toward labelled vectors by multiplying their distances by a factor `β ∈ (0, 1]`. The disk path has no equivalent today: a query that wants beta-biased filtering on a disk index has no way to express it. +Looking forward, `MultihopSearch` already exists in [diskann/src/graph/search/multihop_search.rs](../diskann/src/graph/search/multihop_search.rs) but the disk API has no way to select it. Each new graph algorithm under the current shape would mean another boolean flag and another runtime validation rule. ### Problem Statement -Three concrete problems with the current shape: +1. **Beta-biased graph search is unavailable on the disk path.** Queries that would benefit from biasing beam traversal toward labelled vectors must fall back to plain post-filtering on the disk side, with no way to express the beta variant. -1. **A raw closure can't carry `beta`.** Adding beta-biased graph search requires threading another parameter (`beta: Option`) through `search()` → `search_internal()` → `DiskAccessor`. The closure has no field to attach metadata to. +2. **The current API shape can't carry beta cleanly.** A raw closure has no place to attach `β`. Threading `beta: Option` through `search()` → `search_internal()` → `DiskAccessor` as a fourth parameter creates orthogonality problems: beta is only meaningful in graph search, and `(vector_filter, is_flat_search, beta)` as three independent inputs admits 2 of 8 meaningless combinations the type system can't catch. -2. **Beta breaks the existing orthogonality.** Beta is only meaningful in graph search (it biases beam expansion; flat scan has no beam). With three independent inputs `(vector_filter, is_flat_search, beta)`, 2 of the 2³ = 8 combinations are meaningless (`is_flat_search=true` with `beta=Some(_)`; `vector_filter=None` with `beta=Some(_)`). A flat `bool` + `Option` cannot reject them at compile time — validation has to run at runtime, and every call site has to know which combinations are valid. - -3. **No integration point for future graph algorithms.** `MultihopSearch` already exists in [diskann/src/graph/search/multihop_search.rs](../diskann/src/graph/search/multihop_search.rs) but the disk API has no way to select it. Each new algorithm would mean another boolean flag and another runtime validation rule. +3. **No clean integration point for future filter algorithms.** `MultihopSearch` exists in `diskann` but the disk API has no extension point — adding it (or any future algorithm) would mean another flag and more cross-field validation. ### Goals -1. Express all five valid configurations — flat-no-filter, flat-with-filter, graph-no-filter, graph-with-post-filter, graph-with-beta — as one named value each, with invalid combinations unrepresentable. -2. Add beta-biased disk graph search as a first-class capability without further growing `searcher.search()`'s parameter list. -3. Provide a single extension point for future graph algorithms that doesn't require changing the public `search()` signature. -4. Preserve the zero-allocation, zero-overhead property for the most common case (graph search, no filter). -5. Keep the disk filter API insulated from upstream changes to `diskann::graph::index::QueryLabelProvider`. +1. Add beta-biased disk graph search as a first-class capability — equivalent in effect to the in-memory `BetaFilter` strategy. +2. Replace the `(vector_filter, is_flat_search)` parameter pair with a single value where every supported configuration is one named variant or constructor. +3. Make invalid combinations (beta on flat scan, beta without a predicate) unrepresentable by construction — caught at compile time, not by runtime assertion. +4. Provide a single extension point so future filter algorithms (`MultihopSearch`, others) can be added without changing the public `search()` signature. +5. Preserve zero allocation and zero per-iteration overhead on the no-filter graph path (today's most common case). ## Proposal -### Core types +### 1. Motivation + +The disk search filter is a raw `Box` type alias paired with a separate `is_flat_search: bool` flag: + +```rust +// diskann-disk/src/build/configuration/filter_parameter.rs +pub type VectorFilter<'a, Data> = + Box::VectorIdType) -> bool + Send + Sync + 'a>; + +// searcher.search(..., vector_filter: Option, is_flat_search: bool) +``` + +Three problems with this shape: + +1. **A raw closure can't carry `beta`.** Adding beta-biased graph search requires threading another parameter through `search()` → `search_internal()` → `DiskAccessor`. -Two new types replace the existing `VectorFilter` alias: +2. **Adding beta breaks the existing orthogonality between `is_flat_search` and `vector_filter`.** Today the two flags are genuinely orthogonal: all four `(is_flat_search, vector_filter)` combinations are meaningful, and the benchmark exposes them as independent inputs ([diskann-benchmark/src/inputs/disk.rs:83-85](../diskann-benchmark/src/inputs/disk.rs)). But beta only exists in graph search — it biases beam expansion, and a flat scan has no beam. Adding beta as a third independent parameter creates meaningless combinations (e.g. `is_flat_search=true` + `beta=Some(_)`) that the type system can't reject. + +3. **No integration point for future graph algorithms.** `MultihopSearch` already exists in `diskann/src/graph/search/multihop_search.rs`, but the disk API has no way to select it. Each new algorithm would mean another boolean flag. + +This design replaces both flags with a single `plan: SearchPlan` enum. Each of the five supported configurations is one explicit variant or constructor; future graph algorithms slot in as new `GraphMode` variants without changing the public `search()` signature; the compiler enforces exhaustiveness at every dispatch site. + +### 2. API Change + +#### 2.1 The `Predicate` type alias + +The disk path identifies filtered IDs through a closure, not a trait. There is no `IdFilter` trait, no `BitmapProvider` wrapper, and no `QueryLabelProvider` dependency. ```rust // diskann-disk/src/search/filter_parameter.rs // (Moved from build/configuration/ — this is a search-time concept, not build-time.) pub type Predicate = Box bool + Send + Sync>; +``` + +**Why a closure, not a trait?** +- **Ergonomic at call sites.** `|id| bm.contains(id)` is shorter and clearer than `Arc::new(BitmapProvider::new(bm))`. Callers using `HashSet`, sorted `Vec`, or any custom backing structure write a closure directly. +- **No upstream coupling.** No dependency on `diskann::graph::index::QueryLabelProvider` — upstream signature changes don't reach the disk path. If a future graph algorithm (e.g. `MultihopSearch`) requires `&dyn QueryLabelProvider`, it adapts at the boundary with a thin local wrapper. +- **No allocation when absent.** `Option` is `None` for the no-filter cases; no closure object is constructed. + +**Why `Fn(u32)` and not `Fn(&u32)`?** `u32` is 4 bytes and `Copy`; passing by value is cheaper than an 8-byte reference, eliminates `*` derefs at every call site, and matches the codebase convention of `QueryLabelProvider::is_match(id: V)` ([diskann/src/graph/index.rs:82](../diskann/src/graph/index.rs)). + +**Why `Box`, not `Arc`?** Predicates are owned by `SearchPlan` for the duration of one search call. They aren't shared across queries. `Box` is sufficient and avoids atomic refcount traffic on every clone/drop. + +#### 2.2 `SearchPlan` and `GraphMode` enums + +**Current** (`filter_parameter.rs`): +```rust +pub type VectorFilter<'a, Data> = + Box::VectorIdType) -> bool + Send + Sync + 'a>; +``` + +**Proposed**: replace the type alias with two hierarchical enums. `SearchPlan` makes the top-level graph-vs-flat-scan break; `GraphMode` describes the variant on the graph path. +```rust /// Top-level search plan: graph traversal vs. linear scan. pub enum SearchPlan { /// Brute-force linear scan. `Some(p)` applies `p` inline; `None` @@ -77,7 +115,7 @@ pub enum SearchPlan { } /// Graph-search variant. Invalid combinations (e.g. beta without a predicate) -/// are unrepresentable by construction. +/// are unrepresentable. pub enum GraphMode { /// Plain greedy beam. Unfiltered, @@ -120,17 +158,25 @@ impl SearchPlan { } ``` -The five cases: +**The five cases:** -| # | Case | Call | -|---|------|------| +| # | Case | `SearchPlan` value | +|---|------|---| | 1 | Flat scan, no filter | `SearchPlan::flat()` | | 2 | Flat scan + filter | `SearchPlan::flat_filtered(\|id\| bm.contains(id))` | | 3 | Graph, no filter | `SearchPlan::graph()` | | 4 | Graph + post-filter | `SearchPlan::graph_with(GraphMode::post_filter(\|id\| bm.contains(id)))` | -| 5 | Graph + beta + post-filter (**new**) | `SearchPlan::graph_with(GraphMode::beta_filter(\|id\| bm.contains(id), 0.5))` | +| 5 | Graph + beta + post-filter | `SearchPlan::graph_with(GraphMode::beta_filter(\|id\| bm.contains(id), 0.5))` | + +**Key design decisions**: -### Public `search()` signature change +- **Hierarchical, not flat.** `SearchPlan { FlatScan, Graph(GraphMode) }` separates the graph-vs-linear-scan categorical break (different access patterns, different code paths) from the choice of graph algorithm/modifier. A future `Multihop` slots into `GraphMode` as a sibling to `BetaFilter`, not as a top-level variant — they both traverse the graph. +- **Invalid states unrepresentable.** `BetaFilter` carries the predicate inline; the `FlatScan` path has no `GraphMode` field. Beta-without-predicate and beta-on-flat-scan are rejected at compile time, not by runtime assertion. +- **Project at the boundary.** `GraphMode` exposes constructors only — no `predicate()` or `beta()` accessors. The strategy projects `(Option<&Predicate>, Option)` from the plan via one exhaustive match at construction (§4.2). Blocks semantically meaningless calls (asking `beta()` on `Unfiltered`) and routes every new variant through a single site, compiler-enforced. + +#### 2.3 `search()` public signature + +`is_flat_search: bool` and `vector_filter: Option` are **both removed** and replaced with a single `plan: SearchPlan` parameter. ```rust // Before @@ -139,7 +185,7 @@ pub fn search( return_list_size: u32, search_list_size: u32, beam_width: Option, - vector_filter: Option>, + vector_filter: Option>, // Box is_flat_search: bool, ) -> ANNResult> @@ -153,32 +199,142 @@ pub fn search( ) -> ANNResult> ``` -### Internal plumbing +Dispatch is a single match on `plan` (see §4.3). `SearchPlan::graph()` is the default "graph, no filter" case — a unit variant inside a unit variant, no allocation. + +#### 2.4 Caller migration + +The benchmark today exposes `is_flat_search` and `vector_filters_file` as independent flags ([diskann-benchmark/src/backend/disk_index/search.rs:267-281](../diskann-benchmark/src/backend/disk_index/search.rs)). The four combinations map onto `SearchPlan` as follows: + +| Today `(vector_filter, is_flat_search)` | New `plan` | +|---|---| +| `(None, false)` | `SearchPlan::graph()` | +| `(None, true)` | `SearchPlan::flat()` | +| `(Some(p), false)` | `SearchPlan::graph_with(GraphMode::post_filter(p))` | +| `(Some(p), true)` | `SearchPlan::flat_filtered(p)` | +| (not expressible) | `SearchPlan::graph_with(GraphMode::beta_filter(p, β))` | + +The last row is **new** capability not available today. + +**Example: flat scan + hard filter migration** + +```rust +// Before — raw closure + separate boolean +let filter_list: Arc = /* bitmap of allowed IDs */; +let result = searcher.search( + query, + return_list_size, + search_list_size, + beam_width, + Some(Box::new(move |vid: &u32| filter_list.contains(*vid))), + true, // is_flat_search +)?; + +// After — single plan parameter, no boolean +let filter_list: Arc = /* bitmap of allowed IDs */; +let result = searcher.search( + query, + return_list_size, + search_list_size, + beam_width, + SearchPlan::flat_filtered(move |id| filter_list.contains(id)), +)?; +``` + +**Example: graph + post-filter migration** + +```rust +let filter_list: Arc = /* bitmap of allowed IDs */; +let result = searcher.search( + query, + return_list_size, + search_list_size, + beam_width, + SearchPlan::graph_with(GraphMode::post_filter(move |id| filter_list.contains(id))), +)?; +``` + +**Example: beta-biased graph search (new capability)** + +```rust +let active_ids: Arc = /* bitmap of non-deleted IDs */; +let result = searcher.search( + query, + return_list_size, + search_list_size, + beam_width, + SearchPlan::graph_with(GraphMode::beta_filter( + move |id| active_ids.contains(id), + 0.5, // beta: bias beam toward matching vectors + )), +)?; +``` + +**Benchmark CLI**: the `--is_flat_search` flag is removed from the disk-index benchmark input schema. The benchmark layer constructs `SearchPlan` from `(is_flat_search, vector_filters_file)` once at the boundary and passes only `SearchPlan` to `searcher.search()`. JSON input schemas that still carry `is_flat_search` should be migrated as a separate task (config-only, no code dependency after the backend update). + +### 3. Internal Plumbing + +**Changes per type:** | Type | Change | |------|--------| | `filter_parameter.rs` | Replace `VectorFilter` + `default_vector_filter()` with `Predicate`, `GraphMode`, `SearchPlan`. Move file from `build/configuration/` to `search/`. | -| `DiskSearchStrategy` | Old: `vector_filter: &'a dyn Fn`. New: `predicate: Option<&'a Predicate>` + `beta: Option`, projected from `plan` at construction. | -| `DiskAccessor` | Carries `predicate: Option<&'a Predicate>` + `beta: Option`. | -| `DiskAccessor::pq_distances` | ~4 added lines: apply `beta` when both fields are `Some`. | -| `RerankAndFilter` | Old: `filter: &'a dyn Fn`. New: `filter: Option<&'a Predicate>`. | -| `search_internal` | `(vector_filter, is_flat_search)` → `plan: &SearchPlan`; dispatches on top-level variant only. | -| `search_strategy` | The **only** site that introspects `GraphMode` (one exhaustive match producing `(predicate, beta)`). | -| `flat_search` path | Drop redundant `vector_filter` arg; read predicate from `plan`. | -| `expand_beam`, `distances_unordered` | **No change.** | +| `DiskSearchStrategy` | `vector_filter: &'a dyn Fn` → carries `predicate: Option<&'a Predicate>` + `beta: Option`, projected from `plan` at construction | +| `DiskAccessor` | Carries `predicate: Option<&'a Predicate>` + `beta: Option` (both extracted from `plan` at construction) | +| `DiskAccessor::pq_distances` | ~4 lines: apply `beta` when both fields are `Some` | +| `DiskAccessor::new` | Accept `predicate` + `beta` (extracted by `search_accessor()`) | +| `RerankAndFilter` | `filter: &'a dyn Fn` → `filter: Option<&'a Predicate>` | +| `search_internal` | `(vector_filter, is_flat_search)` → `plan: &SearchPlan`; dispatches on top-level variant only | +| `search_strategy` | The **only** site that introspects `GraphMode`; produces `(predicate, beta)` for the strategy fields | +| `flat_search` path | Drop redundant `vector_filter` arg; read predicate from `plan` | +| `search()` | Replace `vector_filter` + `is_flat_search` with `plan: SearchPlan` | +| `expand_beam` | **No change** | +| `distances_unordered` | **No change** | + +**No `Arc`, no allocation on the no-filter paths.** `Predicate` is `Box`; `DiskAccessor::predicate` is `Option<&'a Predicate>`. For `SearchPlan::graph()` (i.e. `GraphMode::Unfiltered`) and `SearchPlan::flat()`, both `predicate` and `beta` are `None` — no `Box` is constructed. + +### 4. Core Implementation Changes + +#### 4.1 `search()` — single dispatch parameter -The "project at the boundary" decision matters: every match on `GraphMode` happens in `search_strategy`. Downstream code (`DiskAccessor`, `RerankAndFilter`, `flat_search`) sees only the projected `(Option<&Predicate>, Option)` pair. Adding a new `GraphMode` variant later means touching one match arm, not three or four. +```rust +pub fn search(&self, ..., plan: SearchPlan) -> ANNResult> { + self.search_internal(query, ..., &plan) +} +``` + +#### 4.2 `search_strategy` projection — the single `GraphMode` match site + +`search_strategy` is the **only** place that introspects `GraphMode`'s variants. One exhaustive match produces `(predicate, beta)`; the strategy carries them as fields. Every downstream consumer reads `strategy.predicate` and `strategy.beta` — no further variant matching anywhere. + +```rust +fn search_strategy<'a>( + &'a self, + query: &'a [Data::VectorDataType], + plan: &'a SearchPlan, +) -> DiskSearchStrategy<'a, ...> { + let (predicate, beta) = match plan { + SearchPlan::FlatScan { filter } => (filter.as_ref(), None), + SearchPlan::Graph(GraphMode::Unfiltered) => (None, None), + SearchPlan::Graph(GraphMode::PostFilter(p)) => (Some(p), None), + SearchPlan::Graph(GraphMode::BetaFilter { predicate, beta }) => + (Some(predicate), Some(*beta)), + }; + DiskSearchStrategy { predicate, beta, query, ... } +} +``` -### `pq_distances` — where beta applies +#### 4.3 `pq_distances()` — apply `beta` when both fields are `Some` +**Current** ([disk_provider.rs:581-586](../diskann-disk/src/search/provider/disk_provider.rs)): ```rust -// Before (disk_provider.rs:581-586) for (i, id) in ids.iter().enumerate() { let distance = self.scratch.pq_scratch.aligned_dist_scratch[i]; f(distance, *id); } +``` -// After +**Proposed**: +```rust for (i, id) in ids.iter().enumerate() { let mut distance = self.scratch.pq_scratch.aligned_dist_scratch[i]; if let (Some(beta), Some(predicate)) = (self.beta, self.predicate) { @@ -190,67 +346,152 @@ for (i, id) in ids.iter().enumerate() { } ``` -When either field is `None`, the `if let` short-circuits — zero overhead on the common cases. +When `beta` or `predicate` is `None`, the `if let` short-circuits — zero overhead on cases 3 (`Unfiltered`) and 4 (`PostFilter`). -### Caller migration +#### 4.4 `search_internal()` — dispatch by top-level variant -| Today `(vector_filter, is_flat_search)` | New `plan` | -|---|---| -| `(None, false)` | `SearchPlan::graph()` | -| `(None, true)` | `SearchPlan::flat()` | -| `(Some(p), false)` | `SearchPlan::graph_with(GraphMode::post_filter(p))` | -| `(Some(p), true)` | `SearchPlan::flat_filtered(p)` | -| (not expressible) | `SearchPlan::graph_with(GraphMode::beta_filter(p, β))` | +`search_internal` dispatches by the top-level `SearchPlan` variant only — the `GraphMode` match already happened in `search_strategy`. + +```rust +pub(crate) fn search_internal( + &self, ..., plan: &SearchPlan, +) -> ANNResult { + let strategy = self.search_strategy(query, plan); + let stats = match plan { + SearchPlan::FlatScan { filter } => { + let accept_all: &(dyn Fn(u32) -> bool + Send + Sync) = &|_| true; + let accept: &(dyn Fn(u32) -> bool + Send + Sync) = match filter { + Some(p) => &**p, + None => accept_all, + }; + self.runtime.block_on(self.index.flat_search( + &strategy, &DefaultContext, strategy.query, accept, + &Knn::new(k, l, beam_width)?, &mut result_output_buffer, + ))? + } + SearchPlan::Graph(_) => { + let knn = Knn::new(k, l, beam_width)?; + self.runtime.block_on(self.index.search( + knn, &strategy, &DefaultContext, strategy.query, &mut result_output_buffer, + ))? + } + }; + ... +} +``` + +The `Graph(_)` arm doesn't need to look at the `GraphMode` — `pq_distances` already knows about `beta` via `strategy`, and `RerankAndFilter` already knows about `predicate` via `strategy`. + +#### 4.5 `RerankAndFilter` — optional predicate + +```rust +pub struct RerankAndFilter<'a> { + filter: Option<&'a Predicate>, +} -The benchmark layer constructs `SearchPlan` once at the boundary; the `--is_flat_search` flag is removed from the disk-index input schema. JSON input schemas that still carry `is_flat_search` are migrated as a separate config-only step. +// in DiskSearchStrategy::default_post_processor(): +fn default_post_processor(&self) -> RerankAndFilter<'_> { + RerankAndFilter::new(self.predicate) +} -### Where beta does and doesn't apply +// in RerankAndFilter::post_process(): +.filter(|id| self.filter.map_or(true, |f| f(*id))) +``` + +`map_or(true, ...)` short-circuits with no closure invocation when `filter` is `None`. + +### 5. Where Beta Is / Isn't Applied | Location | Beta applied? | Rationale | |----------|--------------|-----------| -| `pq_distances` | **Yes**, only on `GraphMode::BetaFilter` | Biases beam toward matching vectors during graph traversal | -| `ensure_loaded` (full-precision cache) | No | Cache must hold true distances for honest reranking | -| `RerankAndFilter::post_process` | No | Uses true distances; predicate is hard-filter only | +| `pq_distances()` | **Yes**, only on `GraphMode::BetaFilter` | Biases beam toward matching vectors during graph traversal | +| `ensure_loaded()` (full-precision cache) | No | Cache must hold true distances for honest reranking | +| `RerankAndFilter::post_process()` | No | Uses true distances; predicate is hard-filter only | | `flat_search` path | No | Type system forbids it — `FlatScan` has no `GraphMode` field | -### Extensibility — adding a new graph algorithm +### 6. Data Flow + +``` +search(plan: SearchPlan) + ▼ +search_internal(plan: &SearchPlan) + │ + ├─► step 1: strategy = search_strategy(query, plan) + │ // exhaustive match on GraphMode projects (predicate, beta) once; + │ // strategy carries the projection — nothing downstream reads GraphMode + │ + └─► step 2: dispatch on top-level variant + │ + ├─[FlatScan { filter }]─► index.flat_search(..., accept-or-accept-all, ...) + │ (no GraphMode on this path; type system forbids beta) + │ + └─[Graph(_)]─► index.search(Knn, &strategy, ctx, query, output) + // index.search drives the traversal and calls back into the + // strategy on demand: + │ + ├─► strategy.search_accessor(...) → DiskAccessor + │ reads strategy.predicate, strategy.beta + │ │ + │ └─► accessor.pq_distances() — invoked per beam expansion + │ if let (Some(beta), Some(p)) = (self.beta, self.predicate) { + │ if p(*id) { distance *= beta } + │ } + │ + └─► strategy.default_post_processor() → RerankAndFilter + // invoked once on the final candidate set + filter = strategy.predicate +``` + +### 7. Files Modified + +| File | Scope of change | +|------|-----------------| +| `diskann-disk/src/search/filter_parameter.rs` (moved from `build/configuration/`) | New types: `Predicate`, `GraphMode`, `SearchPlan` replace `VectorFilter` and `default_vector_filter()` | +| `diskann-disk/src/search/provider/disk_provider.rs` | `DiskSearchStrategy`, `DiskAccessor`, `pq_distances`, `RerankAndFilter`, `search_internal`, `search_strategy`, `flat_search`, `search()`, internal test call sites | +| `diskann-benchmark/src/backend/disk_index/search.rs` | Build `SearchPlan` at the boundary; drop `is_flat_search` parameter | +| `tools/search_disk_index.rs` (or equivalent) | Same closure-to-`SearchPlan` replacement | +| Benchmark input JSON schemas | Remove `is_flat_search`; encode the plan in inputs instead (config-only, no code dependency after the backend update) | + +### 8. Backward Compatibility + +- **Existing no-filter callers**: replace `vector_filter: None, is_flat_search: false` with `SearchPlan::graph()` — identical behavior, no allocation. +- **Existing flat-scan-no-filter callers** (benchmark recall baseline): replace `vector_filter: None, is_flat_search: true` with `SearchPlan::flat()`. +- **Existing filtered callers (graph + post-filter)**: replace `Some(Box::new(|vid| bm.contains(vid))), is_flat_search: false` with `SearchPlan::graph_with(GraphMode::post_filter(move |id| bm.contains(id)))`. +- **Existing flat-scan-with-filter callers**: replace `Some(Box::new(|vid| bm.contains(vid))), is_flat_search: true` with `SearchPlan::flat_filtered(move |id| bm.contains(id))`. +- **`search_internal` is `pub(crate)`** — no external API breakage beyond `search()`. +- **Zero overhead on the no-filter graph path** — `SearchPlan::graph()` constructs no `Box`; `pq_distances`' `if let` does not match; `RerankAndFilter::filter` is `None` and the `.filter` call short-circuits. + +### 9. Extensibility — adding a new graph algorithm Two extension axes, both compiler-enforced: | Add… | Where it lives | `match` sites to update | |------|-----|---| | **New top-level search class** (e.g. range search) | New `SearchPlan` variant | `search_internal` dispatch | -| **New graph algorithm or beam modifier** | New `GraphMode` variant | `search_strategy` projection match (Rust exhaustiveness check forces the update) | +| **New graph algorithm or beam modifier** | New `GraphMode` variant | `search_strategy` projection match; `Graph(_)` arm in `search_internal` if invocation differs | -Worked example — adding `Multihop` (which already exists in `diskann::graph::search::multihop_search` and takes `&dyn QueryLabelProvider`): +### 10. Validation -```rust -// In GraphMode: -Multihop { predicate: Predicate }, -``` - -Three localized edits: - -1. Add the `GraphMode::Multihop { predicate }` arm to the projection match in `search_strategy`: `(Some(predicate), None)`. `RerankAndFilter` post-filters automatically; no beta. -2. Add a `ClosureAsLabelProvider` wrapper local to the disk crate, since `MultihopSearch::new` expects `&dyn QueryLabelProvider`: +- `beta` in `GraphMode::BetaFilter { beta }` must be in $(0, 1]$ — `GraphMode::beta_filter()` is the only constructor and panics otherwise. $\beta > 1$ would penalize matches (opposite of intent); $\beta \leq 0$ is nonsensical. +- `DiskAccessor` remains `Send` — `Predicate`'s `Send + Sync` bound propagates. +- No allocation on the `SearchPlan::graph()` and `SearchPlan::flat()` paths — both produce variants carrying `None` predicates. +- The type system forbids beta on the `FlatScan` path (no `GraphMode` field) and beta without a predicate (`BetaFilter` carries it inline). - ```rust - struct ClosureAsLabelProvider<'a>(&'a Predicate); +### 11. Testing - impl std::fmt::Debug for ClosureAsLabelProvider<'_> { - fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { - f.debug_struct("ClosureAsLabelProvider").finish_non_exhaustive() - } - } +- **Existing tests pass unchanged in spirit**: old `(None, false)` → `SearchPlan::graph()`; `(None, true)` → `SearchPlan::flat()`; `(Some(p), false)` → `SearchPlan::graph_with(GraphMode::post_filter(p))`; `(Some(p), true)` → `SearchPlan::flat_filtered(p)`. +- **Case 5 regression (beta-biased graph)**: synthetic graph where the medoid's BFS frontier provably misses a target cluster unless beta biases it. Verify `BetaFilter` returns IDs that `PostFilter` (same predicate, no beta) misses. Stat-based "matching IDs appear" assertions are flaky — the fixture must make beam ordering deterministic. +- **Recall sweep**: case 4 vs. case 5 across several $\beta$ values on a filter-selective workload. Establishes the recall-vs-effort baseline. +- **Validation**: `GraphMode::beta_filter(p, β)` panics for $\beta \notin (0, 1]$. +- **No-filter zero-cost**: counter assertion that case 3 (`SearchPlan::graph()`) invokes no closure in `pq_distances` or `RerankAndFilter`. - impl QueryLabelProvider for ClosureAsLabelProvider<'_> { - fn is_match(&self, id: u32) -> bool { (self.0)(id) } - } - ``` +### 12. Non-Goals -3. Add a `match` arm in `search_internal` (or refine the `Graph(_)` arm) that calls `MultihopSearch::new(Knn, &adapter)` instead of plain `Knn`. - -The compiler refuses to build until every `match` on `GraphMode` handles `Multihop`. +- **`BetaFilter` composability.** Beta application stays in `DiskAccessor::pq_distances` rather than moving to the `QueryComputer` path. The disk path implements the beta algorithm independently from the in-memory `BetaFilter` strategy. +- **Adopting `QueryLabelProvider` at the user-facing API.** Closures are the primary representation; future graph algorithms that internally require `&dyn QueryLabelProvider` adapt at the boundary with a thin wrapper. +- **Beta on `flat_search`.** Brute-force enumeration doesn't benefit from traversal biasing; the type system forbids it (`FlatScan` has no `GraphMode` field). +- **`VectorIdType` genericity.** `Predicate` pins `u32`. The disk path already constrains `Data::VectorIdType = u32` ([disk_provider.rs:548-557](../diskann-disk/src/search/provider/disk_provider.rs)); a future `u64` ID type would touch this API surface — accepted cost. +- **Additional `GraphMode` variants in this change.** The enum is built for extension (§9), but each new variant should land with its own consumer integration, tests, and (for traversal modifiers) recall data. ## Trade-offs @@ -286,26 +527,12 @@ The compiler refuses to build until every `match` on `GraphMode` handles `Multih ## Benchmark Results -Not yet collected. The primary risk is whether beta-biased graph search delivers a measurable recall improvement over post-filter on filter-selective workloads, and at what `β` the cost crossover sits. Planned experiments: - -1. **Recall vs. effort sweep** — case 4 (`PostFilter`) vs. case 5 (`BetaFilter`) across `β ∈ {0.3, 0.5, 0.7, 0.9}` on workloads with selectivity ∈ {1%, 10%, 50%}. Establishes when beta is worth its traversal cost. -2. **No-filter zero-cost regression** — counter assertion that case 3 (`SearchPlan::graph()`) invokes no closure in `pq_distances` or `RerankAndFilter`. -3. **`is_flat_search=true, vector_filter=None` baseline parity** — confirm `SearchPlan::flat()` matches today's `(None, true)` performance and recall exactly. - -Results will be added to this RFC before merge. +N/A ## Future Work -- [ ] **`BetaFilter` composability.** Move beta application from `DiskAccessor::pq_distances` into the `QueryComputer` path, then add an `IdFilter → QueryLabelProvider` adapter so the disk strategy can be wrapped by the in-memory `BetaFilter` strategy. Out of scope here. -- [ ] **`Multihop` integration on the disk path.** Add as a new `GraphMode` variant per the worked example in §Extensibility. -- [ ] **`u64` `VectorIdType` support.** `Predicate` pins `u32` today, matching the disk path's existing `Data::VectorIdType = u32` constraint at [disk_provider.rs:548-557](../diskann-disk/src/search/provider/disk_provider.rs). A future `u64` ID type would require generalizing `Predicate` and the projection. -- [ ] **Migration of benchmark JSON input schemas.** Remove `is_flat_search` from the disk-index schema and from the example/perf-test JSON files. Config-only, no code dependency after the backend update. +- [ ] **Align disk and in-memory beta-filter behavior.** After this change, `GraphMode::BetaFilter` on the disk path applies *both* beta-biased traversal *and* a hard post-filter (matching IDs survive, non-matching IDs are dropped in `RerankAndFilter`). The in-memory `BetaFilter` strategy ([diskann-providers/.../betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs)) only applies the beta bias — it does not post-filter, so non-matching IDs can still appear in results. A user asking for "beta filter" gets different result sets depending on which side they're on. Future work: pick one semantics (most likely "bias + post-filter," matching the disk side) and align the in-memory strategy, or document the divergence explicitly in both APIs. ## References -1. [docs/disk-beta-filter-with-query-label-provider.md](../docs/disk-beta-filter-with-query-label-provider.md) — full design doc this RFC summarizes. -2. [docs/disk-beta-filter.md](../docs/disk-beta-filter.md) — earlier closure-based beta filter design (superseded by this RFC). -3. [diskann-providers/src/model/graph/provider/layers/betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs) — in-memory `BetaFilter` strategy; the disk path implements the algorithm independently. -4. [diskann/src/graph/search/multihop_search.rs](../diskann/src/graph/search/multihop_search.rs) — `MultihopSearch`; the worked extensibility example. -5. [diskann-disk/src/search/provider/disk_provider.rs](../diskann-disk/src/search/provider/disk_provider.rs) — current `pq_distances`, `RerankAndFilter`, `DiskSearchStrategy`, `DiskAccessor`. -6. [diskann-benchmark/src/backend/disk_index/search.rs](../diskann-benchmark/src/backend/disk_index/search.rs) — current benchmark call site (lines 267-281). +N/A From 049d651e13384a62fe9bd1fed6f0236528dcb814 Mon Sep 17 00:00:00 2001 From: yaohongdeng Date: Mon, 25 May 2026 13:58:47 +0800 Subject: [PATCH 3/7] RFC: rename title to "Beta Filter For Disk Search" Co-Authored-By: Claude Opus 4.7 (1M context) --- rfcs/00000-disk-beta-filter.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rfcs/00000-disk-beta-filter.md b/rfcs/00000-disk-beta-filter.md index ce0e7434f..c12e8c280 100644 --- a/rfcs/00000-disk-beta-filter.md +++ b/rfcs/00000-disk-beta-filter.md @@ -1,4 +1,4 @@ -# Search Plan API for Disk Search (with Beta-Biased Filtering) +# Beta Filter For Disk Search | | | |------------------|--------------------------------| From c3ae608683531765920f0844d70750efa731946a Mon Sep 17 00:00:00 2001 From: yaohongdeng Date: Mon, 25 May 2026 14:01:27 +0800 Subject: [PATCH 4/7] RFC: rename file to match PR #1101 Co-Authored-By: Claude Opus 4.7 (1M context) --- rfcs/{00000-disk-beta-filter.md => 01101-disk-beta-filter.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename rfcs/{00000-disk-beta-filter.md => 01101-disk-beta-filter.md} (100%) diff --git a/rfcs/00000-disk-beta-filter.md b/rfcs/01101-disk-beta-filter.md similarity index 100% rename from rfcs/00000-disk-beta-filter.md rename to rfcs/01101-disk-beta-filter.md From 5ffc2099c916b3670b7ba0bef5cd332e40fdf724 Mon Sep 17 00:00:00 2001 From: yaohongdeng Date: Tue, 26 May 2026 14:35:39 +0800 Subject: [PATCH 5/7] RFC: incorporate review feedback MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Add lifetime parameter to Predicate/SearchPlan/GraphMode so callers can pass closures borrowing non-'static data, matching the existing VectorFilter<'a, Data> flexibility. - Make GraphMode::beta_filter fallible (returns Result) so invalid beta from JSON/CLI surfaces as a validation error, not a process crash. - Migrate the benchmark JSON schemas in the same PR (no external consumers; no grace period needed). - Restructure for outside-in narrative: search() signature → projection function → DiskSearchStrategy struct change. - Trim implementation-detail sections (pq_distances code block, data-flow diagram, plumbing table, testing list) that belong in the implementation PR. - Rename "The five cases" to "Supported configurations". - Add a Future Work item: align disk vs. in-memory BetaFilter semantics (disk applies bias + post-filter; in-memory only biases). Co-Authored-By: Claude Opus 4.7 (1M context) --- rfcs/01101-disk-beta-filter.md | 497 ++++++++------------------------- 1 file changed, 112 insertions(+), 385 deletions(-) diff --git a/rfcs/01101-disk-beta-filter.md b/rfcs/01101-disk-beta-filter.md index c12e8c280..25f1bd4c4 100644 --- a/rfcs/01101-disk-beta-filter.md +++ b/rfcs/01101-disk-beta-filter.md @@ -5,19 +5,19 @@ | **Authors** | yaohongdeng | | **Contributors** | | | **Created** | 2026-05-21 | -| **Updated** | 2026-05-21 | +| **Updated** | 2026-05-26 | ## Summary -Adds support for **beta-biased filtering on the disk search path** by introducing a `SearchPlan` enum that replaces the existing `(vector_filter, is_flat_search)` parameter pair on `searcher.search()`. The new enum is hierarchical (`SearchPlan { FlatScan, Graph(GraphMode) }`), encodes today's four search configurations plus the new beta-biased variant as one named value each, makes invalid combinations unrepresentable, and gives future filter algorithms a single extension point that doesn't require further growing the public `search()` signature. +Adds **beta-biased filtering on the disk search path** by introducing a `SearchPlan<'a>` enum that replaces the existing `(vector_filter, is_flat_search)` parameter pair on `searcher.search()`. The new enum is hierarchical (`SearchPlan { FlatScan, Graph(GraphMode) }`), encodes today's four search configurations plus the new beta-biased variant as one named value each, makes invalid combinations unrepresentable, and gives future filter algorithms a single extension point that doesn't require further growing the public `search()` signature. ## Motivation ### Background -The in-memory side of DiskANN has a `BetaFilter` strategy ([diskann-providers/src/model/graph/provider/layers/betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs)) that biases beam traversal toward vectors matching a label predicate, by multiplying their distances by a factor `β ∈ (0, 1]`. It's documented and tested for the in-memory `FullPrecision` strategy. +The in-memory side of DiskANN has a `BetaFilter` strategy ([diskann-providers/src/model/graph/provider/layers/betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs)) that biases beam traversal toward vectors matching a label predicate, by multiplying their distances by a factor `β ∈ (0, 1]`. The disk search path has no equivalent. -The disk search path has no equivalent. The disk API today exposes filtering as a raw closure type alias paired with a separate boolean for flat vs. graph dispatch: +The disk API today exposes filtering as a raw closure type alias paired with a separate boolean for flat vs. graph dispatch: ```rust // diskann-disk/src/build/configuration/filter_parameter.rs @@ -27,138 +27,98 @@ pub type VectorFilter<'a, Data> = // searcher.search(..., vector_filter: Option, is_flat_search: bool) ``` -A raw closure has no field to attach `β` to, and bolting beta on as a third `Option` parameter would create meaningless combinations (e.g. `is_flat_search=true` + `beta=Some(_)`) that the type system can't reject. - -Looking forward, `MultihopSearch` already exists in [diskann/src/graph/search/multihop_search.rs](../diskann/src/graph/search/multihop_search.rs) but the disk API has no way to select it. Each new graph algorithm under the current shape would mean another boolean flag and another runtime validation rule. +`MultihopSearch` already exists in [diskann/src/graph/search/multihop_search.rs](../diskann/src/graph/search/multihop_search.rs) but the disk API has no integration point for it either. ### Problem Statement -1. **Beta-biased graph search is unavailable on the disk path.** Queries that would benefit from biasing beam traversal toward labelled vectors must fall back to plain post-filtering on the disk side, with no way to express the beta variant. +1. **Beta-biased graph search is unavailable on the disk path.** -2. **The current API shape can't carry beta cleanly.** A raw closure has no place to attach `β`. Threading `beta: Option` through `search()` → `search_internal()` → `DiskAccessor` as a fourth parameter creates orthogonality problems: beta is only meaningful in graph search, and `(vector_filter, is_flat_search, beta)` as three independent inputs admits 2 of 8 meaningless combinations the type system can't catch. +2. **The current API shape can't carry beta cleanly.** A raw closure has no place to attach `β`. Threading `beta: Option` as a fourth parameter creates orthogonality problems: beta is only meaningful in graph search, and `(vector_filter, is_flat_search, beta)` as three independent inputs admits meaningless combinations (`is_flat_search=true` + `beta=Some(_)`) the type system can't catch. -3. **No clean integration point for future filter algorithms.** `MultihopSearch` exists in `diskann` but the disk API has no extension point — adding it (or any future algorithm) would mean another flag and more cross-field validation. +3. **No clean integration point for future filter algorithms.** ### Goals -1. Add beta-biased disk graph search as a first-class capability — equivalent in effect to the in-memory `BetaFilter` strategy. -2. Replace the `(vector_filter, is_flat_search)` parameter pair with a single value where every supported configuration is one named variant or constructor. -3. Make invalid combinations (beta on flat scan, beta without a predicate) unrepresentable by construction — caught at compile time, not by runtime assertion. -4. Provide a single extension point so future filter algorithms (`MultihopSearch`, others) can be added without changing the public `search()` signature. -5. Preserve zero allocation and zero per-iteration overhead on the no-filter graph path (today's most common case). +1. Add beta-biased disk graph search as a first-class capability. +2. Provide a single extension point for future filter algorithms without changing the public `search()` signature. ## Proposal -### 1. Motivation - -The disk search filter is a raw `Box` type alias paired with a separate `is_flat_search: bool` flag: - -```rust -// diskann-disk/src/build/configuration/filter_parameter.rs -pub type VectorFilter<'a, Data> = - Box::VectorIdType) -> bool + Send + Sync + 'a>; - -// searcher.search(..., vector_filter: Option, is_flat_search: bool) -``` - -Three problems with this shape: - -1. **A raw closure can't carry `beta`.** Adding beta-biased graph search requires threading another parameter through `search()` → `search_internal()` → `DiskAccessor`. - -2. **Adding beta breaks the existing orthogonality between `is_flat_search` and `vector_filter`.** Today the two flags are genuinely orthogonal: all four `(is_flat_search, vector_filter)` combinations are meaningful, and the benchmark exposes them as independent inputs ([diskann-benchmark/src/inputs/disk.rs:83-85](../diskann-benchmark/src/inputs/disk.rs)). But beta only exists in graph search — it biases beam expansion, and a flat scan has no beam. Adding beta as a third independent parameter creates meaningless combinations (e.g. `is_flat_search=true` + `beta=Some(_)`) that the type system can't reject. - -3. **No integration point for future graph algorithms.** `MultihopSearch` already exists in `diskann/src/graph/search/multihop_search.rs`, but the disk API has no way to select it. Each new algorithm would mean another boolean flag. - -This design replaces both flags with a single `plan: SearchPlan` enum. Each of the five supported configurations is one explicit variant or constructor; future graph algorithms slot in as new `GraphMode` variants without changing the public `search()` signature; the compiler enforces exhaustiveness at every dispatch site. - -### 2. API Change - -#### 2.1 The `Predicate` type alias - -The disk path identifies filtered IDs through a closure, not a trait. There is no `IdFilter` trait, no `BitmapProvider` wrapper, and no `QueryLabelProvider` dependency. +### Types ```rust // diskann-disk/src/search/filter_parameter.rs -// (Moved from build/configuration/ — this is a search-time concept, not build-time.) - -pub type Predicate = Box bool + Send + Sync>; -``` +// (Moved from build/configuration/ — search-time concept, not build-time.) -**Why a closure, not a trait?** -- **Ergonomic at call sites.** `|id| bm.contains(id)` is shorter and clearer than `Arc::new(BitmapProvider::new(bm))`. Callers using `HashSet`, sorted `Vec`, or any custom backing structure write a closure directly. -- **No upstream coupling.** No dependency on `diskann::graph::index::QueryLabelProvider` — upstream signature changes don't reach the disk path. If a future graph algorithm (e.g. `MultihopSearch`) requires `&dyn QueryLabelProvider`, it adapts at the boundary with a thin local wrapper. -- **No allocation when absent.** `Option` is `None` for the no-filter cases; no closure object is constructed. +pub type Predicate<'a> = Box bool + Send + Sync + 'a>; -**Why `Fn(u32)` and not `Fn(&u32)`?** `u32` is 4 bytes and `Copy`; passing by value is cheaper than an 8-byte reference, eliminates `*` derefs at every call site, and matches the codebase convention of `QueryLabelProvider::is_match(id: V)` ([diskann/src/graph/index.rs:82](../diskann/src/graph/index.rs)). - -**Why `Box`, not `Arc`?** Predicates are owned by `SearchPlan` for the duration of one search call. They aren't shared across queries. `Box` is sufficient and avoids atomic refcount traffic on every clone/drop. - -#### 2.2 `SearchPlan` and `GraphMode` enums - -**Current** (`filter_parameter.rs`): -```rust -pub type VectorFilter<'a, Data> = - Box::VectorIdType) -> bool + Send + Sync + 'a>; -``` - -**Proposed**: replace the type alias with two hierarchical enums. `SearchPlan` makes the top-level graph-vs-flat-scan break; `GraphMode` describes the variant on the graph path. - -```rust /// Top-level search plan: graph traversal vs. linear scan. -pub enum SearchPlan { +pub enum SearchPlan<'a> { /// Brute-force linear scan. `Some(p)` applies `p` inline; `None` /// scans every vector (recall baseline). - FlatScan { filter: Option }, + FlatScan { filter: Option> }, /// Graph traversal; `GraphMode` picks the algorithm and any modifier. - Graph(GraphMode), + Graph(GraphMode<'a>), } /// Graph-search variant. Invalid combinations (e.g. beta without a predicate) /// are unrepresentable. -pub enum GraphMode { +pub enum GraphMode<'a> { /// Plain greedy beam. Unfiltered, /// Greedy beam + hard post-filter (applied in `RerankAndFilter`). - /// Traversal identical to `Unfiltered`. - PostFilter(Predicate), + PostFilter(Predicate<'a>), /// Beta-biased beam: matching vectors' PQ distances multiplied by /// `beta` ∈ (0, 1] in `pq_distances`. Predicate also post-filters. - BetaFilter { predicate: Predicate, beta: f32 }, + BetaFilter { predicate: Predicate<'a>, beta: f32 }, // Future graph algorithms slot in here as new variants. e.g.: - // Multihop { predicate: Predicate }, + // Multihop { predicate: Predicate<'a> }, +} + +#[derive(Debug, thiserror::Error)] +pub enum BetaError { + #[error("beta must be in (0, 1], got {0}")] + OutOfRange(f32), } -impl GraphMode { +impl<'a> GraphMode<'a> { pub fn post_filter(predicate: F) -> Self - where F: Fn(u32) -> bool + Send + Sync + 'static { + where F: Fn(u32) -> bool + Send + Sync + 'a { Self::PostFilter(Box::new(predicate)) } - pub fn beta_filter(predicate: F, beta: f32) -> Self - where F: Fn(u32) -> bool + Send + Sync + 'static { - assert!(beta > 0.0 && beta <= 1.0, "beta must be in (0, 1]"); - Self::BetaFilter { predicate: Box::new(predicate), beta } + /// Fallible — returns `BetaError::OutOfRange` if `beta` is outside (0, 1]. + /// Designed for callers that read `beta` from external input (JSON config, + /// CLI args). Programmer-supplied literals can `.unwrap()` or `?`. + pub fn beta_filter(predicate: F, beta: f32) -> Result + where F: Fn(u32) -> bool + Send + Sync + 'a { + if !(beta > 0.0 && beta <= 1.0) { + return Err(BetaError::OutOfRange(beta)); + } + Ok(Self::BetaFilter { predicate: Box::new(predicate), beta }) } } -impl SearchPlan { +impl<'a> SearchPlan<'a> { pub fn flat() -> Self { Self::FlatScan { filter: None } } pub fn flat_filtered(predicate: F) -> Self - where F: Fn(u32) -> bool + Send + Sync + 'static { + where F: Fn(u32) -> bool + Send + Sync + 'a { Self::FlatScan { filter: Some(Box::new(predicate)) } } pub fn graph() -> Self { Self::Graph(GraphMode::Unfiltered) } - pub fn graph_with(mode: GraphMode) -> Self { Self::Graph(mode) } + pub fn graph_with(mode: GraphMode<'a>) -> Self { Self::Graph(mode) } } ``` -**The five cases:** +The lifetime parameter `'a` carries the borrow scope of any captured data in the predicate. For closures that own (move in) their captures, callers don't need to write the lifetime — Rust infers `'static`. Callers that need to borrow from a stack frame for a single `search()` call get a shorter inferred `'a`, matching today's `VectorFilter<'a, Data>` flexibility. + +### Supported configurations | # | Case | `SearchPlan` value | |---|------|---| @@ -166,143 +126,27 @@ impl SearchPlan { | 2 | Flat scan + filter | `SearchPlan::flat_filtered(\|id\| bm.contains(id))` | | 3 | Graph, no filter | `SearchPlan::graph()` | | 4 | Graph + post-filter | `SearchPlan::graph_with(GraphMode::post_filter(\|id\| bm.contains(id)))` | -| 5 | Graph + beta + post-filter | `SearchPlan::graph_with(GraphMode::beta_filter(\|id\| bm.contains(id), 0.5))` | +| 5 | Graph + beta + post-filter (**new**) | `SearchPlan::graph_with(GraphMode::beta_filter(\|id\| bm.contains(id), 0.5)?)` | -**Key design decisions**: +### Key design decisions -- **Hierarchical, not flat.** `SearchPlan { FlatScan, Graph(GraphMode) }` separates the graph-vs-linear-scan categorical break (different access patterns, different code paths) from the choice of graph algorithm/modifier. A future `Multihop` slots into `GraphMode` as a sibling to `BetaFilter`, not as a top-level variant — they both traverse the graph. -- **Invalid states unrepresentable.** `BetaFilter` carries the predicate inline; the `FlatScan` path has no `GraphMode` field. Beta-without-predicate and beta-on-flat-scan are rejected at compile time, not by runtime assertion. -- **Project at the boundary.** `GraphMode` exposes constructors only — no `predicate()` or `beta()` accessors. The strategy projects `(Option<&Predicate>, Option)` from the plan via one exhaustive match at construction (§4.2). Blocks semantically meaningless calls (asking `beta()` on `Unfiltered`) and routes every new variant through a single site, compiler-enforced. +- **Hierarchical, not flat.** `SearchPlan { FlatScan, Graph(GraphMode) }` separates the graph-vs-linear-scan break (different code paths) from the choice of graph algorithm/modifier. Future graph algorithms slot into `GraphMode`, not the top-level enum. +- **Invalid states unrepresentable.** `BetaFilter` carries the predicate inline; the `FlatScan` path has no `GraphMode` field. Beta-without-predicate and beta-on-flat-scan are compile-time errors, not runtime asserts. +- **Project at the boundary.** `GraphMode` exposes constructors only — no accessors. `search_strategy()` projects `(Option<&Predicate>, Option)` from the plan via one exhaustive match. Adding a `GraphMode` variant updates exactly one match site. -#### 2.3 `search()` public signature +### `search()` signature -`is_flat_search: bool` and `vector_filter: Option` are **both removed** and replaced with a single `plan: SearchPlan` parameter. +The public entry point loses two parameters and gains one: ```rust // Before -pub fn search( - query: &[Data::VectorDataType], - return_list_size: u32, - search_list_size: u32, - beam_width: Option, - vector_filter: Option>, // Box - is_flat_search: bool, -) -> ANNResult> +pub fn search(..., vector_filter: Option>, is_flat_search: bool) -> ... // After -pub fn search( - query: &[Data::VectorDataType], - return_list_size: u32, - search_list_size: u32, - beam_width: Option, - plan: SearchPlan, -) -> ANNResult> +pub fn search(..., plan: SearchPlan<'_>) -> ... ``` -Dispatch is a single match on `plan` (see §4.3). `SearchPlan::graph()` is the default "graph, no filter" case — a unit variant inside a unit variant, no allocation. - -#### 2.4 Caller migration - -The benchmark today exposes `is_flat_search` and `vector_filters_file` as independent flags ([diskann-benchmark/src/backend/disk_index/search.rs:267-281](../diskann-benchmark/src/backend/disk_index/search.rs)). The four combinations map onto `SearchPlan` as follows: - -| Today `(vector_filter, is_flat_search)` | New `plan` | -|---|---| -| `(None, false)` | `SearchPlan::graph()` | -| `(None, true)` | `SearchPlan::flat()` | -| `(Some(p), false)` | `SearchPlan::graph_with(GraphMode::post_filter(p))` | -| `(Some(p), true)` | `SearchPlan::flat_filtered(p)` | -| (not expressible) | `SearchPlan::graph_with(GraphMode::beta_filter(p, β))` | - -The last row is **new** capability not available today. - -**Example: flat scan + hard filter migration** - -```rust -// Before — raw closure + separate boolean -let filter_list: Arc = /* bitmap of allowed IDs */; -let result = searcher.search( - query, - return_list_size, - search_list_size, - beam_width, - Some(Box::new(move |vid: &u32| filter_list.contains(*vid))), - true, // is_flat_search -)?; - -// After — single plan parameter, no boolean -let filter_list: Arc = /* bitmap of allowed IDs */; -let result = searcher.search( - query, - return_list_size, - search_list_size, - beam_width, - SearchPlan::flat_filtered(move |id| filter_list.contains(id)), -)?; -``` - -**Example: graph + post-filter migration** - -```rust -let filter_list: Arc = /* bitmap of allowed IDs */; -let result = searcher.search( - query, - return_list_size, - search_list_size, - beam_width, - SearchPlan::graph_with(GraphMode::post_filter(move |id| filter_list.contains(id))), -)?; -``` - -**Example: beta-biased graph search (new capability)** - -```rust -let active_ids: Arc = /* bitmap of non-deleted IDs */; -let result = searcher.search( - query, - return_list_size, - search_list_size, - beam_width, - SearchPlan::graph_with(GraphMode::beta_filter( - move |id| active_ids.contains(id), - 0.5, // beta: bias beam toward matching vectors - )), -)?; -``` - -**Benchmark CLI**: the `--is_flat_search` flag is removed from the disk-index benchmark input schema. The benchmark layer constructs `SearchPlan` from `(is_flat_search, vector_filters_file)` once at the boundary and passes only `SearchPlan` to `searcher.search()`. JSON input schemas that still carry `is_flat_search` should be migrated as a separate task (config-only, no code dependency after the backend update). - -### 3. Internal Plumbing - -**Changes per type:** - -| Type | Change | -|------|--------| -| `filter_parameter.rs` | Replace `VectorFilter` + `default_vector_filter()` with `Predicate`, `GraphMode`, `SearchPlan`. Move file from `build/configuration/` to `search/`. | -| `DiskSearchStrategy` | `vector_filter: &'a dyn Fn` → carries `predicate: Option<&'a Predicate>` + `beta: Option`, projected from `plan` at construction | -| `DiskAccessor` | Carries `predicate: Option<&'a Predicate>` + `beta: Option` (both extracted from `plan` at construction) | -| `DiskAccessor::pq_distances` | ~4 lines: apply `beta` when both fields are `Some` | -| `DiskAccessor::new` | Accept `predicate` + `beta` (extracted by `search_accessor()`) | -| `RerankAndFilter` | `filter: &'a dyn Fn` → `filter: Option<&'a Predicate>` | -| `search_internal` | `(vector_filter, is_flat_search)` → `plan: &SearchPlan`; dispatches on top-level variant only | -| `search_strategy` | The **only** site that introspects `GraphMode`; produces `(predicate, beta)` for the strategy fields | -| `flat_search` path | Drop redundant `vector_filter` arg; read predicate from `plan` | -| `search()` | Replace `vector_filter` + `is_flat_search` with `plan: SearchPlan` | -| `expand_beam` | **No change** | -| `distances_unordered` | **No change** | - -**No `Arc`, no allocation on the no-filter paths.** `Predicate` is `Box`; `DiskAccessor::predicate` is `Option<&'a Predicate>`. For `SearchPlan::graph()` (i.e. `GraphMode::Unfiltered`) and `SearchPlan::flat()`, both `predicate` and `beta` are `None` — no `Box` is constructed. - -### 4. Core Implementation Changes - -#### 4.1 `search()` — single dispatch parameter - -```rust -pub fn search(&self, ..., plan: SearchPlan) -> ANNResult> { - self.search_internal(query, ..., &plan) -} -``` - -#### 4.2 `search_strategy` projection — the single `GraphMode` match site +### `search_strategy` projection — the single `GraphMode` match site `search_strategy` is the **only** place that introspects `GraphMode`'s variants. One exhaustive match produces `(predicate, beta)`; the strategy carries them as fields. Every downstream consumer reads `strategy.predicate` and `strategy.beta` — no further variant matching anywhere. @@ -310,7 +154,7 @@ pub fn search(&self, ..., plan: SearchPlan) -> ANNResult> { fn search_strategy<'a>( &'a self, query: &'a [Data::VectorDataType], - plan: &'a SearchPlan, + plan: &'a SearchPlan<'a>, ) -> DiskSearchStrategy<'a, ...> { let (predicate, beta) = match plan { SearchPlan::FlatScan { filter } => (filter.as_ref(), None), @@ -323,207 +167,90 @@ fn search_strategy<'a>( } ``` -#### 4.3 `pq_distances()` — apply `beta` when both fields are `Some` - -**Current** ([disk_provider.rs:581-586](../diskann-disk/src/search/provider/disk_provider.rs)): -```rust -for (i, id) in ids.iter().enumerate() { - let distance = self.scratch.pq_scratch.aligned_dist_scratch[i]; - f(distance, *id); -} -``` - -**Proposed**: -```rust -for (i, id) in ids.iter().enumerate() { - let mut distance = self.scratch.pq_scratch.aligned_dist_scratch[i]; - if let (Some(beta), Some(predicate)) = (self.beta, self.predicate) { - if predicate(*id) { - distance *= beta; - } - } - f(distance, *id); -} -``` +The compiler's exhaustiveness check forces every new `GraphMode` variant to add exactly one arm here. `DiskAccessor::new` and `RerankAndFilter::new` consume `strategy.predicate` and `strategy.beta` directly — no further `GraphMode` knowledge downstream. -When `beta` or `predicate` is `None`, the `if let` short-circuits — zero overhead on cases 3 (`Unfiltered`) and 4 (`PostFilter`). +### `DiskSearchStrategy` struct change -#### 4.4 `search_internal()` — dispatch by top-level variant - -`search_internal` dispatches by the top-level `SearchPlan` variant only — the `GraphMode` match already happened in `search_strategy`. +The strategy stops carrying a single closure and starts carrying the two projected fields: ```rust -pub(crate) fn search_internal( - &self, ..., plan: &SearchPlan, -) -> ANNResult { - let strategy = self.search_strategy(query, plan); - let stats = match plan { - SearchPlan::FlatScan { filter } => { - let accept_all: &(dyn Fn(u32) -> bool + Send + Sync) = &|_| true; - let accept: &(dyn Fn(u32) -> bool + Send + Sync) = match filter { - Some(p) => &**p, - None => accept_all, - }; - self.runtime.block_on(self.index.flat_search( - &strategy, &DefaultContext, strategy.query, accept, - &Knn::new(k, l, beam_width)?, &mut result_output_buffer, - ))? - } - SearchPlan::Graph(_) => { - let knn = Knn::new(k, l, beam_width)?; - self.runtime.block_on(self.index.search( - knn, &strategy, &DefaultContext, strategy.query, &mut result_output_buffer, - ))? - } - }; - ... -} -``` - -The `Graph(_)` arm doesn't need to look at the `GraphMode` — `pq_distances` already knows about `beta` via `strategy`, and `RerankAndFilter` already knows about `predicate` via `strategy`. - -#### 4.5 `RerankAndFilter` — optional predicate +pub struct DiskSearchStrategy<'a, Data, ProviderFactory> +where + Data: GraphDataType, + ProviderFactory: VertexProviderFactory, +{ + io_tracker: IOTracker, + query: &'a [Data::VectorDataType], -```rust -pub struct RerankAndFilter<'a> { - filter: Option<&'a Predicate>, -} + // === Changed === + // Removed: vector_filter: &'a VectorFilter<'a, Data> + // Added: projected from `plan` once in search_strategy() + predicate: Option<&'a Predicate<'a>>, + beta: Option, -// in DiskSearchStrategy::default_post_processor(): -fn default_post_processor(&self) -> RerankAndFilter<'_> { - RerankAndFilter::new(self.predicate) + // === Unchanged === + vertex_provider_factory: &'a ProviderFactory, + scratch_pool: &'a Arc>>, } - -// in RerankAndFilter::post_process(): -.filter(|id| self.filter.map_or(true, |f| f(*id))) ``` -`map_or(true, ...)` short-circuits with no closure invocation when `filter` is `None`. - -### 5. Where Beta Is / Isn't Applied - -| Location | Beta applied? | Rationale | -|----------|--------------|-----------| -| `pq_distances()` | **Yes**, only on `GraphMode::BetaFilter` | Biases beam toward matching vectors during graph traversal | -| `ensure_loaded()` (full-precision cache) | No | Cache must hold true distances for honest reranking | -| `RerankAndFilter::post_process()` | No | Uses true distances; predicate is hard-filter only | -| `flat_search` path | No | Type system forbids it — `FlatScan` has no `GraphMode` field | +### Caller migration -### 6. Data Flow - -``` -search(plan: SearchPlan) - ▼ -search_internal(plan: &SearchPlan) - │ - ├─► step 1: strategy = search_strategy(query, plan) - │ // exhaustive match on GraphMode projects (predicate, beta) once; - │ // strategy carries the projection — nothing downstream reads GraphMode - │ - └─► step 2: dispatch on top-level variant - │ - ├─[FlatScan { filter }]─► index.flat_search(..., accept-or-accept-all, ...) - │ (no GraphMode on this path; type system forbids beta) - │ - └─[Graph(_)]─► index.search(Knn, &strategy, ctx, query, output) - // index.search drives the traversal and calls back into the - // strategy on demand: - │ - ├─► strategy.search_accessor(...) → DiskAccessor - │ reads strategy.predicate, strategy.beta - │ │ - │ └─► accessor.pq_distances() — invoked per beam expansion - │ if let (Some(beta), Some(p)) = (self.beta, self.predicate) { - │ if p(*id) { distance *= beta } - │ } - │ - └─► strategy.default_post_processor() → RerankAndFilter - // invoked once on the final candidate set - filter = strategy.predicate -``` - -### 7. Files Modified - -| File | Scope of change | -|------|-----------------| -| `diskann-disk/src/search/filter_parameter.rs` (moved from `build/configuration/`) | New types: `Predicate`, `GraphMode`, `SearchPlan` replace `VectorFilter` and `default_vector_filter()` | -| `diskann-disk/src/search/provider/disk_provider.rs` | `DiskSearchStrategy`, `DiskAccessor`, `pq_distances`, `RerankAndFilter`, `search_internal`, `search_strategy`, `flat_search`, `search()`, internal test call sites | -| `diskann-benchmark/src/backend/disk_index/search.rs` | Build `SearchPlan` at the boundary; drop `is_flat_search` parameter | -| `tools/search_disk_index.rs` (or equivalent) | Same closure-to-`SearchPlan` replacement | -| Benchmark input JSON schemas | Remove `is_flat_search`; encode the plan in inputs instead (config-only, no code dependency after the backend update) | +| Today `(vector_filter, is_flat_search)` | New `plan` | +|---|---| +| `(None, false)` | `SearchPlan::graph()` | +| `(None, true)` | `SearchPlan::flat()` | +| `(Some(p), false)` | `SearchPlan::graph_with(GraphMode::post_filter(p))` | +| `(Some(p), true)` | `SearchPlan::flat_filtered(p)` | +| (not expressible) | `SearchPlan::graph_with(GraphMode::beta_filter(p, β)?)` | -### 8. Backward Compatibility +Boundary code in the benchmark and tools constructs `SearchPlan` once and passes it down. The `--is_flat_search` benchmark flag and the corresponding `is_flat_search` field in benchmark JSON schemas (`diskann-benchmark/example/*.json`, `diskann-benchmark/perf_test_inputs/*.json`) are removed in the same change — no external consumers depend on the old schema, so no migration grace period is needed. -- **Existing no-filter callers**: replace `vector_filter: None, is_flat_search: false` with `SearchPlan::graph()` — identical behavior, no allocation. -- **Existing flat-scan-no-filter callers** (benchmark recall baseline): replace `vector_filter: None, is_flat_search: true` with `SearchPlan::flat()`. -- **Existing filtered callers (graph + post-filter)**: replace `Some(Box::new(|vid| bm.contains(vid))), is_flat_search: false` with `SearchPlan::graph_with(GraphMode::post_filter(move |id| bm.contains(id)))`. -- **Existing flat-scan-with-filter callers**: replace `Some(Box::new(|vid| bm.contains(vid))), is_flat_search: true` with `SearchPlan::flat_filtered(move |id| bm.contains(id))`. -- **`search_internal` is `pub(crate)`** — no external API breakage beyond `search()`. -- **Zero overhead on the no-filter graph path** — `SearchPlan::graph()` constructs no `Box`; `pq_distances`' `if let` does not match; `RerankAndFilter::filter` is `None` and the `.filter` call short-circuits. +### Where beta is/isn't applied -### 9. Extensibility — adding a new graph algorithm +| Location | Beta? | Why | +|----------|-------|-----| +| `pq_distances()` | **Yes**, only on `GraphMode::BetaFilter` | Biases beam toward matching vectors during traversal | +| Full-precision cache | No | Cache holds true distances for honest reranking | +| `RerankAndFilter::post_process()` | No | True distances; predicate is hard-filter only | +| `flat_search` path | No | Type system forbids it — `FlatScan` has no `GraphMode` field | -Two extension axes, both compiler-enforced: +### Extensibility | Add… | Where it lives | `match` sites to update | |------|-----|---| -| **New top-level search class** (e.g. range search) | New `SearchPlan` variant | `search_internal` dispatch | -| **New graph algorithm or beam modifier** | New `GraphMode` variant | `search_strategy` projection match; `Graph(_)` arm in `search_internal` if invocation differs | +| New top-level search class (e.g. range search) | New `SearchPlan` variant | `search_internal` dispatch | +| New graph algorithm or beam modifier | New `GraphMode` variant | `search_strategy` projection; `Graph(_)` arm in `search_internal` if invocation differs | + +The exhaustiveness check guarantees the compiler refuses to build until every `match` on `GraphMode` handles the new variant — no silent fallback. -### 10. Validation +### Validation -- `beta` in `GraphMode::BetaFilter { beta }` must be in $(0, 1]$ — `GraphMode::beta_filter()` is the only constructor and panics otherwise. $\beta > 1$ would penalize matches (opposite of intent); $\beta \leq 0$ is nonsensical. -- `DiskAccessor` remains `Send` — `Predicate`'s `Send + Sync` bound propagates. -- No allocation on the `SearchPlan::graph()` and `SearchPlan::flat()` paths — both produce variants carrying `None` predicates. -- The type system forbids beta on the `FlatScan` path (no `GraphMode` field) and beta without a predicate (`BetaFilter` carries it inline). +- `β ∈ (0, 1]` enforced by `GraphMode::beta_filter()`, the only constructor; returns `BetaError::OutOfRange` on invalid input (no panic). This is the validation point for `β` values read from JSON/CLI config — boundary code propagates the error to the user. +- Beta on `FlatScan` and beta without a predicate are compile-time errors by enum shape. +- No allocation on `SearchPlan::graph()` or `SearchPlan::flat()` — both produce variants carrying `None` predicates. -### 11. Testing +### Files modified -- **Existing tests pass unchanged in spirit**: old `(None, false)` → `SearchPlan::graph()`; `(None, true)` → `SearchPlan::flat()`; `(Some(p), false)` → `SearchPlan::graph_with(GraphMode::post_filter(p))`; `(Some(p), true)` → `SearchPlan::flat_filtered(p)`. -- **Case 5 regression (beta-biased graph)**: synthetic graph where the medoid's BFS frontier provably misses a target cluster unless beta biases it. Verify `BetaFilter` returns IDs that `PostFilter` (same predicate, no beta) misses. Stat-based "matching IDs appear" assertions are flaky — the fixture must make beam ordering deterministic. -- **Recall sweep**: case 4 vs. case 5 across several $\beta$ values on a filter-selective workload. Establishes the recall-vs-effort baseline. -- **Validation**: `GraphMode::beta_filter(p, β)` panics for $\beta \notin (0, 1]$. -- **No-filter zero-cost**: counter assertion that case 3 (`SearchPlan::graph()`) invokes no closure in `pq_distances` or `RerankAndFilter`. +| File | Scope | +|------|-------| +| `diskann-disk/src/search/filter_parameter.rs` (moved from `build/configuration/`) | New types replace `VectorFilter` | +| `diskann-disk/src/search/provider/disk_provider.rs` | `DiskSearchStrategy`, `DiskAccessor`, `pq_distances`, `RerankAndFilter`, `search_internal`, `search_strategy`, `flat_search`, `search()`, test call sites | +| `diskann-benchmark/src/backend/disk_index/search.rs` | Build `SearchPlan` at the boundary | +| `tools/search_disk_index.rs` (or equivalent) | Same closure-to-`SearchPlan` replacement | +| Benchmark input JSON schemas | Remove `is_flat_search` | -### 12. Non-Goals +### Non-Goals -- **`BetaFilter` composability.** Beta application stays in `DiskAccessor::pq_distances` rather than moving to the `QueryComputer` path. The disk path implements the beta algorithm independently from the in-memory `BetaFilter` strategy. -- **Adopting `QueryLabelProvider` at the user-facing API.** Closures are the primary representation; future graph algorithms that internally require `&dyn QueryLabelProvider` adapt at the boundary with a thin wrapper. -- **Beta on `flat_search`.** Brute-force enumeration doesn't benefit from traversal biasing; the type system forbids it (`FlatScan` has no `GraphMode` field). +- **`BetaFilter` composability.** Beta application stays in `DiskAccessor::pq_distances` rather than moving to the `QueryComputer` path. The disk path implements beta independently from the in-memory `BetaFilter` strategy. +- **Adopting `QueryLabelProvider` at the user-facing API.** Closures are the primary representation; future algorithms that internally require `&dyn QueryLabelProvider` adapt at the boundary with a thin local wrapper. +- **Beta on `flat_search`.** Brute-force enumeration doesn't benefit from traversal biasing; the type system forbids it. - **`VectorIdType` genericity.** `Predicate` pins `u32`. The disk path already constrains `Data::VectorIdType = u32` ([disk_provider.rs:548-557](../diskann-disk/src/search/provider/disk_provider.rs)); a future `u64` ID type would touch this API surface — accepted cost. -- **Additional `GraphMode` variants in this change.** The enum is built for extension (§9), but each new variant should land with its own consumer integration, tests, and (for traversal modifiers) recall data. +- **Additional `GraphMode` variants in this change.** The enum is built for extension, but each new variant should land with its own consumer integration, tests, and (for traversal modifiers) recall data. ## Trade-offs -### Closure (`Box bool>`) vs. `Arc>` trait - -**Chosen: closure.** Ergonomic at call sites (`|id| bm.contains(id)` vs. `Arc::new(BitmapProvider::new(bm))`). No coupling to a trait we don't own — upstream changes to `QueryLabelProvider` don't reach the disk path. No allocation on the no-filter path. The downside is that future algorithms requiring `&dyn QueryLabelProvider` need a thin adapter (~15 lines, scoped to the disk crate) — see the `Multihop` example above. - -**Alternative considered: `Arc + Send + Sync>`.** Would make `BetaFilter` *type-valid* immediately — same `Arc` shape the in-memory `BetaFilter::new()` already takes. Rejected because the underlying composability also requires moving beta application from `DiskAccessor::pq_distances` into the `QueryComputer` path, which is a separate (large) refactor. Type-valid composability is not behavioural composability, and the in-memory `QueryLabelProvider` will soon change anyway. Disk closures stay decoupled. - -### Hierarchical `SearchPlan { FlatScan, Graph(GraphMode) }` vs. flat `SearchPlan { FlatScanNoFilter, FlatScanFiltered, GraphNoFilter, GraphPostFilter, GraphBeta }` - -**Chosen: hierarchical.** Separates the graph-vs-linear-scan categorical break (different code paths) from the choice of graph algorithm/modifier. New graph algorithms (e.g. `Multihop`) slot into `GraphMode`, not into the top-level enum — they all traverse the graph and share dispatch. - -**Alternative considered: a flat enum.** One variant per case (5 today, more as algorithms are added). Slightly less nesting at call sites but no clear extension axis: a new graph algorithm has to pick a name pattern (`GraphMultihop`? `GraphMultihopFiltered`?). The hierarchy makes the "this is a graph algorithm" attribute explicit. - -### `Box` vs. `Arc` for `Predicate` - -**Chosen: `Box`.** Predicates are owned by `SearchPlan` for the duration of one search call; not shared across queries. `Box` is sufficient and avoids atomic refcount traffic. - -### `Fn(u32)` vs. `Fn(&u32)` - -**Chosen: `Fn(u32)`.** `u32` is 4 bytes and `Copy`; by-value is cheaper than an 8-byte reference, eliminates `*` derefs at every call site, and matches the codebase convention of `QueryLabelProvider::is_match(id: V)` ([diskann/src/graph/index.rs:82](../diskann/src/graph/index.rs)). - -### Project at strategy construction vs. defer projection - -**Chosen: project at strategy construction.** `search_strategy` is the single match site for `GraphMode`; the strategy carries `(predicate, beta)` as fields, and every downstream consumer reads those fields directly. Adding a `GraphMode` variant requires updating one `match`, not three. - -**Alternative considered: keep `&SearchPlan` in the strategy, project later.** Downstream consumers (`DiskAccessor`, `RerankAndFilter`, `flat_search` path) each re-match. More flexible but each new `GraphMode` variant touches more sites. - -### `assert!` in `GraphMode::beta_filter` vs. `Result` - -**Chosen: `assert!`.** Caller-side bug, not a runtime condition. Returning `Result` would force every caller to handle an error that, if it ever fires, indicates a programming mistake (passing literal `0.0` or `2.0` as `beta`). The constructor panicking on the calling thread is acceptable. +N/A ## Benchmark Results @@ -531,7 +258,7 @@ N/A ## Future Work -- [ ] **Align disk and in-memory beta-filter behavior.** After this change, `GraphMode::BetaFilter` on the disk path applies *both* beta-biased traversal *and* a hard post-filter (matching IDs survive, non-matching IDs are dropped in `RerankAndFilter`). The in-memory `BetaFilter` strategy ([diskann-providers/.../betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs)) only applies the beta bias — it does not post-filter, so non-matching IDs can still appear in results. A user asking for "beta filter" gets different result sets depending on which side they're on. Future work: pick one semantics (most likely "bias + post-filter," matching the disk side) and align the in-memory strategy, or document the divergence explicitly in both APIs. +- [ ] **Align disk and in-memory beta-filter behavior.** After this change, `GraphMode::BetaFilter` on the disk path applies *both* beta-biased traversal *and* a hard post-filter (non-matching IDs are dropped in `RerankAndFilter`). The in-memory `BetaFilter` strategy ([diskann-providers/.../betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs)) only applies the beta bias — non-matching IDs can still appear in results. A user asking for "beta filter" gets different result sets depending on which side they're on. Future work: pick one semantics (most likely "bias + post-filter," matching the disk side) and align the in-memory strategy, or document the divergence explicitly in both APIs. ## References From 136fb38c1d159e0d31abd110ea684a237d9cd409 Mon Sep 17 00:00:00 2001 From: yaohongdeng Date: Tue, 26 May 2026 21:28:15 +0800 Subject: [PATCH 6/7] RFC: address review feedback MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Introduce FilterMode<'a> sum type at the strategy layer: replaces the (Option<&Predicate>, Option) tuple with one enum (None | Filter | BetaFilter), making (None, Some(β)) unrepresentable and extending better for future variants (Multihop, LabelBeta, ...). - Fix the internal/external ID mismatch in RerankAndFilter::post_process: predicate is keyed on ExternalId per DataProvider contract; flat_search is correct; RerankAndFilter and the new pq_distances both add to_external_id conversion. Today's identity mapping hides the bug. - Add Non-Goal: newtyping DiskProvider::ExternalId is deferred. Co-Authored-By: Claude Opus 4.7 (1M context) --- rfcs/01101-disk-beta-filter.md | 111 +++++++++++++++++++++++++-------- 1 file changed, 85 insertions(+), 26 deletions(-) diff --git a/rfcs/01101-disk-beta-filter.md b/rfcs/01101-disk-beta-filter.md index 25f1bd4c4..2da495106 100644 --- a/rfcs/01101-disk-beta-filter.md +++ b/rfcs/01101-disk-beta-filter.md @@ -40,7 +40,6 @@ pub type VectorFilter<'a, Data> = ### Goals 1. Add beta-biased disk graph search as a first-class capability. -2. Provide a single extension point for future filter algorithms without changing the public `search()` signature. ## Proposal @@ -118,6 +117,8 @@ impl<'a> SearchPlan<'a> { The lifetime parameter `'a` carries the borrow scope of any captured data in the predicate. For closures that own (move in) their captures, callers don't need to write the lifetime — Rust infers `'static`. Callers that need to borrow from a stack frame for a single `search()` call get a shorter inferred `'a`, matching today's `VectorFilter<'a, Data>` flexibility. +The `u32` argument to a `Predicate` is the **external ID**. See §ID convention for the rationale and the call sites updated in this change. + ### Supported configurations | # | Case | `SearchPlan` value | @@ -132,7 +133,7 @@ The lifetime parameter `'a` carries the borrow scope of any captured data in the - **Hierarchical, not flat.** `SearchPlan { FlatScan, Graph(GraphMode) }` separates the graph-vs-linear-scan break (different code paths) from the choice of graph algorithm/modifier. Future graph algorithms slot into `GraphMode`, not the top-level enum. - **Invalid states unrepresentable.** `BetaFilter` carries the predicate inline; the `FlatScan` path has no `GraphMode` field. Beta-without-predicate and beta-on-flat-scan are compile-time errors, not runtime asserts. -- **Project at the boundary.** `GraphMode` exposes constructors only — no accessors. `search_strategy()` projects `(Option<&Predicate>, Option)` from the plan via one exhaustive match. Adding a `GraphMode` variant updates exactly one match site. +- **Project at the boundary into a sum type.** `GraphMode` exposes constructors only — no accessors. `search_strategy()` projects the plan into a `FilterMode<'a>` sum type via one exhaustive match (see below). Adding a `GraphMode` variant updates exactly one match site. The sum-type shape (rather than two independent `Option` fields) is what makes `(None, Some(β))` unrepresentable at the strategy layer — same pattern as the public `GraphMode`, extended one layer down. ### `search()` signature @@ -148,7 +149,26 @@ pub fn search(..., plan: SearchPlan<'_>) -> ... ### `search_strategy` projection — the single `GraphMode` match site -`search_strategy` is the **only** place that introspects `GraphMode`'s variants. One exhaustive match produces `(predicate, beta)`; the strategy carries them as fields. Every downstream consumer reads `strategy.predicate` and `strategy.beta` — no further variant matching anywhere. +`search_strategy` is the **only** place that introspects `GraphMode`'s variants. One exhaustive match produces a `FilterMode<'a>` value, which the strategy stores as a single field. Every downstream consumer matches on `strategy.filter` — no further `GraphMode` knowledge anywhere. + +The projection target is a disk-local sum type that captures the three filter shapes the strategy needs to dispatch on. Each variant carries exactly the fields it uses: + +```rust +// Disk-local; lives in filter_parameter.rs alongside SearchPlan/GraphMode. +pub(crate) enum FilterMode<'a> { + None, + Filter(&'a (dyn Fn(u32) -> bool + Send + Sync + 'a)), + BetaFilter { + predicate: &'a (dyn Fn(u32) -> bool + Send + Sync + 'a), + beta: f32, + }, + // Future graph algorithms with their own supplementary data slot in here: + // Multihop { predicate, depth: u32 }, + // LabelBeta { labels: &'a LabelMap, beta: f32 }, +} +``` + +Variants hold `&'a dyn Fn(...)` directly (not `&'a Box`), so calling the predicate is one indirection instead of two. The owning `Box` still lives in `SearchPlan`; the strategy projects via `predicate.as_ref()` in the match. ```rust fn search_strategy<'a>( @@ -156,22 +176,27 @@ fn search_strategy<'a>( query: &'a [Data::VectorDataType], plan: &'a SearchPlan<'a>, ) -> DiskSearchStrategy<'a, ...> { - let (predicate, beta) = match plan { - SearchPlan::FlatScan { filter } => (filter.as_ref(), None), - SearchPlan::Graph(GraphMode::Unfiltered) => (None, None), - SearchPlan::Graph(GraphMode::PostFilter(p)) => (Some(p), None), + let filter = match plan { + SearchPlan::FlatScan { filter: None } + | SearchPlan::Graph(GraphMode::Unfiltered) => FilterMode::None, + + SearchPlan::FlatScan { filter: Some(p) } + | SearchPlan::Graph(GraphMode::PostFilter(p)) => FilterMode::Filter(p.as_ref()), + SearchPlan::Graph(GraphMode::BetaFilter { predicate, beta }) => - (Some(predicate), Some(*beta)), + FilterMode::BetaFilter { predicate: predicate.as_ref(), beta: *beta }, }; - DiskSearchStrategy { predicate, beta, query, ... } + DiskSearchStrategy { filter, query, ... } } ``` -The compiler's exhaustiveness check forces every new `GraphMode` variant to add exactly one arm here. `DiskAccessor::new` and `RerankAndFilter::new` consume `strategy.predicate` and `strategy.beta` directly — no further `GraphMode` knowledge downstream. +The exhaustiveness check forces every new `GraphMode` variant to add exactly one arm. `DiskAccessor` and `RerankAndFilter` read `strategy.filter` and dispatch on the `FilterMode` variant — they never look at `GraphMode`. + +**Why a sum type instead of `(Option<&Predicate>, Option)` on the strategy?** Two independent `Option` fields would admit a structurally-invalid `(None, Some(β))` state, prevented today only by the exhaustive `match` here. A sum type makes it unrepresentable, and extends better than nesting `beta` inside a struct: each future variant (e.g. `Multihop { predicate, depth }`) carries exactly its own supplementary data, where a struct-with-Options shape would degrade the moment a modifier needs non-`f32` supplementary data or no predicate at all. ### `DiskSearchStrategy` struct change -The strategy stops carrying a single closure and starts carrying the two projected fields: +The strategy stops carrying a single closure and starts carrying the projected `FilterMode`: ```rust pub struct DiskSearchStrategy<'a, Data, ProviderFactory> @@ -185,8 +210,7 @@ where // === Changed === // Removed: vector_filter: &'a VectorFilter<'a, Data> // Added: projected from `plan` once in search_strategy() - predicate: Option<&'a Predicate<'a>>, - beta: Option, + filter: FilterMode<'a>, // === Unchanged === vertex_provider_factory: &'a ProviderFactory, @@ -194,6 +218,25 @@ where } ``` +Consumer sites then dispatch directly on the variant: + +```rust +// pq_distances: +if let FilterMode::BetaFilter { predicate, beta } = self.filter { + if predicate(id) { + distance *= beta; + } +} + +// RerankAndFilter — extracting the post-filter predicate uniformly: +match &self.filter { + FilterMode::None => None, + FilterMode::Filter(p) | FilterMode::BetaFilter { predicate: p, .. } => Some(*p), +} +``` + +Exhaustiveness forces every future `FilterMode` variant through both sites. + ### Caller migration | Today `(vector_filter, is_flat_search)` | New `plan` | @@ -206,6 +249,31 @@ where Boundary code in the benchmark and tools constructs `SearchPlan` once and passes it down. The `--is_flat_search` benchmark flag and the corresponding `is_flat_search` field in benchmark JSON schemas (`diskann-benchmark/example/*.json`, `diskann-benchmark/perf_test_inputs/*.json`) are removed in the same change — no external consumers depend on the old schema, so no migration grace period is needed. +### ID convention — fix the internal/external mismatch in `RerankAndFilter` + +Current state: + +- **Input predicate is keyed on `ExternalId`** by `DataProvider` contract. +- **`Index::flat_search`** is correct — calls `to_external_id` before invoking the predicate ([diskann/src/graph/index.rs](../diskann/src/graph/index.rs)). +- **`RerankAndFilter::post_process`** is incorrect — invokes the predicate with `Neighbor::id` (the **internal ID**) with no conversion ([disk_provider.rs:315](../diskann-disk/src/search/provider/disk_provider.rs)). Works today only because `DiskProvider::to_internal_id` and `to_external_id` are identity ([disk_provider.rs:93-124](../diskann-disk/src/search/provider/disk_provider.rs)) — `internal_id == external_id == u32`. A non-identity disk provider would silently produce different results between flat-scan and graph paths. + +This RFC fixes the mismatch: every site invokes the predicate with `ExternalId`. + +**Sites updated in the implementation PR:** + +- **`RerankAndFilter::post_process`** — call `to_external_id` on each candidate before invoking the predicate. +- **`pq_distances`** (new code path for `FilterMode::BetaFilter`) — call `to_external_id` on each id before invoking the predicate. + +**Sites already correct:** + +- **`Index::flat_search`** — already converts to external before invoking. + +**Cost** in the identity case is negligible: `to_external_id` returns `Ok(id)`; `#[inline]` lets LLVM fold it, leaving only a `Result` tag-check that the inliner constant-folds when it sees the `Ok(_)` body. In the PQ-distance hot loop this is branch-predicted free. + +**Caller impact:** none. Bitmaps that callers build from prior search results are already keyed on external IDs by the `DataProvider` contract — they were the *intended* input all along; this RFC just makes every invocation site honor it. + +**Caveat:** the "external" contract stays documented but not type-enforced — both ID kinds are bare `u32` today. Making it type-level (newtyping `ExternalId`) is deferred (see §Non-Goals). + ### Where beta is/isn't applied | Location | Beta? | Why | @@ -227,8 +295,8 @@ The exhaustiveness check guarantees the compiler refuses to build until every `m ### Validation - `β ∈ (0, 1]` enforced by `GraphMode::beta_filter()`, the only constructor; returns `BetaError::OutOfRange` on invalid input (no panic). This is the validation point for `β` values read from JSON/CLI config — boundary code propagates the error to the user. -- Beta on `FlatScan` and beta without a predicate are compile-time errors by enum shape. -- No allocation on `SearchPlan::graph()` or `SearchPlan::flat()` — both produce variants carrying `None` predicates. +- Beta on `FlatScan` and beta without a predicate are compile-time errors at both layers: unrepresentable in the public `GraphMode` enum, and unrepresentable in the internal `FilterMode` sum type. +- No allocation on `SearchPlan::graph()` or `SearchPlan::flat()` — both project to `FilterMode::None`. ### Files modified @@ -247,19 +315,10 @@ The exhaustiveness check guarantees the compiler refuses to build until every `m - **Beta on `flat_search`.** Brute-force enumeration doesn't benefit from traversal biasing; the type system forbids it. - **`VectorIdType` genericity.** `Predicate` pins `u32`. The disk path already constrains `Data::VectorIdType = u32` ([disk_provider.rs:548-557](../diskann-disk/src/search/provider/disk_provider.rs)); a future `u64` ID type would touch this API surface — accepted cost. - **Additional `GraphMode` variants in this change.** The enum is built for extension, but each new variant should land with its own consumer integration, tests, and (for traversal modifiers) recall data. +- **Newtyping `DiskProvider::ExternalId`.** The "predicate sees external IDs" contract (§Types) is documented but not type-enforced — both ID kinds are bare `u32` today. Making the distinction type-level (e.g. `struct ExternalId(u32);`) would catch accidental cross-space passing at compile time but touches the broader `DataProvider` surface beyond this RFC's scope. Deferred. -## Trade-offs - -N/A - -## Benchmark Results - -N/A ## Future Work -- [ ] **Align disk and in-memory beta-filter behavior.** After this change, `GraphMode::BetaFilter` on the disk path applies *both* beta-biased traversal *and* a hard post-filter (non-matching IDs are dropped in `RerankAndFilter`). The in-memory `BetaFilter` strategy ([diskann-providers/.../betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs)) only applies the beta bias — non-matching IDs can still appear in results. A user asking for "beta filter" gets different result sets depending on which side they're on. Future work: pick one semantics (most likely "bias + post-filter," matching the disk side) and align the in-memory strategy, or document the divergence explicitly in both APIs. - -## References +- [ ] **Align disk and in-memory beta-filter behavior.** After this change, `GraphMode::BetaFilter` on the disk path applies *both* beta-biased traversal *and* a hard post-filter (non-matching IDs are dropped in `RerankAndFilter`). The in-memory `BetaFilter` strategy ([diskann-providers/.../betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs)) only applies the beta bias — non-matching IDs can still appear in results. A user asking for "beta filter" gets different result sets depending on which side they're on. Future work: pick one semantics (most likely "bias + post-filter," matching the disk side) and align the in-memory strategy, or document the divergence explicitly in both APIs. **Note**: keying the disk `Predicate` on `ExternalId` (§Types) already closes one corner of this divergence — both sides now address the caller's ID space. -N/A From 9379b39d72b5da5546409d6b4de7bf44f0e28560 Mon Sep 17 00:00:00 2001 From: yaohongdeng Date: Wed, 27 May 2026 18:56:19 +0800 Subject: [PATCH 7/7] RFC: address review feedback MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Switch the documented ID convention back to internal IDs, matching the implementation: all three invocation sites (flat_search, pq_distances, RerankAndFilter) pass internal IDs uniformly; no to_external_id conversion at the predicate boundary. Identity mapping (InternalId == ExternalId == u32) makes this transparent to callers either way. - Remove the §ID convention section, the Non-Goal about newtyping ExternalId, and the related note in the Future Work item -- all obsolete after the contract flip. - Trim the "Project at the boundary" bullet in §Key design decisions: it dove into FilterMode internals before the reader had context, and the rationale already lives in §search_strategy projection. Replace with a shorter bullet phrased as "variant matching is delegated to search_strategy()" -- avoids the "accessor" terminology clash with the repo's existing Accessor trait. Co-Authored-By: Claude Opus 4.7 (1M context) --- rfcs/01101-disk-beta-filter.md | 33 ++++----------------------------- 1 file changed, 4 insertions(+), 29 deletions(-) diff --git a/rfcs/01101-disk-beta-filter.md b/rfcs/01101-disk-beta-filter.md index 2da495106..3c9e403bf 100644 --- a/rfcs/01101-disk-beta-filter.md +++ b/rfcs/01101-disk-beta-filter.md @@ -40,6 +40,7 @@ pub type VectorFilter<'a, Data> = ### Goals 1. Add beta-biased disk graph search as a first-class capability. +2. Provide a single extension point for future filter algorithms without changing the public `search()` signature ## Proposal @@ -117,7 +118,7 @@ impl<'a> SearchPlan<'a> { The lifetime parameter `'a` carries the borrow scope of any captured data in the predicate. For closures that own (move in) their captures, callers don't need to write the lifetime — Rust infers `'static`. Callers that need to borrow from a stack frame for a single `search()` call get a shorter inferred `'a`, matching today's `VectorFilter<'a, Data>` flexibility. -The `u32` argument to a `Predicate` is the **external ID**. See §ID convention for the rationale and the call sites updated in this change. +The `u32` argument to a `Predicate` is the **internal ID** (`DiskProvider::InternalId`). All three invocation sites (`flat_search`, `pq_distances`, `RerankAndFilter::post_process`) pass internal IDs uniformly — no `to_external_id` conversion at the predicate boundary. ### Supported configurations @@ -133,7 +134,7 @@ The `u32` argument to a `Predicate` is the **external ID**. See §ID convention - **Hierarchical, not flat.** `SearchPlan { FlatScan, Graph(GraphMode) }` separates the graph-vs-linear-scan break (different code paths) from the choice of graph algorithm/modifier. Future graph algorithms slot into `GraphMode`, not the top-level enum. - **Invalid states unrepresentable.** `BetaFilter` carries the predicate inline; the `FlatScan` path has no `GraphMode` field. Beta-without-predicate and beta-on-flat-scan are compile-time errors, not runtime asserts. -- **Project at the boundary into a sum type.** `GraphMode` exposes constructors only — no accessors. `search_strategy()` projects the plan into a `FilterMode<'a>` sum type via one exhaustive match (see below). Adding a `GraphMode` variant updates exactly one match site. The sum-type shape (rather than two independent `Option` fields) is what makes `(None, Some(β))` unrepresentable at the strategy layer — same pattern as the public `GraphMode`, extended one layer down. +- **All `GraphMode` variant matching is delegated to `search_strategy()`.** That's the only site that introspects the enum (see §`search_strategy` projection), so adding a new variant updates exactly one match arm. ### `search()` signature @@ -249,31 +250,6 @@ Exhaustiveness forces every future `FilterMode` variant through both sites. Boundary code in the benchmark and tools constructs `SearchPlan` once and passes it down. The `--is_flat_search` benchmark flag and the corresponding `is_flat_search` field in benchmark JSON schemas (`diskann-benchmark/example/*.json`, `diskann-benchmark/perf_test_inputs/*.json`) are removed in the same change — no external consumers depend on the old schema, so no migration grace period is needed. -### ID convention — fix the internal/external mismatch in `RerankAndFilter` - -Current state: - -- **Input predicate is keyed on `ExternalId`** by `DataProvider` contract. -- **`Index::flat_search`** is correct — calls `to_external_id` before invoking the predicate ([diskann/src/graph/index.rs](../diskann/src/graph/index.rs)). -- **`RerankAndFilter::post_process`** is incorrect — invokes the predicate with `Neighbor::id` (the **internal ID**) with no conversion ([disk_provider.rs:315](../diskann-disk/src/search/provider/disk_provider.rs)). Works today only because `DiskProvider::to_internal_id` and `to_external_id` are identity ([disk_provider.rs:93-124](../diskann-disk/src/search/provider/disk_provider.rs)) — `internal_id == external_id == u32`. A non-identity disk provider would silently produce different results between flat-scan and graph paths. - -This RFC fixes the mismatch: every site invokes the predicate with `ExternalId`. - -**Sites updated in the implementation PR:** - -- **`RerankAndFilter::post_process`** — call `to_external_id` on each candidate before invoking the predicate. -- **`pq_distances`** (new code path for `FilterMode::BetaFilter`) — call `to_external_id` on each id before invoking the predicate. - -**Sites already correct:** - -- **`Index::flat_search`** — already converts to external before invoking. - -**Cost** in the identity case is negligible: `to_external_id` returns `Ok(id)`; `#[inline]` lets LLVM fold it, leaving only a `Result` tag-check that the inliner constant-folds when it sees the `Ok(_)` body. In the PQ-distance hot loop this is branch-predicted free. - -**Caller impact:** none. Bitmaps that callers build from prior search results are already keyed on external IDs by the `DataProvider` contract — they were the *intended* input all along; this RFC just makes every invocation site honor it. - -**Caveat:** the "external" contract stays documented but not type-enforced — both ID kinds are bare `u32` today. Making it type-level (newtyping `ExternalId`) is deferred (see §Non-Goals). - ### Where beta is/isn't applied | Location | Beta? | Why | @@ -315,10 +291,9 @@ The exhaustiveness check guarantees the compiler refuses to build until every `m - **Beta on `flat_search`.** Brute-force enumeration doesn't benefit from traversal biasing; the type system forbids it. - **`VectorIdType` genericity.** `Predicate` pins `u32`. The disk path already constrains `Data::VectorIdType = u32` ([disk_provider.rs:548-557](../diskann-disk/src/search/provider/disk_provider.rs)); a future `u64` ID type would touch this API surface — accepted cost. - **Additional `GraphMode` variants in this change.** The enum is built for extension, but each new variant should land with its own consumer integration, tests, and (for traversal modifiers) recall data. -- **Newtyping `DiskProvider::ExternalId`.** The "predicate sees external IDs" contract (§Types) is documented but not type-enforced — both ID kinds are bare `u32` today. Making the distinction type-level (e.g. `struct ExternalId(u32);`) would catch accidental cross-space passing at compile time but touches the broader `DataProvider` surface beyond this RFC's scope. Deferred. ## Future Work -- [ ] **Align disk and in-memory beta-filter behavior.** After this change, `GraphMode::BetaFilter` on the disk path applies *both* beta-biased traversal *and* a hard post-filter (non-matching IDs are dropped in `RerankAndFilter`). The in-memory `BetaFilter` strategy ([diskann-providers/.../betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs)) only applies the beta bias — non-matching IDs can still appear in results. A user asking for "beta filter" gets different result sets depending on which side they're on. Future work: pick one semantics (most likely "bias + post-filter," matching the disk side) and align the in-memory strategy, or document the divergence explicitly in both APIs. **Note**: keying the disk `Predicate` on `ExternalId` (§Types) already closes one corner of this divergence — both sides now address the caller's ID space. +- [ ] **Align disk and in-memory beta-filter behavior.** After this change, `GraphMode::BetaFilter` on the disk path applies *both* beta-biased traversal *and* a hard post-filter (non-matching IDs are dropped in `RerankAndFilter`). The in-memory `BetaFilter` strategy ([diskann-providers/.../betafilter.rs](../diskann-providers/src/model/graph/provider/layers/betafilter.rs)) only applies the beta bias — non-matching IDs can still appear in results. A user asking for "beta filter" gets different result sets depending on which side they're on. Future work: pick one semantics (most likely "bias + post-filter," matching the disk side) and align the in-memory strategy, or document the divergence explicitly in both APIs.