Skip to content

Compile native Rust benchmarks during check#22336

Open
krugerLucas wants to merge 1 commit into
opensearch-project:mainfrom
krugerLucas:fix/compile-native-rust-benchmarks
Open

Compile native Rust benchmarks during check#22336
krugerLucas wants to merge 1 commit into
opensearch-project:mainfrom
krugerLucas:fix/compile-native-rust-benchmarks

Conversation

@krugerLucas

Copy link
Copy Markdown
Contributor

Description

Adds Gradle verification coverage for native Rust benchmark targets so check compiles them without running performance benchmarks.

The native bridge build already compiles opensearch-native-lib, but cargo build -p opensearch-native-lib does not compile benchmark targets. That allowed files under benches/ to drift when Rust APIs changed without being caught by CI.

This change:

  • Adds cargoBenchCheckDataFusion and cargoBenchCheckParquet in sandbox/libs:dataformat-native.
  • Runs cargo bench -p ... --no-run for opensearch-datafusion and opensearch-parquet-format.
  • Wires both tasks into check, keeping Gradle as the single build/check orchestration layer used locally and by CI.
  • Uses one task per Rust crate with benchmarks, because Cargo discovers benchmark targets at the package level and these are the only workspace crates with benchmark targets today.
  • Declares Rust source, benchmark, manifest, and lockfile inputs plus per-build-type stamp outputs, preserving Gradle incremental behavior and avoiding debug/release stamp reuse.
  • Uses Cargo's dev profile when -PrustDebug is set, matching the sandbox workflow's debug Rust mode while keeping this verification step focused on compilation.
  • Updates existing DataFusion benchmark sources to match current internal APIs while preserving their benchmark scenarios.

The benchmark source updates intentionally use neutral/default values for newly required API parameters:

  • DedicatedExecutor::new(..., 4) matches the benchmark's existing 4 worker thread executor setup.
  • New execute_query optional parameters are passed as None, empty slices, and InternalSearch::Off, so the existing query benchmark path remains unchanged.
  • DataFusionRuntime::new_for_bench(...) uses the benchmark-specific runtime constructor instead of manually constructing runtime internals.
  • New index/table fields are set to empty/default values, so sort metadata, prune config, and cancellation behavior are not enabled by this change.

Validation:

git diff --check
rustfmt --edition 2021 --check \
  sandbox/plugins/analytics-backend-datafusion/rust/benches/cross_rt_throughput_bench.rs \
  sandbox/plugins/analytics-backend-datafusion/rust/benches/query_bench.rs \
  sandbox/plugins/analytics-backend-datafusion/rust/benches/row_id_bench.rs
./gradlew :sandbox:libs:dataformat-native:check --dry-run \
  -Dsandbox.enabled=true \
  -PrustDebug
./gradlew \
  :sandbox:libs:dataformat-native:cargoBenchCheckParquet \
  :sandbox:libs:dataformat-native:cargoBenchCheckDataFusion \
  -Dsandbox.enabled=true \
  -PrustDebug \
  --rerun-tasks

Related Issues

Resolves #21538

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable. Not applicable; this does not change public APIs.
  • Public documentation issue/PR created, if applicable. Not applicable; this is a build validation change.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Lucas Kruger <88447897+krugerLucas@users.noreply.github.com>
@krugerLucas krugerLucas requested a review from a team as a code owner June 27, 2026 21:35
@github-actions github-actions Bot added bug Something isn't working Build Build Tasks/Gradle Plugin, groovy scripts, build tools, Javadoc enforcement. labels Jun 27, 2026
@github-actions

Copy link
Copy Markdown
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 No relevant tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ No major issues detected

@github-actions

Copy link
Copy Markdown
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Avoid leaking RuntimeManager via mem::forget

Using std::mem::forget(mgr) after shutdown() leaks the RuntimeManager and any
resources (tempdir, runtimes) it owns, which can cause incorrect benchmark teardown
and missed file cleanup of the tempfile::TempDir returned from setup(). Drop mgr
normally (and the tmpdir) so cleanup runs after benchmarking.

sandbox/plugins/analytics-backend-datafusion/rust/benches/query_bench.rs [140-141]

 mgr.cpu_executor.shutdown();
-std::mem::forget(mgr);
Suggestion importance[1-10]: 3

__

Why: The std::mem::forget(mgr) call existed in the original code and is not introduced by this PR, so it's outside the PR's scope. The concern about resource cleanup may be valid but requires more context on why forget was originally used.

Low
Remove unused parameter instead of underscoring

Renaming the df parameter to _df suppresses the unused-variable warning but
preserves a likely-unused argument across all call sites. If df is truly unneeded,
remove it from the signature and all callers to avoid dead parameters; if it is
needed, restore its usage rather than masking it.

sandbox/plugins/analytics-backend-datafusion/rust/benches/query_bench.rs [55]

-fn get_substrait(mgr: &RuntimeManager, _df: &DataFusionRuntime, dir: &str, sql: &str) -> Vec<u8> {
+fn get_substrait(mgr: &RuntimeManager, dir: &str, sql: &str) -> Vec<u8> {
Suggestion importance[1-10]: 2

__

Why: Minor stylistic suggestion; underscoring the parameter is a common Rust idiom and removing it would require changes to all call sites, providing marginal benefit.

Low

@github-actions

Copy link
Copy Markdown
Contributor

✅ Gradle check result for 4464bca: SUCCESS

@codecov

codecov Bot commented Jun 27, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.38%. Comparing base (c153e67) to head (4464bca).

Additional details and impacted files
@@            Coverage Diff            @@
##               main   #22336   +/-   ##
=========================================
  Coverage     73.37%   73.38%           
+ Complexity    76063    76060    -3     
=========================================
  Files          6076     6076           
  Lines        345517   345517           
  Branches      49733    49733           
=========================================
+ Hits         253528   253561   +33     
+ Misses        71792    71701   -91     
- Partials      20197    20255   +58     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@krugerLucas

Copy link
Copy Markdown
Contributor Author

@reta Could you please rerun the failed sandbox-check when you have a chance? gradle-check passed and the new benchmark compile tasks ran; the failure is AnalyticsQueryTaskCleanupIT.testQtfFragmentCancelTearsDownAndCleansUp, which looks unrelated to this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working Build Build Tasks/Gradle Plugin, groovy scripts, build tools, Javadoc enforcement.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] benchmarks not getting compiled as part of native builds

1 participant