Add relevance search function support for unified SQL query by dai-chen · Pull Request #5279 · opensearch-project/sql

dai-chen · 2026-03-27T21:33:37Z

Description

Register relevance search functions (match, match_phrase, multi_match, etc.) as language-level functions in the unified query engine, with backward-compatible named-argument syntax support (for now).

Function Support Matrix

Function	SQL V2 (Legacy)	PPL V3 (Calcite)	SQL (Calcite) — this PR
match	✅	✅	✅
match_phrase	✅	✅	✅
match_bool_prefix	✅	✅	✅
match_phrase_prefix	✅	✅	✅
multi_match	✅	✅	⚠️ single-field only
simple_query_string	✅	✅	⚠️ single-field only
query_string	✅	✅	⚠️ single-field only
query (no-field)	✅	❌	❌
wildcard_query	✅	❌	❌
score / scorequery	✅	❌	❌
match_query (alias)	✅	❌	❌
matchphrase (alias)	✅	❌	❌
matchquery (alias)	✅	❌	❌
multimatch (alias)	✅	❌	❌
matchphrasequery (alias)	✅	❌	❌
multimatchquery (alias)	✅	❌	❌

Design Choices

Language-level functions: Relevance functions are part of the language specification, not tied to any specific data source. They are defined declaratively in UnifiedFunctionSpec and registered globally, making them always resolvable in any query context. No local execution is provided; data-source capability is enforced at optimization time by pushdown rules.
Named argument rewriting: V2 SQL uses non-standard key=value syntax for optional parameters (e.g., match(name, 'John', operator='AND')) instead of SQL's standard => syntax. NamedArgRewriter runs as a pre-validation pass that normalizes both positional and key=value arguments into MAP-based form (e.g., MAP('field', name), MAP('operator', 'AND')), producing identical plans as the PPL path for shared pushdown rules.

Please find more design details in #5248 (comment).

Next Step

Review all breaking changes between V2 syntax and SQL standard together and decide whether to follow SQL standard or maintain backward compatibility via the current rewriting approach for this. Ref: [FEATURE] SQL query support for Analytics engine integration #5248 (comment)
Register V2 function aliases (matchquery, matchphrase, multimatch, wildcardquery, matchphrasequery, multimatchquery) that are not yet covered.
Multi-field bracket syntax (multi_match(['field1', 'field2'])) is not supported — only single column references are accepted. Documented with FIXME tests. Options: (1) extend the parser to support V2's bracket syntax; (2) switch to ARRAY[...] and accept the breaking change from V2.

Related Issues

Part of #5248

Check List

New functionality includes testing.
New functionality has been documented.
New functionality has javadoc added.
New functionality has a user manual doc added.
New PPL command checklist all confirmed.
API changes companion pull request created.
Commits are signed per the DCO using --signoff or -s.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions · 2026-03-27T21:34:37Z

Failed to generate code suggestions for PR

github-actions · 2026-03-28T01:28:33Z

Failed to generate code suggestions for PR

github-actions · 2026-04-03T19:24:28Z

PR Reviewer Guide 🔍

(Review updated until commit `ac0fe64`)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
📝 TODO sections api/src/test/java/org/opensearch/sql/api/UnifiedRelevanceSearchSqlTest.java [145]: Calcite SQL parser does not support V2 bracket field list syntax; multi-field relevance functions only accept single column reference.
🔀 Multiple PR themes Sub-PR theme: Register relevance functions in Calcite operator table via UnifiedFunctionSpec Relevant files: api/src/main/java/org/opensearch/sql/api/spec/UnifiedFunctionSpec.java api/src/main/java/org/opensearch/sql/api/UnifiedQueryContext.java Sub-PR theme: Add NamedArgRewriter pre-validation pass for backward-compatible named-argument syntax Relevant files: api/src/main/java/org/opensearch/sql/api/parser/NamedArgRewriter.java api/src/main/java/org/opensearch/sql/api/UnifiedQueryPlanner.java api/src/test/java/org/opensearch/sql/api/parser/NamedArgRewriterTest.java Sub-PR theme: Add integration tests for relevance search functions in SQL and PPL planning paths Relevant files: api/src/test/java/org/opensearch/sql/api/UnifiedRelevanceSearchSqlTest.java api/src/test/java/org/opensearch/sql/api/UnifiedRelevanceSearchTest.java api/src/testFixtures/java/org/opensearch/sql/api/UnifiedQueryTestBase.java
⚡ Recommended focus areas for review Silent Rewrite Risk In `rewriteToMaps`, when a named argument uses an `EQUALS` node but the left-hand side is not a `SqlIdentifier`, the code falls back to `key.toString()`, which may produce unexpected or backtick-decorated key strings. The comment says this avoids backtick-decorated keys for reserved words, but `toString()` on non-identifier nodes could produce arbitrary SQL text as a map key, silently producing incorrect query plans. if (op instanceof SqlCall eq && op.getKind() == SqlKind.EQUALS) { SqlNode key = eq.operand(0); String name = key instanceof SqlIdentifier ident ? ident.getSimple() : key.toString(); // avoid backtick-decorated keys for reserved words maps[i] = toMap(name, eq.operand(1)); Bypassed Type Checking `VariadicOperandMetadata.checkOperandTypes` always returns `true`, completely bypassing Calcite's operand type validation. While this is intentional (with a CALCITE-5366 reference), it means invalid argument types (e.g., passing a numeric literal where a field reference is expected) will silently pass validation and only fail at pushdown time, potentially with confusing error messages. public boolean checkOperandTypes(SqlCallBinding binding, boolean throwOnFailure) { return true; // Bypass: CALCITE-5366 breaks optional argument type checking } Case Sensitivity `ALL_SPECS` lookup in `of()` uses `name.toLowerCase()`, but function names are stored using the original `funcName` (e.g., "match", "multi_match"). If Calcite passes an uppercase or mixed-case function name (depending on parser casing config), the lookup may fail. This should be verified against the `UNCHANGED` casing config used in `UnifiedQueryContext`. public static Optional<UnifiedFunctionSpec> of(String name) { return Optional.ofNullable(ALL_SPECS.get(name.toLowerCase())); } Hardcoded Category Only `UnifiedFunctionSpec.RELEVANCE` is chained into the operator table. If new categories are added to `UnifiedFunctionSpec` in the future, they must be manually added here. Consider iterating over all categories or exposing a combined operator table from `UnifiedFunctionSpec` directly. .operatorTable( SqlOperatorTables.chain( SqlStdOperatorTable.instance(), UnifiedFunctionSpec.RELEVANCE.operatorTable()))

github-actions · 2026-04-03T19:25:03Z

PR Code Suggestions ✨

Latest suggestions up to ac0fe64

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Prevent swallowing assertion errors in error helper The `catch (Exception e)` block will also catch the `AssertionError` thrown on the line above if `AssertionError` extends `Exception` — but in Java, `AssertionError` extends `Error`, not `Exception`, so this is safe. However, if `planner.plan()` throws an `Error` (not `Exception`), it will propagate uncaught and produce a confusing test failure. Consider catching `Throwable` or at minimum documenting this assumption. api/src/main/java/org/opensearch/sql/api/UnifiedQueryTestBase.java [153-160] protected QueryErrorAssert givenInvalidQuery(String query) { try { planner.plan(query); throw new AssertionError("Expected query to fail: " + query); + } catch (AssertionError ae) { + throw ae; // re-throw assertion failures from the line above } catch (Exception e) { return new QueryErrorAssert(e); } } Suggestion importance[1-10]: 4 __ Why: The suggestion correctly notes that `AssertionError` extends `Error` not `Exception`, so the current code is already safe. However, explicitly re-throwing `AssertionError` improves clarity and defensive correctness, making it a minor but valid improvement to test infrastructure.	Low
General	Explicitly allow unbounded optional arguments `SqlOperandCountRanges.from(n)` sets the minimum operand count to `n` but allows any number above it. However, the `paramNames` list only contains the required positional parameters (e.g., `["field", "query"]`), so the minimum is correctly set to 2. But since optional named arguments are also allowed, the upper bound should be explicitly set to unbounded (`-1`) to make the intent clear and avoid potential validation issues with Calcite's operand count checker. api/src/main/java/org/opensearch/sql/api/spec/UnifiedFunctionSpec.java [162-164] @Override public SqlOperandCountRange getOperandCountRange() { - return SqlOperandCountRanges.from(paramNames.size()); + return SqlOperandCountRanges.between(paramNames.size(), -1); } Suggestion importance[1-10]: 3 __ Why: `SqlOperandCountRanges.from(n)` already allows any number of operands above `n`, so the behavior is functionally equivalent. The suggestion to use `between(n, -1)` makes the intent more explicit but doesn't change runtime behavior, making it a minor readability improvement.	Low
General	Fix misleading comment and key extraction logic The comment says "avoid backtick-decorated keys for reserved words" but the logic is inverted: `getSimple()` is used for `SqlIdentifier` (which is the normal case), while `toString()` is used for non-identifier keys (which may produce backtick-decorated output for reserved words). The fallback `key.toString()` should be replaced with a safer extraction, or the comment should clarify that `toString()` is intentionally used for non-identifier key types only. Additionally, the `key instanceof` `SqlIdentifier ident` branch already handles the reserved-word case via `getSimple()`, so the comment is misleading. api/src/main/java/org/opensearch/sql/api/parser/NamedArgRewriter.java [53-66] if (op instanceof SqlCall eq && op.getKind() == SqlKind.EQUALS) { SqlNode key = eq.operand(0); String name = key instanceof SqlIdentifier ident - ? ident.getSimple() - : key.toString(); // avoid backtick-decorated keys for reserved words + ? ident.getSimple() // use getSimple() to avoid backtick-decorated keys for reserved words + : key.toSqlString(null).getSql(); // fallback for non-identifier keys maps[i] = toMap(name, eq.operand(1)); } else { if (i >= paramNames.size()) { throw new IllegalArgumentException( String.format("Invalid arguments for function '%s'", call.getOperator().getName())); } maps[i] = toMap(paramNames.get(i), op); } Suggestion importance[1-10]: 2 __ Why: The comment placement is slightly misleading, but the actual logic is correct — `getSimple()` is used for `SqlIdentifier` which handles reserved words properly. The suggested `toSqlString(null)` replacement could introduce null-related issues and the improvement is marginal at best.	Low

Previous suggestions

Suggestions up to commit 6538c56

Category	Suggestion	Impact
Possible issue	Explicitly set unbounded maximum operand count for variadic functions `SqlOperandCountRanges.from(n)` sets the minimum operand count to `n` but allows any number above it. However, the `paramNames` list only contains the required positional parameters (e.g., `["field", "query"]` with size 2), so the minimum is correctly enforced. But since optional named arguments are also accepted, the maximum should be unbounded (`-1`). Using `from(n)` already achieves this, but it's worth verifying that `from` indeed sets max to `-1` (unbounded) in the Calcite version used — if it sets max equal to min, extra named args would be rejected at the operand count check before reaching `checkOperandTypes`. api/src/main/java/org/opensearch/sql/api/spec/UnifiedFunctionSpec.java [162-164] @Override public SqlOperandCountRange getOperandCountRange() { - return SqlOperandCountRanges.from(paramNames.size()); + // min = required positional params; max = unbounded to allow optional named args + return SqlOperandCountRanges.between(paramNames.size(), -1); } Suggestion importance[1-10]: 5 __ Why: `SqlOperandCountRanges.from(n)` in Calcite sets min=n and max=-1 (unbounded), so the current code is likely correct. However, using `between(paramNames.size(), -1)` makes the intent explicit and guards against any version-specific behavior differences. This is a valid defensive improvement with moderate impact.	Low
General	Move assertion outside catch to avoid accidental swallowing The `catch (Exception e)` block will also catch the `AssertionError` thrown on the line above if `AssertionError` extends `Exception` — but in Java, `AssertionError` extends `Error`, not `Exception`, so this is actually safe. However, if `planner.plan(query)` throws an `Error` (not `Exception`), it would propagate uncaught and produce a confusing test failure. This is a minor robustness concern, but the current logic is correct for the intended use case. api/src/test/java/org/opensearch/sql/api/UnifiedQueryTestBase.java [153-160] protected QueryErrorAssert givenInvalidQuery(String query) { try { planner.plan(query); - throw new AssertionError("Expected query to fail: " + query); } catch (Exception e) { return new QueryErrorAssert(e); } + throw new AssertionError("Expected query to fail: " + query); } Suggestion importance[1-10]: 4 __ Why: The improved code is a cleaner and more idiomatic pattern that avoids any risk of the `AssertionError` being caught. While `AssertionError` extends `Error` (not `Exception`) so the current code is technically safe, the refactored version is clearer and follows better Java testing practices.	Low
	Cache the operator table to avoid repeated allocation The operator table is rebuilt on every call to `buildFrameworkConfig()` by calling `UnifiedFunctionSpec.RELEVANCE.operatorTable()`, which creates a new `SqlOperatorTables.of(...)` instance each time. Since `RELEVANCE` is a static constant, its operator table should also be pre-computed and cached to avoid unnecessary object allocation on each framework config build. api/src/main/java/org/opensearch/sql/api/UnifiedQueryContext.java [249-251] .operatorTable( SqlOperatorTables.chain( - SqlStdOperatorTable.instance(), UnifiedFunctionSpec.RELEVANCE.operatorTable())) + SqlStdOperatorTable.instance(), UnifiedFunctionSpec.RELEVANCE_OPERATOR_TABLE)) Suggestion importance[1-10]: 3 __ Why: The suggestion to cache `RELEVANCE.operatorTable()` is a valid minor optimization, but it requires adding a new static field `RELEVANCE_OPERATOR_TABLE` to `UnifiedFunctionSpec`, which is a non-trivial API change. The performance impact is minimal since `buildFrameworkConfig()` is not called in a hot path.	Low
	Fix misleading comment and fallback key extraction logic The comment says "avoid backtick-decorated keys for reserved words" but the logic is inverted: `getSimple()` is used for `SqlIdentifier` (which is the normal case), while `toString()` is used for non-identifier keys (which would be the reserved-word case). For reserved words, the key is still parsed as a `SqlIdentifier` (possibly quoted), so `getSimple()` is correct for identifiers. However, the fallback `key.toString()` for non-identifier keys may produce unexpected output. The comment is misleading and the logic should be clarified or corrected to ensure reserved words are handled properly. api/src/main/java/org/opensearch/sql/api/parser/NamedArgRewriter.java [53-66] if (op instanceof SqlCall eq && op.getKind() == SqlKind.EQUALS) { SqlNode key = eq.operand(0); + // Use getSimple() for identifiers (handles both plain and quoted/reserved-word keys) String name = key instanceof SqlIdentifier ident ? ident.getSimple() - : key.toString(); // avoid backtick-decorated keys for reserved words + : key.toSqlString(null).getSql(); maps[i] = toMap(name, eq.operand(1)); } else { if (i >= paramNames.size()) { throw new IllegalArgumentException( String.format("Invalid arguments for function '%s'", call.getOperator().getName())); } maps[i] = toMap(paramNames.get(i), op); } Suggestion importance[1-10]: 2 __ Why: The comment about "backtick-decorated keys for reserved words" is indeed misleading, but the actual logic is correct — `getSimple()` on `SqlIdentifier` handles quoted/reserved-word identifiers properly. The suggested `toSqlString(null)` fallback is not clearly better and may introduce other issues. This is a minor clarity concern with low impact.	Low

Suggestions up to commit 938d603

Category	Suggestion	Impact
Possible issue	Fix named-arg key extraction for quoted identifiers Using `toString()` on a `SqlNode` operand (the key in a named argument like `operator='AND'`) may produce a quoted or fully-qualified identifier string rather than the plain parameter name. For example, a quoted identifier `"operator"` would produce `"operator"` instead of `operator`. Use `SqlIdentifier`'s `getSimple()` or strip quotes explicitly to get the raw name. api/src/main/java/org/opensearch/sql/api/parser/NamedArgRewriter.java [52-56] if (op instanceof SqlCall eq && op.getKind() == SqlKind.EQUALS) { - maps[i] = toMap(eq.operand(0).toString(), eq.operand(1)); + SqlNode keyNode = eq.operand(0); + String key = keyNode instanceof SqlIdentifier id ? id.getSimple() : keyNode.toString(); + maps[i] = toMap(key, eq.operand(1)); } else { maps[i] = toMap(paramNames.get(i), op); } Suggestion importance[1-10]: 7 __ Why: Using `toString()` on a `SqlNode` for the key in named arguments like `operator='AND'` could produce quoted strings (e.g., `"operator"`) instead of the plain name `operator`. Using `SqlIdentifier.getSimple()` would be more robust and correct for identifier nodes.	Medium
Possible issue	Guard against null exception messages in assertions `cause.getMessage()` can return `null` for some exception types, which would cause a `NullPointerException` in the `contains` call and obscure the actual test failure. Add a null check or use `String.valueOf(cause.getMessage())` to safely handle null messages. api/src/testFixtures/java/org/opensearch/sql/api/UnifiedQueryTestBase.java [171-180] public QueryErrorAssert assertErrorMessage(String expected) { Throwable cause = error; while (cause.getCause() != null) { cause = cause.getCause(); } + String actualMessage = String.valueOf(cause.getMessage()); assertTrue( - "Expected error to contain: " + expected + "\nActual: " + cause.getMessage(), - cause.getMessage().contains(expected)); + "Expected error to contain: " + expected + "\nActual: " + actualMessage, + actualMessage.contains(expected)); return this; } Suggestion importance[1-10]: 5 __ Why: `cause.getMessage()` can return `null`, causing a `NullPointerException` that would obscure the actual test failure. Using `String.valueOf()` is a simple and safe fix for test utility code.	Low
General	Clarify variadic operand count range intent `SqlOperandCountRanges.from(n)` sets the minimum operand count to `n` but allows any number of additional arguments (unbounded max). Since `paramNames` only lists the required positional parameters (e.g., `["field", "query"]`), this correctly enforces a minimum of 2, but it also means calls with fewer than 2 args pass the count check silently. The intent is to allow variadic optional args beyond the required ones, so the minimum should be `paramNames.size()` and max should be unbounded (`-1`), which `SqlOperandCountRanges.from` already does — however, verify this is the intended behavior and add a comment to clarify. api/src/main/java/org/opensearch/sql/api/spec/UnifiedFunctionSpec.java [162-164] @Override public SqlOperandCountRange getOperandCountRange() { + // Minimum: required positional params; maximum: unbounded (optional named args allowed) return SqlOperandCountRanges.from(paramNames.size()); } Suggestion importance[1-10]: 1 __ Why: The suggestion only asks to add a comment to clarify existing behavior without changing any logic. The `existing_code` and `improved_code` are functionally identical, making this a documentation-only suggestion with minimal impact.	Low

Suggestions up to commit acade43

Category	Suggestion	Impact
Possible issue	Guard against out-of-bounds positional argument access When a positional argument index `i` exceeds the size of `paramNames`, `paramNames.get(i)` will throw an `IndexOutOfBoundsException`. Since `paramNames` only contains the required parameter names (e.g., `["field", "query"]`), any extra positional arguments beyond those will cause a runtime crash. Add a bounds check or handle this case gracefully. api/src/main/java/org/opensearch/sql/api/parser/NamedArgRewriter.java [52-56] if (op instanceof SqlCall eq && op.getKind() == SqlKind.EQUALS) { maps[i] = toMap(eq.operand(0).toString(), eq.operand(1)); } else { + if (i >= paramNames.size()) { + throw new IllegalArgumentException( + "Too many positional arguments for function " + call.getOperator().getName() + + ": expected at most " + paramNames.size() + " positional args"); + } maps[i] = toMap(paramNames.get(i), op); } Suggestion importance[1-10]: 6 __ Why: The `paramNames.get(i)` call will throw an `IndexOutOfBoundsException` if more positional arguments are passed than there are entries in `paramNames`. This is a valid edge case that could cause a confusing runtime crash rather than a meaningful error message.	Low
Possible issue	Prevent NullPointerException on null error message If the root cause's `getMessage()` returns `null`, calling `cause.getMessage().contains(expected)` will throw a `NullPointerException`, masking the original error. Add a null check before calling `contains`. api/src/testFixtures/java/org/opensearch/sql/api/UnifiedQueryTestBase.java [172-179] Throwable cause = error; while (cause.getCause() != null) { cause = cause.getCause(); } +String actualMessage = cause.getMessage(); +assertNotNull("Expected error message but got null for cause: " + cause, actualMessage); assertTrue( - "Expected error to contain: " + expected + "\nActual: " + cause.getMessage(), - cause.getMessage().contains(expected)); + "Expected error to contain: " + expected + "\nActual: " + actualMessage, + actualMessage.contains(expected)); Suggestion importance[1-10]: 4 __ Why: A null `getMessage()` on the root cause would throw a `NullPointerException`, masking the original error. Adding a null check improves robustness of the test helper, though this is a test utility and the impact is limited to test diagnostics.	Low
General	Clarify minimum operand count enforcement for variadic functions `SqlOperandCountRanges.from(n)` sets the minimum operand count to `n` but allows any number of additional arguments. However, the `paramNames` list only contains required parameter names (e.g., `["field", "query"]`), so the minimum should be enforced correctly. The issue is that this allows zero arguments when `paramNames` is empty, and doesn't enforce the minimum required args. Consider using `SqlOperandCountRanges.from(paramNames.size())` only if that correctly represents the minimum — but verify that the intent is "at least N args" and not "exactly N args". api/src/main/java/org/opensearch/sql/api/spec/UnifiedFunctionSpec.java [162-164] @Override public SqlOperandCountRange getOperandCountRange() { + // Requires at least the mandatory positional params; additional named args are variadic return SqlOperandCountRanges.from(paramNames.size()); } Suggestion importance[1-10]: 1 __ Why: The `improved_code` is functionally identical to the `existing_code` — it only adds a comment. This suggestion doesn't actually change behavior and only asks the user to verify the intent, which warrants a very low score.	Low

Suggestions up to commit c578e47

Category	Suggestion	Impact
Possible issue	Guard against out-of-bounds positional argument index When `index` is greater than or equal to `paramNames.size()`, calling `paramNames.get(index)` will throw an `IndexOutOfBoundsException`. This can happen if a caller passes more positional arguments than the number of declared param names. Add a bounds check and throw a descriptive error or handle the extra positional arguments gracefully. api/src/main/java/org/opensearch/sql/api/parser/NamedArgRewriter.java [62-70] private static SqlNode rewriteOperand(SqlNode operand, int index, List<String> paramNames) { if (operand.getKind() == SqlKind.EQUALS) { // Optional: operator='AND' → MAP('operator', 'AND') SqlCall eq = (SqlCall) operand; return toMap(eq.operand(0).toString(), eq.operand(1)); } // Required positional: name → MAP('field', name) + if (index >= paramNames.size()) { + throw new IllegalArgumentException( + "Too many positional arguments: index " + index + " exceeds declared params " + paramNames); + } return toMap(paramNames.get(index), operand); } Suggestion importance[1-10]: 5 __ Why: The `paramNames.get(index)` call can throw `IndexOutOfBoundsException` if more positional arguments are passed than declared param names. However, in practice the variadic functions defined in `UnifiedFunctionSpec` use `SqlOperandCountRanges.from(paramNames.size())` which enforces a minimum, and extra args would be `EQUALS` kind (named args), so the risk is limited but still worth guarding.	Low
Possible issue	Prevent NullPointerException on null function name lookup If `name` is `null`, calling `name.toLowerCase()` will throw a `NullPointerException`. Since this method is called from `NamedArgRewriter.visit()` using `visited.getOperator().getName()`, which could potentially return null, a null check should be added. api/src/main/java/org/opensearch/sql/api/spec/UnifiedFunctionSpec.java [76-78] public static UnifiedFunctionSpec of(String name) { + if (name == null) return null; return ALL_SPECS.get(name.toLowerCase()); } Suggestion importance[1-10]: 3 __ Why: While `SqlOperator.getName()` is unlikely to return `null` in practice, adding a null guard is a minor defensive improvement. The impact is low since Calcite operators always have non-null names.	Low
General	Guard against null error message in assertion If the root cause's `getMessage()` returns `null`, calling `cause.getMessage().contains(expected)` will throw a `NullPointerException`, causing a confusing test failure. Add a null check on `cause.getMessage()` before calling `contains`. api/src/testFixtures/java/org/opensearch/sql/api/UnifiedQueryTestBase.java [171-180] public QueryErrorAssert assertErrorMessage(String expected) { Throwable cause = error; while (cause.getCause() != null) { cause = cause.getCause(); } + String message = cause.getMessage(); assertTrue( - "Expected error to contain: " + expected + "\nActual: " + cause.getMessage(), - cause.getMessage().contains(expected)); + "Expected error to contain: " + expected + "\nActual: " + message, + message != null && message.contains(expected)); return this; } Suggestion importance[1-10]: 4 __ Why: If `cause.getMessage()` returns `null`, the `contains` call would throw a `NullPointerException` instead of a clear assertion failure. The fix improves test robustness by handling null messages gracefully.	Low

Suggestions up to commit fe94ab1

Category	Suggestion	Impact
Possible issue	Guard against out-of-bounds positional argument index When `index` is greater than or equal to `paramNames.size()`, calling `paramNames.get(index)` will throw an `IndexOutOfBoundsException`. This can happen when a caller passes more positional arguments than there are declared param names. Add a bounds check and throw a descriptive error or handle the extra positional args gracefully. api/src/main/java/org/opensearch/sql/api/parser/NamedArgRewriter.java [61-69] private static SqlNode rewriteOperand(SqlNode operand, int index, List<String> paramNames) { if (operand.getKind() == SqlKind.EQUALS) { // Optional: operator='AND' → MAP('operator', 'AND') SqlCall eq = (SqlCall) operand; return toMap(eq.operand(0).toString(), eq.operand(1)); } // Required positional: name → MAP('field', name) + if (index >= paramNames.size()) { + throw new IllegalArgumentException( + "Too many positional arguments: index " + index + " exceeds declared params " + paramNames); + } return toMap(paramNames.get(index), operand); } Suggestion importance[1-10]: 6 __ Why: The `paramNames.get(index)` call will throw an `IndexOutOfBoundsException` if more positional arguments are passed than declared param names. However, in practice the variadic functions defined in `UnifiedFunctionSpec` only have 2 required params and extra args are expected to be `EQUALS` nodes, so this is a minor edge case worth guarding.	Low
Possible issue	Null-check function name before lookup If `name` is `null`, calling `name.toLowerCase()` will throw a `NullPointerException`. Since `of` is called from `NamedArgRewriter.visit` with `visited.getOperator().getName()`, which could potentially return null, a null check should be added. api/src/main/java/org/opensearch/sql/api/spec/UnifiedFunctionSpec.java [80-82] public static UnifiedFunctionSpec of(String name) { + if (name == null) return null; return ALL_SPECS.get(name.toLowerCase()); } Suggestion importance[1-10]: 3 __ Why: While theoretically `getName()` could return null, Calcite operators always have non-null names in practice, making this a very low-probability defensive check with minimal real-world impact.	Low
General	Guard against null error message in assertion If the root cause's `getMessage()` returns `null`, calling `cause.getMessage().contains(expected)` will throw a `NullPointerException`. Add a null check on `cause.getMessage()` before calling `contains`. api/src/testFixtures/java/org/opensearch/sql/api/UnifiedQueryTestBase.java [171-180] public QueryErrorAssert assertErrorMessage(String expected) { Throwable cause = error; while (cause.getCause() != null) { cause = cause.getCause(); } + String message = cause.getMessage(); assertTrue( - "Expected error to contain: " + expected + "\nActual: " + cause.getMessage(), - cause.getMessage().contains(expected)); + "Expected error to contain: " + expected + "\nActual: " + message, + message != null && message.contains(expected)); return this; } Suggestion importance[1-10]: 4 __ Why: The `cause.getMessage()` could return null for some exception types, causing a `NullPointerException` in the test assertion. The fix correctly handles this by storing the message in a variable and adding a null check before calling `contains`.	Low

Add UnifiedFunctionSpec with fluent builder to define relevance function signatures (match, match_phrase, multi_match, etc.) as a composable SqlOperatorTable chained into Calcite's FrameworkConfig. Functions are language-level primitives, always resolvable regardless of default schema. Add SQL and PPL test coverage for all 7 relevance functions. Signed-off-by: Chen Dai <daichen@amazon.com>

github-actions · 2026-04-03T19:36:27Z

Persistent review updated to latest commit a1f2d7f

github-actions · 2026-04-03T20:27:51Z

Persistent review updated to latest commit ae36dc1

github-actions · 2026-04-03T20:59:24Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 1908ebe.

Path	Line	Severity	Description
build.log	1	medium	A build.log file containing local developer machine details (username 'daichen', local file paths '/Users/daichen/IdeaProjects/my-sql-repo/', OS version, JDK paths) has been committed to the repository. This leaks developer environment information and is unrelated to the feature being implemented. Likely accidental inclusion rather than malicious, but warrants review.

The table above displays the top 10 most important findings.

Total: 1 | Critical: 0 | High: 0 | Medium: 1 | Low: 0

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-04-03T21:00:17Z

Persistent review updated to latest commit 1908ebe

github-actions · 2026-04-03T21:03:39Z

Persistent review updated to latest commit fe94ab1

github-actions · 2026-04-03T21:33:02Z

Persistent review updated to latest commit c578e47

Add NamedArgRewriter SqlShuttle that normalizes V2/PPL relevance syntax into MAP-based form before Calcite validation. Transforms positional and key=value arguments into MAP[paramName, value] pairs matching PPL's internal representation for uniform pushdown rules. Refactor UnifiedFunctionSpec to instance-based design with fluent builder and Category record for grouping. Use SqlUserDefinedFunction for consistency with PPL path. Add error tests and QueryErrorAssert to test base. Signed-off-by: Chen Dai <daichen@amazon.com>

github-actions · 2026-04-03T22:40:02Z

Persistent review updated to latest commit acade43

Add dedicated NamedArgRewriterTest covering: - Positional args rewritten to MAPs with correct param names - V2 equals syntax (key=value) flattened to MAP entries - Multi-field functions use 'fields' param name - Non-relevance functions pass through unchanged - Edge cases: all-equals args, mixed order, extra positional args Add high-level regression test in UnifiedRelevanceSearchSqlTest verifying non-relevance functions (upper) are unaffected by rewriter. Parser config matches production (Casing.UNCHANGED). Signed-off-by: Chen Dai <daichen@amazon.com>

github-actions · 2026-04-06T18:04:56Z

Persistent review updated to latest commit 938d603

github-actions · 2026-04-06T18:56:32Z

Persistent review updated to latest commit 6538c56

- Add bounds check for positional args with descriptive error message - Use SqlIdentifier.getSimple() to avoid backtick-decorated keys for reserved words like 'escape' in named arguments - Add null guard in QueryErrorAssert.assertErrorMessage to prevent NPE when root cause exception has null message - Update tests to assert on IllegalArgumentException with error message - Add test for reserved word as named argument key (query_string escape) Signed-off-by: Chen Dai <daichen@amazon.com>

github-actions · 2026-04-06T19:04:58Z

Persistent review updated to latest commit ac0fe64

dai-chen self-assigned this Mar 27, 2026

dai-chen added enhancement New feature or request SQL labels Mar 27, 2026

dai-chen force-pushed the feature/register-fulltext-search-calcite-sql branch from 8e54956 to 5f80dfc Compare March 28, 2026 01:27

dai-chen force-pushed the feature/register-fulltext-search-calcite-sql branch from 5f80dfc to 0da8394 Compare April 3, 2026 19:23

dai-chen force-pushed the feature/register-fulltext-search-calcite-sql branch from 0da8394 to a1f2d7f Compare April 3, 2026 19:34

dai-chen force-pushed the feature/register-fulltext-search-calcite-sql branch from a1f2d7f to ae36dc1 Compare April 3, 2026 20:26

dai-chen force-pushed the feature/register-fulltext-search-calcite-sql branch from ae36dc1 to 1908ebe Compare April 3, 2026 20:58

dai-chen force-pushed the feature/register-fulltext-search-calcite-sql branch from 1908ebe to fe94ab1 Compare April 3, 2026 21:02

dai-chen force-pushed the feature/register-fulltext-search-calcite-sql branch from fe94ab1 to c578e47 Compare April 3, 2026 21:31

dai-chen force-pushed the feature/register-fulltext-search-calcite-sql branch from c578e47 to acade43 Compare April 3, 2026 22:38

dai-chen marked this pull request as ready for review April 4, 2026 00:00

dai-chen requested review from Swiddis, joshuali925, mengweieric, ps48 and vamsimanohar as code owners April 4, 2026 00:00

dai-chen requested review from LantaoJin, RyanL1997, acarbonetto, ahkcs, anirudha, noCharger, penghuo, qianheng-aws, songkant-aws, ykmr1224 and yuancu as code owners April 4, 2026 00:00

dai-chen force-pushed the feature/register-fulltext-search-calcite-sql branch from 6538c56 to ac0fe64 Compare April 6, 2026 19:03

mengweieric approved these changes Apr 6, 2026

View reviewed changes

ahkcs approved these changes Apr 6, 2026

View reviewed changes

dai-chen merged commit 5b792da into opensearch-project:main Apr 7, 2026
38 checks passed

dai-chen mentioned this pull request Apr 8, 2026

[FEATURE] SQL query support for Analytics engine integration #5248

Open

mengweieric mentioned this pull request Apr 8, 2026

[Feature] Add vector search execution pipeline for vectorSearch() table function #5320

Merged

dai-chen mentioned this pull request Apr 14, 2026

[FEATURE] Unified SQL language across OpenSearch, Flint and Spark SQL #5346

Open

Conversation

dai-chen commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Function Support Matrix

Design Choices

Next Step

Related Issues

Check List

Uh oh!

github-actions Bot commented Mar 27, 2026

Uh oh!

github-actions Bot commented Mar 28, 2026

Uh oh!

github-actions Bot commented Apr 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit ac0fe64)

Uh oh!

github-actions Bot commented Apr 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

github-actions Bot commented Apr 3, 2026

Uh oh!

github-actions Bot commented Apr 3, 2026

Uh oh!

github-actions Bot commented Apr 3, 2026

PR Code Analyzer ❗

Uh oh!

github-actions Bot commented Apr 3, 2026

Uh oh!

github-actions Bot commented Apr 3, 2026

Uh oh!

github-actions Bot commented Apr 3, 2026

Uh oh!

github-actions Bot commented Apr 3, 2026

Uh oh!

github-actions Bot commented Apr 6, 2026

Uh oh!

github-actions Bot commented Apr 6, 2026

Uh oh!

github-actions Bot commented Apr 6, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

dai-chen commented Mar 27, 2026 •

edited

Loading

github-actions Bot commented Apr 3, 2026 •

edited

Loading

(Review updated until commit `ac0fe64`)

github-actions Bot commented Apr 3, 2026 •

edited

Loading