SQL-to-SQL translation falls back to post-processing for valid CubeSQL patterns (chained CTEs, multi-cube joins) #10591

@andrew-oh-spare

Description

When using the /v1/sql translation endpoint (with format=sql) to convert CubeSQL into native data-source SQL, certain query patterns that are valid CubeSQL fall back to post-processing instead of being pushed down.

The /v1/sql endpoint (sql4sql() in gateway.ts) returns the native result wrapped as { sql: result }. The native Rust layer (Sql4SqlResponse::to_js in sql4sql.rs) serializes successful pushdown/regular results with result.sql as a [sqlString, values] tuple and result.status = "ok". For post-processing fallbacks, result.sql is absent — only result.status = "error" and result.error are set.

This means consumers must check Array.isArray(response.sql.sql) to determine whether usable SQL was returned. When this is false, the query cannot be executed against the underlying data source — it would require Cube's in-memory engine. For use cases where the translated SQL is sent directly to the data source (e.g., ClickHouse), post-processing results are unusable.
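For reference, a minimal consumer-side guard based on the response shape described above (the type and helper names here are my own, not Cube's published API):

```typescript
// Shape of the /v1/sql (format=sql) response body as described above.
// These type/field definitions are illustrative, not Cube's published types.
type Sql4SqlResponse = {
  sql: {
    status: "ok" | "error";
    sql?: [string, unknown[]]; // [sqlString, values]; present only on successful pushdown
    query_type?: string;       // e.g. "post_processing" on fallback
    error?: string;
  };
};

// Returns the [sqlString, values] tuple when pushdown succeeded, or null when
// the query fell back to post-processing and no native SQL exists.
function extractPushdownSql(response: Sql4SqlResponse): [string, unknown[]] | null {
  const inner = response.sql;
  return inner.status === "ok" && Array.isArray(inner.sql) ? inner.sql : null;
}
```

A null return is the case this issue is about: there is nothing to send to ClickHouse.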

Issue 1: CTE referencing another CTE triggers post-processing

A WITH clause where one CTE references a sibling CTE cannot be pushed down.

Minimal reproduction

WITH base_counts AS (
  SELECT
    some_dimension AS dim,
    some_category AS category,
    COUNT(*) AS cnt
  FROM my_cube
  WHERE status = 'completed'
  GROUP BY 1, 2
),
dim_totals AS (
  SELECT
    dim,
    SUM(cnt) AS total
  FROM base_counts        -- ← references sibling CTE
  GROUP BY dim
)
SELECT
  bc.dim,
  bc.category,
  ROUND(100.0 * bc.cnt / dt.total, 2) AS percentage
FROM base_counts bc
JOIN dim_totals dt ON bc.dim = dt.dim
ORDER BY 1, 2

Expected: response.sql.status === "ok" with response.sql.sql containing a [sqlString, values] tuple that can be sent to ClickHouse.
Actual: response.sql.status === "error", response.sql.sql is absent, response.sql.query_type === "post_processing", response.sql.error === "Provided query can not be executed without post-processing."

Issue 2: Multi-cube __cubeJoinField joins + EXTRACT(EPOCH FROM ...) fail pushdown

A query joining multiple cubes via Cube's own __cubeJoinField syntax combined with EXTRACT(EPOCH FROM ...) cannot be pushed down to ClickHouse.

Minimal reproduction

SELECT
  cube_a.user_id,
  cube_b.first_name,
  cube_c.start_time,
  cube_c.end_time,
  (EXTRACT(EPOCH FROM cube_c.end_time) - EXTRACT(EPOCH FROM cube_c.start_time)) / 60 AS duration_minutes
FROM cube_a
LEFT JOIN cube_b ON cube_b.__cubeJoinField = cube_a.__cubeJoinField
LEFT JOIN cube_c ON cube_c.__cubeJoinField = cube_a.__cubeJoinField
WHERE cube_c.status = 'active'
ORDER BY 1

Expected: response.sql.status === "ok" — Cube resolves the __cubeJoinField joins and translates EXTRACT(EPOCH FROM ...) to ClickHouse-compatible SQL (e.g., toUnixTimestamp()).
Actual: response.sql.status === "error", response.sql.sql is absent, response.sql.query_type === "post_processing".
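For illustration, the EXTRACT translation expected from pushdown can be sketched as a naive text rewrite (hypothetical helper; the regex assumes the extracted expression contains no parentheses, so nested function calls would break it):

```typescript
// Naive rewrite of EXTRACT(EPOCH FROM <expr>) to ClickHouse's
// toUnixTimestamp(<expr>) — the translation this issue expects the
// pushdown layer to emit. Illustrative only; not a general SQL rewriter.
function rewriteEpochExtract(sql: string): string {
  return sql.replace(/EXTRACT\s*\(\s*EPOCH\s+FROM\s+([^()]+?)\s*\)/gi, "toUnixTimestamp($1)");
}

// e.g. rewriteEpochExtract("EXTRACT(EPOCH FROM ts)") → "toUnixTimestamp(ts)"
```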

Summary

| Pattern | CubeSQL feature? | Pushes down? | Blocker |
| --- | --- | --- | --- |
| CTE referencing sibling CTE | Standard SQL | No | Subquery node not handled in generate_sql_for_node |
| __cubeJoinField with 3+ cubes | Cube-specific syntax | No | Multi-way join rewrite doesn't collapse |
| EXTRACT(EPOCH FROM ...) on ClickHouse | Standard SQL | No | Missing expressions/extract template in ClickHouseQuery |

The __cubeJoinField pattern is especially notable — it is Cube's own virtual join syntax, so users would reasonably expect the /v1/sql translator to handle it. Similarly, CTEs referencing sibling CTEs are standard SQL that most databases support natively.

Environment

  • Cube version: 1.6.6
  • Data source: ClickHouse
  • Endpoint: /v1/sql with format=sql
