[VARIANT] Fix "variant is null" data skipping when pushVariantIntoScan = true #30

Open
harshmotw-db wants to merge 7 commits into allisonport-db:support-spark-4-1 from harshmotw-db:harshmotw-db/variant_test_fixes

harshmotw-db commented Dec 12, 2025

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

The PR that compiles Delta-Spark against Spark 4.1 was failing some variant data skipping tests. Spark 4.1 enables shredding-related configs by default, and as a result:

  1. Delta data skipping on variant columns no longer worked properly, because the read schema of a variant column was transformed to a struct.
  2. The test infrastructure assumed that the Spark plan for these data skipping test cases would have a filter node at the top. This does not hold when pushVariantIntoScan = true, which inserts a project node at the top.

This PR fixes these issues by:

  1. Treating variant structs like any other atomic type for the purpose of SkippingEligibleColumn in DataSkippingReader.scala.
  2. Modifying the test infrastructure so that it looks for the first filter node in the plan rather than blindly assuming that the node at the top of the plan is a filter node.
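
The two fixes above can be sketched with toy types. This is only an illustration under stated assumptions, not the code in this PR: `ColType`, `StructCol`, `isVariantStruct`, `PlanNode`, `FilterNode`, `ProjectNode`, `ScanNode`, and `FixSketch` are all hypothetical stand-ins and are not Spark or Delta classes. The first function shows an eligibility check that accepts a struct only when it is marked as the struct form of a variant; the second shows a pre-order search for the first filter node instead of inspecting the root of the plan.

```scala
// Toy column types. Hypothetical stand-ins, not Spark's DataType hierarchy.
sealed trait ColType
case object IntCol extends ColType
case object VariantCol extends ColType
// isVariantStruct is an assumed marker for "this struct is a rewritten variant".
final case class StructCol(fields: Seq[String], isVariantStruct: Boolean = false) extends ColType

// Toy plan nodes. Hypothetical stand-ins, not Spark plan classes.
sealed trait PlanNode { def children: Seq[PlanNode] }
final case class ScanNode(table: String) extends PlanNode {
  val children: Seq[PlanNode] = Seq.empty
}
final case class FilterNode(condition: String, child: PlanNode) extends PlanNode {
  val children: Seq[PlanNode] = Seq(child)
}
final case class ProjectNode(columns: Seq[String], child: PlanNode) extends PlanNode {
  val children: Seq[PlanNode] = Seq(child)
}

object FixSketch {
  // Fix 1 (sketch): a struct that stands for a variant is treated like an
  // atomic column; ordinary structs stay ineligible in this toy model.
  def skippingEligible(t: ColType): Boolean = t match {
    case IntCol | VariantCol        => true
    case StructCol(_, isVariantStr) => isVariantStr
  }

  // Fix 2 (sketch): depth-first, pre-order search for the first filter node,
  // rather than assuming the root of the plan is a filter.
  def firstFilter(plan: PlanNode): Option[FilterNode] = plan match {
    case f: FilterNode => Some(f)
    case other =>
      other.children.iterator.map(firstFilter).collectFirst { case Some(f) => f }
  }
}
```

For example, `FixSketch.firstFilter(ProjectNode(Seq("v"), FilterNode("v IS NULL", ScanNode("t"))))` finds the filter even though a project node sits at the root. In Spark itself, `collectFirst` on a plan tree plays a similar role; whether this PR uses it is not stated here.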

How was this patch tested?

Modified the failing test to run with both pushVariantIntoScan = true and pushVariantIntoScan = false.
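
Running one test body under both settings can be sketched as follows. This is a minimal, self-contained illustration: `ConfSketch`, `withConf`, and `runForBoth` are hypothetical helpers backed by an in-memory map rather than a Spark session, and the config key string is an assumption used only for illustration. In a real Spark test suite, the `withSQLConf` test helper serves the same purpose.

```scala
import scala.collection.mutable

object ConfSketch {
  // Hypothetical in-memory store standing in for a session's SQL configuration.
  private val conf = mutable.Map.empty[String, String]

  // Assumed key name, for illustration only.
  val PushVariantIntoScan = "spark.sql.variant.pushVariantIntoScan"

  def get(key: String): Option[String] = conf.get(key)

  // Sets `key` for the duration of `body`, then restores the previous state,
  // even if `body` throws.
  def withConf[T](key: String, value: String)(body: => T): T = {
    val previous = conf.get(key)
    conf(key) = value
    try body
    finally {
      previous match {
        case Some(v) => conf(key) = v
        case None    => conf.remove(key)
      }
    }
  }

  // Runs the same check once with the flag set to true and once with false,
  // returning the results in that order.
  def runForBoth[T](check: Boolean => T): Seq[T] =
    Seq(true, false).map { enabled =>
      withConf(PushVariantIntoScan, enabled.toString)(check(enabled))
    }
}
```

A test would pass its assertions as the `check` function, so the same expectations are verified for both values of the flag.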

Does this PR introduce any user-facing changes?

No. This PR only makes existing behavior compatible with Spark 4.1, which has shredded reads enabled by default.

harshmotw-db changed the title from "Harshmotw db/variant test fixes" to "[VARIANT] Fix "variant is null" data skipping when pushVariantIntoScan = true" on Dec 12, 2025