Overview
This is the parent tracking issue for the Google Summer of Code (GSoC) 2026 project to design and implement an Apache Arrow backend for PODIO.
The goal is to translate YAML-defined Event Data Models (EDM) into Arrow's columnar in-memory format, enabling language-independent, zero-copy data access, and seamless serialization to industry-standard formats like Parquet. This will facilitate high-speed streaming readout and reconstruction frameworks like EICrecon for future experiments.
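To make "columnar in-memory format" concrete, here is a minimal sketch of the row-wise to column-wise reshaping involved. It deliberately uses only the C++ standard library: `CaloHit`, `CaloHitColumns`, and `toColumns` are hypothetical names invented for illustration, not PODIO or Arrow APIs, and plain `std::vector`s stand in for Arrow buffers. The sketch copies data, so it shows only the target layout, not the zero-copy path the project aims for.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical row-wise datatype, standing in for a YAML-defined EDM type.
struct CaloHit {
  std::int32_t cellID;
  float energy;
};

// Column-wise form of the same data: one contiguous array per member.
// In a real backend each vector would correspond to an Arrow array.
struct CaloHitColumns {
  std::vector<std::int32_t> cellID;
  std::vector<float> energy;
};

// Reshape an array of structs into a struct of arrays (copying).
CaloHitColumns toColumns(const std::vector<CaloHit>& hits) {
  CaloHitColumns cols;
  cols.cellID.reserve(hits.size());
  cols.energy.reserve(hits.size());
  for (const CaloHit& h : hits) {
    cols.cellID.push_back(h.cellID);
    cols.energy.push_back(h.energy);
  }
  return cols;
}
```

Each column is contiguous and homogeneous, which is what lets a columnar consumer in another language read a single member across all entries without knowing the C++ struct layout.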
Roadmap & Deliverables
- [...] ArrowTypeRegistry).
- ArrowWriter in-memory conversion loop using arrow::Buffer::Wrap for flat wrapping.
- arrow::ListView (zero-copy) and arrow::ListArray (normalization).
- ArrowReader to deserialize Arrow Tables back to PODIO Frames.
- arrow::dataset and the Parquet C++ writer to persist tables as self-describing Parquet files.
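The roadmap names two Arrow list layouts. As a rough illustration of the offsets-based layout used by list arrays (an offsets buffer with one more entry than there are lists, plus a single flattened values array), the sketch below encodes nested vectors by hand. It does not call Arrow; `ListColumn`, `encodeLists`, and `listAt` are hypothetical helpers, and treating variable-length EDM members this way is an assumption about how the backend might use the layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Offsets-based list encoding: list i occupies
// values[offsets[i]] .. values[offsets[i + 1]) in the flattened array.
struct ListColumn {
  std::vector<std::int32_t> offsets;  // size = number of lists + 1
  std::vector<std::int32_t> values;   // all elements, concatenated in order
};

// Flatten nested lists into one values array plus monotonically
// non-decreasing offsets (this is the "normalized" form; it copies).
ListColumn encodeLists(const std::vector<std::vector<std::int32_t>>& lists) {
  ListColumn col;
  col.offsets.reserve(lists.size() + 1);
  col.offsets.push_back(0);
  for (const auto& list : lists) {
    col.values.insert(col.values.end(), list.begin(), list.end());
    col.offsets.push_back(static_cast<std::int32_t>(col.values.size()));
  }
  return col;
}

// Recover list i from the encoded column.
std::vector<std::int32_t> listAt(const ListColumn& col, std::size_t i) {
  auto first = col.values.begin() + col.offsets[i];
  auto last = col.values.begin() + col.offsets[i + 1];
  return std::vector<std::int32_t>(first, last);
}
```

A list-view layout differs in that it stores a separate size per list next to each offset, so entries may point into the values array in any order. That is why it can describe existing memory without rearranging it, whereas the offsets-only form above requires the elements to be laid out consecutively.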