Why it exists
Query engines become more reusable when storage formats can be inspected directly instead of only through the original serving system. Druid segments contain years of useful analytical engineering, but they are usually approached through a running Druid cluster; this project asks what becomes possible when those segment files can be read directly by a Rust query stack.
Technical center
The library reads Druid segment structures such as `index.drd` and maps them into DataFusion table and execution abstractions, aiming for vectorized, Arrow-friendly execution where possible. The hard part is translating storage internals (dictionary-encoded dimensions, metric columns, and bitmap-oriented access patterns) into those abstractions without pretending the original format was designed as a simple Arrow file.
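To make the translation problem concrete, here is a minimal standard-library sketch of a dictionary-encoded dimension with a per-value row index. It is not the project's code: the `DictColumn` type, its fields, and its methods are hypothetical, and a `BTreeSet` stands in for the compressed bitmaps a real segment would carry. It shows the two access paths an adapter has to reconcile: decoding ids into flat values for a columnar batch, and answering an equality filter from the index without scanning rows.

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory view of one dictionary-encoded string dimension.
/// Names are illustrative and are not taken from the project.
pub struct DictColumn {
    /// Sorted distinct values; a value's position is its dictionary id.
    pub dictionary: Vec<String>,
    /// One dictionary id per row.
    pub row_ids: Vec<u32>,
    /// For each dictionary id, the row numbers holding that value.
    /// Assumption: a real segment stores a compressed bitmap here.
    pub bitmaps: Vec<BTreeSet<usize>>,
}

impl DictColumn {
    /// Build the dictionary, the id column, and the per-value index from raw rows.
    pub fn from_rows(rows: &[&str]) -> Self {
        let mut dictionary: Vec<String> = rows.iter().map(|s| s.to_string()).collect();
        dictionary.sort();
        dictionary.dedup();

        let mut bitmaps = vec![BTreeSet::new(); dictionary.len()];
        let mut row_ids = Vec::with_capacity(rows.len());
        for (row, &value) in rows.iter().enumerate() {
            let id = dictionary
                .binary_search_by(|d| d.as_str().cmp(value))
                .expect("every row value was added to the dictionary");
            bitmaps[id].insert(row);
            row_ids.push(id as u32);
        }
        Self { dictionary, row_ids, bitmaps }
    }

    /// Materialize plain values, as a flat columnar batch would need.
    pub fn decode(&self) -> Vec<&str> {
        self.row_ids
            .iter()
            .map(|&id| self.dictionary[id as usize].as_str())
            .collect()
    }

    /// Rows equal to `value`, answered from the index instead of a row scan.
    pub fn rows_equal_to(&self, value: &str) -> Vec<usize> {
        match self.dictionary.binary_search_by(|d| d.as_str().cmp(value)) {
            Ok(id) => self.bitmaps[id].iter().copied().collect(),
            Err(_) => Vec::new(),
        }
    }
}
```

An adapter that only calls `decode` behaves like a plain file reader. One that pushes a filter down to `rows_equal_to` keeps the index-driven behavior the segment format was built around, which is why the mapping is more than a format conversion.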
Current proof points
Even as a work in progress, the evaluation surface is already concrete: a dedicated segment parser layer, DataFusion `TableProvider` and `ExecutionPlan` adapters, Wikipedia segment fixtures, and integration tests that exercise offline segment access, so the claims rest on running code rather than on architecture alone. The current value is exploratory but specific: each parser and adapter clarifies how much of Druid's segment model can be exposed through modern Arrow and DataFusion interfaces.
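As a rough illustration of the kind of work a segment parser layer does, the sketch below parses a `meta.smoosh`-style index. It assumes the text layout commonly described for Druid's smoosh container: a `v1,<max chunk size>,<chunk count>` header followed by `name,chunk,start,end` lines. The `SmooshEntry` type and the `parse_meta_smoosh` function are hypothetical names, not the project's API, and the sample offsets in any usage are made up.

```rust
/// One logical file stored inside a numbered smoosh chunk (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct SmooshEntry {
    pub name: String,
    pub chunk: u32,
    pub start: u64,
    pub end: u64,
}

/// Parse a meta.smoosh-style index.
/// Assumed layout: header `v1,<max chunk size>,<chunk count>`,
/// then one `name,chunk,start,end` line per logical file.
pub fn parse_meta_smoosh(text: &str) -> Result<Vec<SmooshEntry>, String> {
    let mut lines = text.lines().filter(|l| !l.trim().is_empty());

    let header = lines.next().ok_or("empty meta.smoosh")?;
    let fields: Vec<&str> = header.split(',').collect();
    if fields.len() != 3 || fields[0].trim() != "v1" {
        return Err(format!("unsupported header: {}", header));
    }

    let mut entries = Vec::new();
    for line in lines {
        let parts: Vec<&str> = line.split(',').collect();
        if parts.len() != 4 {
            return Err(format!("malformed entry: {}", line));
        }
        let chunk = parts[1]
            .trim()
            .parse::<u32>()
            .map_err(|e| format!("bad chunk number in {:?}: {}", line, e))?;
        let start = parts[2]
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("bad start offset in {:?}: {}", line, e))?;
        let end = parts[3]
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("bad end offset in {:?}: {}", line, e))?;
        if end < start {
            return Err(format!("end before start in {:?}", line));
        }
        entries.push(SmooshEntry {
            name: parts[0].to_string(),
            chunk,
            start,
            end,
        });
    }
    Ok(entries)
}
```

Given entries like these, a reader can slice the byte range `start..end` out of the matching chunk file and hand it to a per-column decoder, which is the point where fixtures and integration tests become useful for checking offsets against real segments.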