Designing a Data Profiler Around Apache Arrow: Lessons from dataprof

The design story of dataprof: why I built a profiler around Apache Arrow, how that choice shaped the architecture, and how the work led to contributions to the arrow-rs Parquet reader.

February 5, 2026 · 11 min · 2316 words · Andrea Bozzo