Andrea Bozzo | Blog

👋 Welcome to my technical blog!

I write about Data Engineering, Rust, Go, Python and open source technologies.

Exploring lakehouse architectures, real-time streaming and the modern data world.


Designing a Data Profiler Around Apache Arrow: Lessons from dataprof

A design story of dataprof: why I built a profiler around Apache Arrow, how it changed the architecture, and how this journey led to contributions to arrow-rs’ Parquet reader.

February 5, 2026 · 11 min · 2316 words · Andrea Bozzo

Async in Python and Rust: Two Worlds, One Keyword

A technical exploration of async/await in Python and Rust: how the same syntax hides completely different execution models, with practical examples from contributions to Tokio and Python projects.

January 22, 2026 · 13 min · 2661 words · Andrea Bozzo

Mosaico: The Data Platform for Robotics and Physical AI Written in Rust

A deep dive into Mosaico, the robotics data platform written in Rust: client-server architecture, semantic ontologies, data-oriented debugging, my journey within it, and the integration with Data Contract Engine.

January 6, 2026 · 18 min · 3726 words · Andrea Bozzo

Ceres: Semantic Search for Open Data

Ceres is a semantic search engine for CKAN portals. Built in Rust with Tokio and PostgreSQL+pgvector, it bridges the gap between how people search and how public administrations name their datasets.

December 20, 2025 · 7 min · 1454 words · Andrea Bozzo

Closing the Rust Circle: High-Performance Data Analysis with Polars

Polars completes the Rust data engineering ecosystem: lazy evaluation, Apache Arrow, and native Iceberg V3 integration for performant analytics that compete with distributed clusters. The third pillar of the RisingWave + Lakekeeper + Polars stack.

December 3, 2025 · 25 min · 5177 words · Andrea Bozzo

Lakekeeper: The Apache Iceberg REST Catalog Written in Rust

An exploration of Lakekeeper, the Iceberg REST catalog in Rust that completes the data engineering ecosystem: enterprise security with vended credentials, multi-tenancy, and RisingWave integration to build streaming lakehouses without the JVM.

November 22, 2025 · 10 min · 2019 words · Andrea Bozzo

RisingWave and Iceberg-Rust: When Real-Time Streaming Meets Modern Data Lake

The partnership between RisingWave and Iceberg-Rust offers a window into where modern data engineering is heading: real-time CDC streaming, an intelligent hybrid delete strategy, and a performant Rust ecosystem challenging JVM dominance.

November 10, 2025 · 8 min · 1656 words · Andrea Bozzo