
Andrea Bozzo

Data Engineer | Open Data Advocate | Analytics Pipeline Architect

"In Data We Trust, In Backups We Believe"

🎯 Professional Summary

Passionate Data Engineer with expertise in building scalable ETL pipelines and democratizing access to public data. Creator of the Osservatorio platform, which transforms Italian statistical data into accessible insights. Strong advocate for open-source solutions with a focus on performance optimization and data quality.

🛠️ Technical Skills

Data Processing

  • Python (Advanced)
  • pandas, numpy
  • dbt-core
  • Data Modeling

Storage & Databases

  • DuckDB
  • PostgreSQL
  • Parquet
  • SQLite

Analytics & BI

  • Streamlit
  • Power BI
  • Plotly
  • Excel (Advanced)

DevOps & Tools

  • Git/GitHub
  • GitHub Actions
  • Poetry
  • Docker

🚀 Featured Projects

Osservatorio - Open Data Analytics Platform
Python • Streamlit • DuckDB • GitHub Actions

Production-ready platform democratizing access to Italian statistical data through automated ETL pipelines and interactive dashboards. The project has a growing community (4+ active contributors) and 65% test coverage.

  • Built robust ETL pipelines for ISTAT data with automatic retries and circuit breakers
  • Implemented multi-format export (CSV, Excel, Parquet) for maximum interoperability (see the pipeline sketch after this list)
  • Achieved sub-100ms analytics queries with a hybrid persistence architecture
  • Established contributor-friendly architecture with comprehensive documentation
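
The retry, circuit-breaker, and multi-format-export bullets above can be condensed into a short Python sketch. Everything below (class and function names, thresholds, the generic URL argument) is hypothetical and simplified; it is not code from the Osservatorio repository.

```python
# Hypothetical sketch: an HTTP fetch wrapper with retries and a simple
# circuit breaker, plus a multi-format export helper. Names are illustrative.
import time

import pandas as pd
import requests


class CircuitOpenError(RuntimeError):
    """Raised when repeated failures have tripped the breaker."""


class ResilientFetcher:
    def __init__(self, max_retries=3, failure_threshold=5, cooldown=60.0):
        self.max_retries = max_retries
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self._failures = 0
        self._opened_at = None

    def fetch(self, url: str) -> bytes:
        # Circuit breaker: refuse calls while the breaker is open and cooling down.
        if self._opened_at is not None:
            if time.monotonic() - self._opened_at < self.cooldown:
                raise CircuitOpenError(f"circuit open, retry in <= {self.cooldown}s")
            self._opened_at = None  # half-open: let one attempt through

        last_error = None
        for attempt in range(self.max_retries):
            try:
                response = requests.get(url, timeout=30)
                response.raise_for_status()
                self._failures = 0  # success resets the failure counter
                return response.content
            except requests.RequestException as exc:
                last_error = exc
                time.sleep(2 ** attempt)  # exponential backoff between retries

        # All retries failed: count the failure and possibly open the circuit.
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = time.monotonic()
        raise last_error


def export_all_formats(df: pd.DataFrame, stem: str) -> None:
    """Write the same dataframe as CSV, Excel, and Parquet."""
    df.to_csv(f"{stem}.csv", index=False)
    df.to_excel(f"{stem}.xlsx", index=False)       # requires openpyxl
    df.to_parquet(f"{stem}.parquet", index=False)  # requires pyarrow or fastparquet
```

In a real pipeline the breaker state would typically be shared across calls to the upstream endpoints; here it lives on a single object for brevity.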

Mini-Lakehouse-Didattico
dbt • DuckDB • Testing Framework

Educational modern data stack implementation demonstrating dbt + DuckDB for ultra-fast analytics with automated testing.

  • Template ready for real-world projects with automated testing via dbt-expectations
  • Multi-layer architecture: staging → core → marts (illustrated in the sketch below)
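
The staging → core → marts layering can be illustrated without dbt at all. The standalone DuckDB sketch below mimics the three layers with plain views; in the actual project the layers are dbt models, and the table and column names here are invented.

```python
# Standalone illustration of a staging -> core -> marts layering on DuckDB.
# The real project expresses these layers as dbt models; plain views stand in
# for them here, and every table/column name is made up.
import duckdb

con = duckdb.connect()  # in-memory database

# "Raw" source data, as it might arrive from an extraction step.
con.execute("""
    CREATE TABLE raw_orders AS
    SELECT * FROM (VALUES
        (1, '2024-01-10', ' 120.50', 'it'),
        (2, '2024-01-11', ' 80.00',  'it'),
        (3, '2024-02-02', ' 45.25',  'fr')
    ) AS t(order_id, order_date, amount, country)
""")

# Staging layer: type casting and light cleaning only.
con.execute("""
    CREATE VIEW stg_orders AS
    SELECT order_id,
           CAST(order_date AS DATE)            AS order_date,
           CAST(TRIM(amount) AS DECIMAL(10,2)) AS amount,
           country
    FROM raw_orders
""")

# Core layer: conformed business entities shared by downstream models.
con.execute("""
    CREATE VIEW core_orders AS
    SELECT *, date_trunc('month', order_date) AS order_month
    FROM stg_orders
""")

# Marts layer: aggregates shaped for reporting and dashboards.
con.execute("""
    CREATE VIEW mart_monthly_revenue AS
    SELECT order_month, country, SUM(amount) AS revenue
    FROM core_orders
    GROUP BY order_month, country
""")

print(con.execute("SELECT * FROM mart_monthly_revenue ORDER BY order_month").df())
```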

CruscottoPMI - Business Intelligence for SMEs
Streamlit • XBRL • Financial Analytics

Financial dashboard suite for small-medium enterprises with XBRL integration for automated balance sheet analysis.

  • Automated KPI calculation and what-if analysis capabilities (see the sketch after this list)
  • XBRL integration for standardized financial data processing
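
As a rough sketch of what "automated KPI calculation and what-if analysis" means in practice, the snippet below derives a few standard ratios from a balance sheet and re-evaluates them under a scenario. All figures and the KPI selection are invented for illustration; the real dashboards take their inputs from XBRL filings.

```python
# Hypothetical sketch: KPI calculation with a simple what-if scenario.
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BalanceSheet:
    revenue: float
    ebitda: float
    net_income: float
    equity: float
    total_debt: float
    current_assets: float
    current_liabilities: float


def kpis(bs: BalanceSheet) -> dict:
    """A minimal KPI set; a production tool would compute many more."""
    return {
        "ebitda_margin": bs.ebitda / bs.revenue,
        "roe": bs.net_income / bs.equity,
        "debt_to_equity": bs.total_debt / bs.equity,
        "current_ratio": bs.current_assets / bs.current_liabilities,
    }


base = BalanceSheet(
    revenue=1_200_000, ebitda=180_000, net_income=95_000,
    equity=400_000, total_debt=250_000,
    current_assets=520_000, current_liabilities=310_000,
)

# What-if: revenue grows 10% while the EBITDA margin is held constant.
scenario = replace(base, revenue=base.revenue * 1.10, ebitda=base.ebitda * 1.10)

before, after = kpis(base), kpis(scenario)
for name in before:
    print(f"{name:>15}: {before[name]:.2f} -> {after[name]:.2f}")
```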

💼 Professional Experience

Independent Data Engineer & Consultant
Freelance
2020 - Present

Specialized in open data analytics, ETL pipeline design, and business intelligence solutions for public sector and SMEs.

  • Designed and implemented scalable data pipelines processing millions of records daily
  • Built interactive dashboards and analytics tools for data democratization
  • Contributed to open-source ecosystem with production-ready tools and libraries
  • Mentored junior data professionals in modern data stack technologies
  • Established performance benchmarks achieving <100ms query response times

🎯 Core Competencies

Data Architecture

  • Multi-layer data modeling
  • ETL/ELT pipeline design
  • Performance optimization
  • Data quality frameworks

API Integration

  • SDMX parsing
  • JSON/XML processing
  • RESTful API design
  • Government data sources

Development Practices

  • Test-driven development
  • CI/CD automation
  • Code review & mentoring
  • Documentation-first approach

Business Impact

  • Data democratization
  • Open-source advocacy
  • Community building
  • Knowledge transfer

💡 Professional Philosophy

"Public data belongs to everyone; it should be accessible to everyone"


Guiding Principles:

  • Transparency: Every transformation traceable and documented
  • Performance: If it's not fast, it's not finished
  • Quality: Test first, debug later
  • Openness: No vendor lock-in, maximum portability