Case Study

DataPizza AI

A Python GenAI framework for production-ready agents and RAG pipelines with minimal abstraction.

Active OSS project · Python · GenAI · RAG · Agents · OpenTelemetry

System anatomy

  1. Inputs

    • LLM provider keys
    • PDF / DOCX / image docs
    • Natural-language prompts
    • Tool / function definitions
  2. Core

    • Multi-provider LLM client
    • Tool-calling agent loop
    • Document ingestion + chunking
    • Qdrant vector search
  3. Outputs

    • Agent responses + tool results
    • RAG-grounded answers
    • Persistent conversation state
    • OpenTelemetry traces
Constraints
  • Vendor-agnostic by design
  • No hidden abstraction
  • Provider swap without logic rewrite
  • Observable throughout

Why it exists

Most GenAI frameworks overcorrect toward magic: large abstractions that get in the way the moment you need to understand what actually happens between the model call and the result. DataPizza AI starts from the other direction. It keeps the surface small, keeps the providers swappable, and makes the tracing explicit enough that a team can debug a failing RAG pipeline without reading the framework internals.

Technical center

The framework places a multi-provider LLM client (OpenAI, Gemini, Anthropic, Mistral, Azure) at the base and builds tool-calling agents on top of it. Above those sits a document ingestion and chunking pipeline (Docling, Azure Document Intelligence) that feeds Qdrant-backed vector search, with Cohere or FastEmbed embedders and optional reranking. Redis caching and OpenTelemetry tracing run across all layers. The architectural discipline is that provider switching, agent composition, and pipeline wiring are all done declaratively, without rewriting business logic.
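To make "provider switching without rewriting business logic" concrete, here is a minimal Python sketch of the general pattern. It is not the framework's actual API: `LLMClient`, `EchoClient`, `UpperClient`, `build_client`, and the `provider_a` / `provider_b` keys are hypothetical names invented for illustration, and the stand-in clients make no network calls.

```python
from dataclasses import dataclass
from typing import Protocol


class LLMClient(Protocol):
    """Hypothetical minimal interface; not the framework's real client API."""

    def invoke(self, prompt: str) -> str: ...


@dataclass
class EchoClient:
    """Stand-in provider that returns the prompt unchanged (no network)."""

    name: str

    def invoke(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


@dataclass
class UpperClient:
    """Second stand-in provider with different behaviour (no network)."""

    name: str

    def invoke(self, prompt: str) -> str:
        return f"[{self.name}] {prompt.upper()}"


def answer_question(client: LLMClient, question: str) -> str:
    """Business logic depends only on the interface, never on a vendor."""
    return client.invoke(f"Answer briefly: {question}")


# Hypothetical registry: the provider is chosen by configuration alone.
PROVIDERS: dict[str, LLMClient] = {
    "provider_a": EchoClient(name="provider_a"),
    "provider_b": UpperClient(name="provider_b"),
}


def build_client(config: dict) -> LLMClient:
    """Resolve a client from declarative config (assumed shape)."""
    return PROVIDERS[config["provider"]]


if __name__ == "__main__":
    for provider in ("provider_a", "provider_b"):
        client = build_client({"provider": provider})
        print(answer_question(client, "what is RAG?"))
```

Only the configuration value changes between the two runs; `answer_question` is untouched, which is the property the paragraph above describes.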

Current proof points

Five merged PRs cover the layers where the framework's less-abstraction mandate creates the most friction in practice:

  • PR #55 added async Bedrock support (a_invoke and a_stream_invoke via aioboto3) to unblock async-first application patterns.
  • PR #67 fixed IngestionPipeline.run() so that it actually accepts a list of file paths, as its documented signature promised.
  • PR #77 added a metadata parameter to AzureParser and DoclingParser, with runtime type validation, so pipelines can pass context through the ingestion layer without hitting a TypeError.
  • PR #97 introduced an async context manager on MCPClient for persistent sessions, which matters for any stateful MCP server that maintains database connections or authentication state across tool calls.
  • PR #106 refactored FastEmbedder to use fastembed's native batch processing and asyncio.to_thread(), removing duplicated logic and fixing the async path so that blocking I/O no longer runs on the event loop.
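The async fix described for PR #106 rests on a standard-library pattern: hand the blocking batch call to a worker thread with asyncio.to_thread() so the event loop stays free. The sketch below illustrates that pattern only. It does not reproduce the FastEmbedder code; `embed_batch_blocking` and `a_embed` are hypothetical stand-ins, and the "embedding" is a toy computation.

```python
import asyncio
import contextlib
import time


def embed_batch_blocking(texts: list[str]) -> list[list[float]]:
    """Hypothetical stand-in for a blocking, batch-capable embedder."""
    time.sleep(0.05)  # simulate blocking work for the whole batch
    return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


async def a_embed(texts: list[str]) -> list[list[float]]:
    """Async wrapper: run the blocking batch call in a worker thread."""
    return await asyncio.to_thread(embed_batch_blocking, texts)


async def main() -> tuple[list[list[float]], int]:
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the event loop gets to run while embedding.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.005)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    vectors = await a_embed(["pizza", "margherita"])
    beat.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await beat
    return vectors, ticks


if __name__ == "__main__":
    vectors, ticks = asyncio.run(main())
    print(len(vectors), "vectors; heartbeat ticks while embedding:", ticks)
```

If `a_embed` called `embed_batch_blocking` directly instead of going through asyncio.to_thread(), the heartbeat would not tick at all during the embedding call, which is the symptom the PR is described as fixing.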