Case Study

DCE

Proactive data contract enforcement, built on dataprof's profiling core.

Pre-release engine · Rust · Data Contracts · Data Quality · Iceberg · YAML

System anatomy

  1. Inputs

    • contract.yml files
    • Iceberg tables
    • Sample dataset slices
    • CI / orchestration triggers
  2. Core

    • Contract validation engine
    • Format-aware checks
    • dataprof profiler core
    • CLI commands (validate / init / check)
  3. Outputs

    • Pass/fail exit code
    • Structured violation reports
    • Schema + quality verdicts
    • Documented limitations
Constraints
  • Iceberg-first
  • Documented format gaps
  • Lightweight, no governance UI
  • CI-friendly
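
The output side of the anatomy above (structured violation reports collapsing into a pass/fail exit code) can be sketched in a few lines of Rust. This is a minimal illustration under stated assumptions, not DCE's actual code: the `Severity` and `Violation` types, the `exit_code` and `render_report` functions, and the rule that only error-level violations fail a run are all hypothetical.

```rust
// Hypothetical types for illustration; none of these names come from DCE itself.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Warning,
    Error,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Violation {
    pub check: String,  // which check produced the finding, e.g. "schema"
    pub detail: String, // human-readable description of what went wrong
    pub severity: Severity,
}

/// Collapse a list of violations into a process exit code.
/// Assumption: only Error-level violations fail the run; warnings are reported but pass.
pub fn exit_code(violations: &[Violation]) -> i32 {
    if violations.iter().any(|v| v.severity == Severity::Error) {
        1
    } else {
        0
    }
}

/// Render one line per violation so a CI log stays easy to search.
pub fn render_report(violations: &[Violation]) -> String {
    violations
        .iter()
        .map(|v| format!("[{:?}] {}: {}", v.severity, v.check, v.detail))
        .collect::<Vec<_>>()
        .join("\n")
}
```

A command-line wrapper could print `render_report(...)` and hand the result of `exit_code(...)` to `std::process::exit`, which is what lets a CI job stop on a failed contract without any extra tooling.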

Why it exists

Profiling is useful after the fact, but data platforms also need an explicit way to define what must be true before a pipeline is allowed to proceed. DCE extends the dataprof line of work toward enforcement: instead of only describing a dataset, the system should be able to fail a run when contracts covering schema, quality, or table semantics are violated.

Technical center

DCE grows out of dataprof by moving from observation to enforcement: contract definitions, validation commands, and table-format-aware checks become the center of the workflow. The technical challenge is keeping contracts expressive enough for real data platforms while keeping them practical to run from a CLI, a CI job, or a future orchestration layer, without turning the project into another heavyweight governance product.
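
To make the move from observation to enforcement concrete, here is a small Rust sketch of a contract check that compares declared column rules against profiled statistics. Everything in it is an assumption made for illustration: the `ColumnRule` and `ObservedColumn` structs, the `max_null_ratio` field, and the `validate` function are hypothetical and do not reflect DCE's contract schema or API. The type names `long`, `string`, and `timestamp` in the usage note are Iceberg primitive types, used here only as sample values.

```rust
use std::collections::HashMap;

// Hypothetical contract model; field names are illustrative, not DCE's schema.
#[derive(Debug, Clone)]
pub struct ColumnRule {
    pub name: String,
    pub data_type: String,   // expected type name, e.g. "long"
    pub max_null_ratio: f64, // highest tolerated share of nulls, 0.0 to 1.0
}

// What a profiler might report for one column (also hypothetical).
#[derive(Debug, Clone)]
pub struct ObservedColumn {
    pub data_type: String,
    pub null_ratio: f64,
}

/// Check every rule against the observed columns and collect violation messages.
/// An empty result means the contract holds for the rules given.
pub fn validate(
    rules: &[ColumnRule],
    observed: &HashMap<String, ObservedColumn>,
) -> Vec<String> {
    let mut violations = Vec::new();
    for rule in rules {
        match observed.get(&rule.name) {
            // Schema check: the column must exist at all.
            None => violations.push(format!("{}: column missing", rule.name)),
            Some(col) => {
                // Schema check: the declared type must match.
                if col.data_type != rule.data_type {
                    violations.push(format!(
                        "{}: expected type {}, found {}",
                        rule.name, rule.data_type, col.data_type
                    ));
                }
                // Quality check: the null ratio must stay within the limit.
                if col.null_ratio > rule.max_null_ratio {
                    violations.push(format!(
                        "{}: null ratio {:.2} exceeds {:.2}",
                        rule.name, col.null_ratio, rule.max_null_ratio
                    ));
                }
            }
        }
    }
    violations
}
```

With rules for `id` (`long`), `email` (`string`, at most 10% nulls), and `created_at` (`timestamp`), and observations where `email` is 25% null and `created_at` is absent, `validate` would return one quality violation and one schema violation. The profiler only supplies the observations; the contract decides whether they are acceptable.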

Current proof points

The shift from profiler to contract engine is already visible in the public repo: Iceberg-aware validation, explicitly documented limitations for non-Iceberg formats, 194 tests across packages, and a roadmap that spells out the next steps in SQL execution and multi-format validation. The useful signal is that limitations are documented alongside the implementation, which keeps the project honest about what can be enforced today and what still needs engineering work.