Case Study

Lakekeeper

An Iceberg REST catalog in Rust with vended credentials, OpenFGA, and a small operational footprint.

Open source contribution · Rust · Apache Iceberg · REST Catalog · OpenFGA · Lakehouse
Figure: Lakekeeper architecture overview

System anatomy

  1. Inputs

    • Iceberg REST requests
    • OIDC identities
    • Cloud storage credentials
    • OpenFGA authorization model
  2. Core

    • Rust REST catalog service
    • Postgres-backed metadata
    • Vended credentials issuer
    • CloudEvents emitter
  3. Outputs

    • Iceberg table metadata
    • Temporary storage credentials
    • Cross-engine query access
    • Audit + change events
Constraints
  • No JVM
  • Monthly release cadence
  • Multi-engine interoperability
  • Identity-aware by default
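
The path from inputs to outputs can be sketched in a few lines: an identity arrives with a request, an authorization check runs, and only then is a temporary storage credential issued. This is a minimal illustration and not Lakekeeper's code. The `Authorizer` type, the `can_read` relation, the `TemporaryCredential` fields, and the 900-second lifetime are all hypothetical, and a real deployment would query OpenFGA and a cloud token service instead of an in-memory set.

```rust
use std::collections::HashSet;

/// Hypothetical relation tuple, loosely modeled on the
/// (user, relation, object) shape used by OpenFGA-style systems.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct RelationTuple {
    user: String,
    relation: String,
    object: String,
}

/// In-memory stand-in for the authorization store (assumption:
/// a real catalog would call out to OpenFGA here).
struct Authorizer {
    tuples: HashSet<RelationTuple>,
}

impl Authorizer {
    fn new() -> Self {
        Self { tuples: HashSet::new() }
    }

    /// Record that `user` holds `relation` on `object`.
    fn grant(&mut self, user: &str, relation: &str, object: &str) {
        self.tuples.insert(RelationTuple {
            user: user.to_string(),
            relation: relation.to_string(),
            object: object.to_string(),
        });
    }

    /// Exact-match check; no inherited or computed relations in this sketch.
    fn check(&self, user: &str, relation: &str, object: &str) -> bool {
        self.tuples.contains(&RelationTuple {
            user: user.to_string(),
            relation: relation.to_string(),
            object: object.to_string(),
        })
    }
}

/// Illustrative short-lived credential scoped to one table location.
#[derive(Debug, PartialEq)]
struct TemporaryCredential {
    storage_prefix: String,
    expires_in_secs: u64,
}

#[derive(Debug, PartialEq)]
enum CatalogError {
    Forbidden,
}

/// Authorize first, then vend a credential limited to the table's location.
fn load_table(
    auth: &Authorizer,
    user: &str,
    table: &str,
    location: &str,
) -> Result<TemporaryCredential, CatalogError> {
    if !auth.check(user, "can_read", &format!("table:{table}")) {
        return Err(CatalogError::Forbidden);
    }
    Ok(TemporaryCredential {
        storage_prefix: location.to_string(),
        expires_in_secs: 900, // hypothetical lifetime
    })
}
```

With a grant for `user:alice` on `table:sales.orders`, `load_table` returns a credential scoped to that table's location. Any other user gets `CatalogError::Forbidden` and never sees a storage credential.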

Why it exists

Object storage and open table formats are not enough on their own. A practical lakehouse still needs a catalog that can manage metadata, enforce access, and stay open enough to work across engines and deployment models. Lakekeeper matters because it puts the control plane around Iceberg tables into a Rust-native service that can be reasoned about as infrastructure rather than as an opaque extension of a compute engine.

Technical center

Lakekeeper implements the Iceberg REST catalog specification in Rust. It pairs Postgres-backed metadata with OIDC authentication, OpenFGA authorization, vended cloud credentials, and CloudEvents, and it deploys without a JVM, which keeps its operational footprint smaller than that of JVM-based catalog stacks. The catalog is more than a lookup service: it is the point where identity, temporary storage access, namespace management, and cross-engine interoperability meet.
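
From the client side, the meeting point is a single request. The sketch below builds the load-table call from the Iceberg REST specification, including the `X-Iceberg-Access-Delegation: vended-credentials` header a client sends to ask for temporary storage access. The `LoadTableRequest` struct and `load_table_request` function are hypothetical helpers, the prefix and token values are placeholders, and the code assumes namespace and table names are already URL-safe.

```rust
/// Header defined by the Iceberg REST specification for requesting
/// delegated storage access from the catalog.
const ACCESS_DELEGATION_HEADER: &str = "X-Iceberg-Access-Delegation";

/// Hypothetical, transport-agnostic description of the HTTP GET to send.
struct LoadTableRequest {
    path: String,
    headers: Vec<(String, String)>,
}

/// Build the load-table request for `table` inside a (possibly nested) namespace.
fn load_table_request(
    prefix: &str,
    namespace: &[&str],
    table: &str,
    bearer_token: &str,
) -> LoadTableRequest {
    // Multi-level namespaces are joined with the unit separator (0x1F),
    // which appears URL-encoded as %1F in the request path.
    let ns = namespace.join("%1F");
    LoadTableRequest {
        path: format!("/v1/{prefix}/namespaces/{ns}/tables/{table}"),
        headers: vec![
            // OIDC access token presented as a bearer credential.
            ("Authorization".to_string(), format!("Bearer {bearer_token}")),
            // Ask the catalog to return temporary storage credentials.
            (
                ACCESS_DELEGATION_HEADER.to_string(),
                "vended-credentials".to_string(),
            ),
        ],
    }
}
```

Calling `load_table_request("wh", &["sales", "eu"], "orders", token)` yields the path `/v1/wh/namespaces/sales%1Feu/tables/orders`. Because identity and the delegation request travel together, the catalog can decide on metadata access and storage access in one place.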

Current proof points

The public footprint is concrete: a long-form architecture write-up, documented RisingWave interoperability, integrations with external engines and managed platforms, a monthly release cadence, and 2 merged PRs tracked in the repository README. The case study also serves as a catalog-level counterpart to the Rust data stack work: it shows where table metadata, governance, and engine access stop being abstract and become operational interfaces.