Data Development Kit (DDK)
The Data Development Kit (DDK) is a powerful code generation tool that automatically creates production-ready GraphQL servers with PostgreSQL and Redis integration from declarative schema definitions. It eliminates boilerplate by generating complete CRUD operations, resolvers, and database models.
Who is this for?
- Data engineers who need to provision a structured data backend for an OR application — define a schema, get a running GraphQL server
- Modelling teams who need a data layer for workflow inputs, outputs, or experiment results
- Backend engineers who want a generated, extensible Go GraphQL server without writing the boilerplate themselves
- Any OR developer building an application that needs a structured data backend
Platform Context
Why This System Exists
Every physical AI application needs structured data: a rail network's timetables and disruption records, a mine's equipment and shift data, a logistics operation's fleet and job assignments. Providing that data through a reliable, queryable API typically requires building a backend service — defining a data schema, writing a database layer, implementing CRUD operations, handling relationships, setting up migrations, and containerising the whole thing.
That work is largely the same regardless of the domain. The schema changes; the structural effort does not.
The DDK exists to eliminate this repeated cost. By describing a data model in a simple schema file, a team gets a complete, production-ready GraphQL server — with a database, full CRUD operations, relationship handling, and a Docker configuration — without writing any server code. This lets teams go from "we need a data backend" to a running, queryable service in a fraction of the time it would otherwise take.
What this enables in practice:
A transport team modelling a rail network can define their data entities (routes, trains, platforms, disruption events) in a schema file and get a live GraphQL API within minutes. They spend time on domain logic, not backend engineering.
A project team building an MDK workflow that needs to export results to a structured data store can spin up a DDK server as the target — no custom backend required. The MDK workflow connects to it directly via data connection tasks.
The MDK itself is built on the DDK. The Experiment Manager's entire data layer — workflows, experiments, studies, models, results — is a DDK-generated server, demonstrating the system's fitness for production use at platform scale.
Core Concepts and Principles
Schema definition drives everything — including the generation process itself
We chose schema-first for a reason that goes beyond developer experience: in the OR target environment, data models need to be auditable, version-controlled, and legible to people who are not backend engineers. A GraphQL SDL file in a git repository satisfies all three requirements. The alternative — code-first generation, where you write Go structs and derive the schema from annotations — produces a schema that is harder to read, harder to review in a pull request, and less useful as a communication artifact.

We also chose GraphQL over REST for the generated API surface. GraphQL's introspection capability is not incidental; it is how the FDK builds typed data connections at design time, and how the MDK's compatibility system validates that workflow task outputs and inputs are type-compatible before a trial runs. A REST API would require OpenAPI spec maintenance or manual connection configuration — GraphQL introspection makes it automatic.
A DDK server starts with a single schema file written in GraphQL SDL, annotated with DDK-specific directives. The @required directive marks which types should be persisted as database tables and which CRUD operations to generate. The @constraint directive adds database-level rules (primary keys, unique constraints, check constraints, default values). The @mapping directive defines relationships between types (one-to-one, one-to-many, many-to-many). From these declarations, the DDK generates a complete Go server using gqlgen for the GraphQL engine and GORM for database access — meaning the generated code is idiomatic, well-typed, and production-quality rather than templated glue code.

The Schema Guide is the full reference for all directives and their options, and real-world schema examples show how they combine to model complex domain data.

To see this approach at platform scale in production, the MDK Architecture page describes how the MDK's own Experiment Manager backend — storing workflows, experiments, models, and results — is itself a DDK-generated server. That is deliberate: we did not build the MDK on a hand-rolled backend and build the DDK separately. The MDK's backend is the DDK's production validation. Every constraint the DDK imposes, every pattern it enforces, and every limitation it carries, the MDK team operates within. If the DDK could not handle the complexity of the Experiment Manager's schema — 15+ entity types, complex relationships, custom resolvers layered on top of generated CRUD — it would not be production-grade for domain teams either. It handles it, and the MDK team's ability to regenerate the Experiment Manager's entire data layer from a schema change is proof.
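The three directives combine in a single schema file. A minimal sketch, using the directive syntax described above (the entity and field names here are hypothetical, not taken from the DDK's own examples):

```graphql
# Illustrative schema combining the three DDK directives.
# "Author" and "Book" are hypothetical entities.
type Author @required(type: "CREATE,READ,UPDATE,DELETE", table: "true") {
  id: ID! @constraint(type: "primarykey")
  name: String! @constraint(type: "unique")
  books: [Book]!
    @mapping(
      type: "one2many"
      foreignKey: "authorId"
      foreignKeyReference: "id"
    )
}

type Book @required(type: "CREATE,READ,UPDATE,DELETE", table: "true") {
  id: ID! @constraint(type: "primarykey")
  title: String!
  authorId: ID!
}
```

From this one file, the DDK would derive two database tables, their CRUD operations, and the foreign-key wiring for the one-to-many relationship.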
Custom logic survives regeneration — by design
The DDK separates generated code from custom business logic at the file level. Standard CRUD resolvers live in auto-generated *.resolvers.go files and are overwritten on each regeneration. Custom queries, mutations, and domain logic live in *.custom.resolvers.go files that the DDK always preserves. This separation means data models can evolve — add a field, change a relationship, introduce a new entity — by editing the schema and regenerating, without losing any custom logic that was built on top.
This is the problem that kills most code generation approaches. The tools that generate code you can modify freely work until you need to regenerate, at which point you either lose your changes or maintain a fork of the generator. The tools that do not let you modify generated code keep the generation clean but force you to write business logic in a separate layer that is awkward to compose with the generated API.

The DDK's file-level naming convention — *.resolvers.go is generated, *.custom.resolvers.go is yours — is not a complex mechanism, but it is the right one: simple to understand, safe to automate, and robust to schema evolution. It is paired with a bidirectional workspace migration that syncs generated and custom files in both directions, so a developer working in the workspace and a CI/CD pipeline working in the server directory always stay consistent. Getting this right required more iteration than the schema directives themselves.

Teams can extend DDK servers with arbitrarily complex behaviour while maintaining a clean, regenerable foundation. The Custom Resolvers page covers how to write these extensions — including how to reference generated types, write complex filtering logic, and add computed fields — and the Architecture page documents exactly what the DDK overwrites versus what it preserves during the regeneration process.
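On the schema side, a custom operation is declared with the @resolver directive; its Go implementation then lives in a *.custom.resolvers.go file that regeneration never touches. A hypothetical sketch (the query name and arguments are illustrative):

```graphql
# Declared in the schema; implemented by hand in *.custom.resolvers.go.
# "findOverdueOrders" and its cutoff argument are hypothetical.
extend type Query {
  findOverdueOrders(cutoff: String!): [Order]! @resolver(type: "CUSTOM")
}
```

Regenerating after a schema change rewrites the generated *.resolvers.go files but leaves the hand-written implementation of this query in place.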
Servers are provisioned on demand through the platform
The DDK is not a standalone CLI tool — it runs as a gRPC service within the OR SDK. When a team needs a new data server, they call the DDK plugin through the platform rather than running scripts locally. The SDK invokes the DDK's CreateServer or ReGenerateServer gRPC methods, passing the schema definition and server configuration. The DDK generates and wires up the server in-place. This means server provisioning is a platform operation — consistent, repeatable, and available to every OR user without local tooling setup. The Usage page covers the full provisioning workflow, and the Architecture page documents the gRPC interface in detail — the available methods, what parameters they accept, and how the code generation pipeline flows from schema to running service. The SDK API DDK component shows how the SDK API layer exposes this capability to the rest of the platform.
Consistent architecture across all data services
Every DDK-generated server shares the same architecture: Go + gqlgen for the GraphQL layer, GORM for database access, PostgreSQL for persistence, Redis for caching, and Docker for containerisation. This consistency means any developer familiar with one DDK server can navigate any other.
The Go choice is deliberate and worth explaining. Most backend code generation tools in the GraphQL space generate JavaScript/TypeScript (Apollo Server, Nexus) or Python (Strawberry, Ariadne). Go offers compile-time type safety that dynamic languages cannot, which matters when the code is generated: the DDK's generator can produce Go that fails to compile if the schema contains errors, catching mistakes before a server is deployed. Go's performance profile is also appropriate for data-serving workloads in an enterprise Kubernetes environment. The cost is ecosystem: Go is not where most data scientists live, which is why models run in Python or Julia and communicate with DDK servers over GraphQL rather than sharing a runtime.
The Architecture page documents what the consistent server structure looks like at the code level — the six gqlgen plugins, the generated file layout, and the Docker environment. This predictability is why the MDK can reliably connect to any DDK server via IMPORT_DDK and EXPORT_DDK workflow tasks.
Role in the Optimal Reality Platform
The DDK operates as a data backend generation layer within the Optimal Reality platform. Rather than occupying a fixed position in the runtime data flow, it functions as a developer-facing code generator that produces the data servers other platform components consume.
The DDK is responsible for:
- Schema-to-server generation — translating declarative schema definitions into complete Go GraphQL servers with PostgreSQL integration
- CRUD automation — eliminating hand-written data access boilerplate for any OR service that needs structured persistence
- Server scaffolding — generating the full service structure: ORM models, resolvers, database migrations, and Docker configuration
The DDK is used by:
- MDK — the MDK's backend (or-app-experiment-manager) is itself a DDK-generated server
- Application teams — any team needing a structured data backend for an OR application can use the DDK to provision it from a schema definition
How It Fits in the End-to-End Platform
```
Developer defines a schema (.graphqls file with DDK directives)
        ↓
DDK (gRPC plugin at port 7788, called via or-sdk-api)
        ↓ generates
Complete GraphQL Server (Go + gqlgen + GORM + PostgreSQL + Redis + Docker)
        ↓
Used as:
├── MDK backend — stores workflows, experiments, models, results
└── Domain data servers — user-created data layers for any domain
        ↓
Accessible from:
├── MDK workflows — via IMPORT_DDK / EXPORT_DDK task types
└── FDK applications — via data connections to DDK servers
```

The DDK is not a runtime service users interact with directly — it is a build-time tool that produces the servers they work with. Its output appears throughout the OR platform, most prominently as the foundation of the MDK itself.
The key difference from tools like Hasura or PostGraphile is ownership: those tools generate an API at runtime from an existing database schema — fast to start, but limited in custom business logic. The DDK generates source code your team owns and can extend arbitrarily. The regeneration step is a cost that pays for itself in control and extensibility.
Overview
The DDK enables rapid backend development by generating complete GraphQL servers with:
- Automatic CRUD Operations - Create, Read, Update, Delete for all types
- PostgreSQL Integration - Full database support with GORM ORM
- Redis Caching - Built-in caching layer for performance
- Relationship Mapping - One-to-one, one-to-many, many-to-many, back-references
- Custom Resolvers - Extend with custom business logic
- Schema Validation - Validate before generation
- Constraints - Primary keys, unique, check, default values
- Docker Environment - Complete containerized setup
Quick Start
1. Define Your Schema
Create a .graphqls file with your data model:
```graphql
type User @required(type: "CREATE,READ,UPDATE,DELETE", table: "true") {
  id: ID! @constraint(type: "primarykey")
  name: String!
  email: String! @constraint(type: "unique")
}
```

2. Create Your Server
Use the SDK API to call the DDK service (via gRPC on port 7788):
- Provide server configuration (name, port, paths)
- Database connection details (PostgreSQL)
- Your schema definition
3. Generated Output
The DDK generates:
- Complete GraphQL server in Go
- Database models with GORM
- CRUD resolvers for all operations
- Docker Compose setup
- Database migrations
4. Start and Test
- Start your server
- Access GraphQL Playground
- Test your API with auto-generated queries and mutations
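Once the Playground is up, the generated operations can be exercised directly. Exact operation and argument names depend on the generated schema, so treat the following as a plausible shape for the User type above rather than a guaranteed API:

```graphql
# Hypothetical generated operations for the User type.
mutation {
  createUser(name: "Ada Lovelace", email: "ada@example.com") {
    id
  }
}

query {
  getUser(id: "1") {
    name
    email
  }
}
```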
Key Features
Declarative Schema Definition
Define data models using GraphQL SDL with custom directives:
```graphql
type Product @required(type: "CREATE,READ,UPDATE,DELETE", table: "true") {
  id: ID! @constraint(type: "primarykey")
  name: String!
  price: Float! @constraint(type: "check", value: "price > 0")
  inStock: Boolean @constraint(type: "default", value: "true")
}
```

Automatic CRUD Operations
For each type, the DDK generates:
- Queries: `getProduct(id: ID!)`, `listProducts(...)`
- Mutations: `createProduct(...)`, `updateProduct(...)`, `deleteProduct(id: ID!)`
- Field filtering, sorting, and pagination
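For the Product type, the generated list operation might be queried like this (the pagination argument names are illustrative; the generated schema's actual arguments can be discovered via GraphQL introspection):

```graphql
# Hypothetical call to a generated list query with pagination.
query {
  listProducts(limit: 10, offset: 0) {
    id
    name
    price
    inStock
  }
}
```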
Relationship Support
Define complex relationships declaratively:
```graphql
type User @required(type: "CREATE,READ,UPDATE,DELETE", table: "true") {
  id: ID! @constraint(type: "primarykey")
  posts: [Post]!
    @mapping(
      type: "one2many"
      foreignKey: "authorId"
      foreignKeyReference: "id"
    )
}
```

Custom Business Logic
Extend generated code with custom resolvers:
```graphql
extend type Query {
  getUserByEmail(email: String!): User! @resolver(type: "CUSTOM")
}
```

Built With
- gqlgen - Go GraphQL server generation
- GORM - Go ORM for database operations
- PostgreSQL - Relational database
- Redis - In-memory caching
- gRPC - Service communication
- Docker - Containerization
Go deeper
- Architecture — How the DDK generates servers from schemas
- Schema Guide — Complete guide to writing DDK schemas with directives and relationships
- Custom Resolvers — Extending generated code with custom business logic
- Usage — How to use the DDK service to create and manage servers
- Examples — Real-world schema examples and patterns
- Frontend — Visual interface for schema design and server management
- FAQs — Frequently asked questions
Connect to the rest of the platform:
- MDK — The MDK's backend is a DDK-generated server; IMPORT_DDK and EXPORT_DDK connect to DDK servers
