MDK Architecture
Overview
The MDK (Modelling Development Kit) is built on a modular architecture that orchestrates computational workflows across independent models, operators, and data connections. The system is designed to coordinate multi-language, multi-service workflows while keeping the complexity of integration hidden from users.
Core Components
The MDK consists of several key components working together:
1. Experiment Manager
The central API server that manages all MDK entities — workflows, models, experiments, and trials. It provides a GraphQL API for frontend clients and coordinates all workflow execution activities.
Responsibilities:
- Store and version workflow definitions
- Manage experiment and trial configurations
- Coordinate workflow execution through the WEM
- Track model registry and discovery
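Frontend clients talk to the Experiment Manager over its GraphQL API. As a minimal sketch, the helper below serialises a GraphQL operation into the JSON body such an endpoint expects; the query, field names, and arguments shown are illustrative assumptions, not the actual schema.

```python
import json

def build_graphql_request(query: str, variables: dict = None) -> bytes:
    """Serialise a GraphQL operation into the JSON request body
    a GraphQL endpoint such as the Experiment Manager expects."""
    return json.dumps({"query": query, "variables": variables or {}}).encode("utf-8")

# Hypothetical query: field and argument names are illustrative only.
LIST_WORKFLOWS = """
query ListWorkflows($limit: Int!) {
  workflows(limit: $limit) { id name version }
}
"""

body = build_graphql_request(LIST_WORKFLOWS, {"limit": 10})
```

The resulting bytes would be POSTed to the Experiment Manager's GraphQL endpoint with a `Content-Type: application/json` header.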
2. Workflow Execution Manager (WEM)
The runtime engine that orchestrates workflow execution. The WEM handles task scheduling, dependency resolution, state management, and output coordination.
See the Workflow Execution Manager page for detailed information about how workflows are executed.
3. Data Abstraction Layer (DAL)
A data transfer system that enables seamless communication between models written in different languages. The DAL provides multiple transfer modes optimized for different data sizes and use cases.
See the Data Flow & Transformations page for detailed information about how data moves between tasks.
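The idea of size-dependent transfer modes can be sketched as a simple dispatch function. The thresholds and mode names below are illustrative assumptions, not the DAL's actual configuration:

```python
def choose_transfer_mode(payload_bytes: int) -> str:
    """Pick a transfer mode based on payload size.
    Thresholds and mode names are assumptions for illustration."""
    if payload_bytes <= 64 * 1024:          # small: pass inline with the task message
        return "inline"
    if payload_bytes <= 64 * 1024 * 1024:   # medium: shared cache entry
        return "cache"
    return "file"                            # large: file-system handoff
```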
4. Model Registry
A discovery and registration system for model microservices. When model containers are deployed, they register themselves with the Experiment Manager, making their capabilities available to workflows.
Capabilities:
- Automatic discovery of deployed models
- Schema introspection from OpenAPI/Swagger definitions
- Version tracking and model metadata storage
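Schema introspection amounts to walking a model's OpenAPI document and recording what it can do. A simplified sketch, assuming a standard OpenAPI layout (the field names the registry actually stores may differ):

```python
def extract_capabilities(openapi_spec: dict) -> dict:
    """Pull a model's name, version, and invocable operations out of
    its OpenAPI document, as a registry might during introspection."""
    info = openapi_spec.get("info", {})
    operations = []
    for path, methods in openapi_spec.get("paths", {}).items():
        for method, op in methods.items():
            operations.append({
                "path": path,
                "method": method.upper(),
                "operation_id": op.get("operationId"),
            })
    return {"name": info.get("title"),
            "version": info.get("version"),
            "operations": operations}

# Hypothetical spec for a model container.
spec = {"info": {"title": "flow-model", "version": "1.2.0"},
        "paths": {"/run": {"post": {"operationId": "run"}}}}
caps = extract_capabilities(spec)
```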
5. Federation Layer
A unified GraphQL API layer that federates the Experiment Manager with user-created data servers. This allows frontend applications to query both workflow data and domain data through a single endpoint.
Purpose:
- Unified query interface for complex applications
- Seamless integration between MDK and DDK-generated servers
- Simplified frontend development
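The essence of federation is routing each top-level field to the sub-service that owns it and merging the results into one response. A toy sketch (the field and service names are invented for illustration):

```python
def federate(query_fields, services):
    """Resolve each requested top-level field against whichever
    sub-service owns it, merging results into a single response."""
    result = {}
    for field in query_fields:
        for owned_fields, resolve in services:
            if field in owned_fields:
                result[field] = resolve(field)
                break
    return result

# Two stand-in sub-services: workflow data vs. domain data.
services = [
    ({"experiments", "trials"}, lambda f: f"<from Experiment Manager: {f}>"),
    ({"railNetworks"}, lambda f: f"<from DDK data server: {f}>"),
]
response = federate(["experiments", "railNetworks"], services)
```

A real federation gateway does this at the GraphQL schema level, but the routing principle is the same: one query, many owners, one merged response.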
How Components Work Together
The MDK architecture follows a clear data flow:
User Interface
↓
Federation Layer
↓
Experiment Manager (GraphQL API)
↓
Workflow Execution Manager (WEM)
↓
┌─────────────────────────────────────────┐
│  Model Microservices                    │
│  - Python models (FastAPI)              │
│  - Julia models (Oxygen.jl)             │
│  - Go models                            │
│  - Custom operators                     │
│  - Data connections (DDK integration)   │
└─────────────────────────────────────────┘
↓
Data Abstraction Layer (DAL)
↓
Persistent Storage (Database + Cache)
Workflow Execution Flow
- User defines a workflow using the visual workflow builder
- Workflow is stored with versioning in the Experiment Manager
- User creates an experiment with specific parameter values
- Execution is triggered and the WEM takes over:
  - Builds the task dependency graph
  - Identifies execution-ready tasks
  - Schedules tasks based on priority
  - Invokes models through their HTTP endpoints
  - Coordinates data transfer via the DAL
  - Tracks execution state and captures outputs
- Results are stored for comparison across trials
- User reviews outcomes and iterates
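The dependency-resolution and scheduling steps above can be sketched as a topological sort that groups tasks into execution-ready waves. This is a simplified model of the WEM's behaviour, ignoring priorities and failures:

```python
from collections import defaultdict

def execution_waves(tasks, deps):
    """Resolve a task dependency graph into waves of execution-ready
    tasks: each wave contains tasks whose upstreams have all finished."""
    indegree = {t: 0 for t in tasks}
    dependents = defaultdict(list)
    for task, upstreams in deps.items():
        for up in upstreams:
            indegree[task] += 1
            dependents[up].append(task)
    waves = []
    ready = sorted(t for t, d in indegree.items() if d == 0)
    while ready:
        waves.append(ready)
        next_ready = []
        for done in ready:
            for down in dependents[done]:
                indegree[down] -= 1
                if indegree[down] == 0:
                    next_ready.append(down)
        ready = sorted(next_ready)
    return waves
```

For example, if task `c` depends on `a` and `b`, and `d` depends on `c`, then `a` and `b` can run concurrently in the first wave, followed by `c`, then `d`.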
Model Integration Flow
- Model is developed in Python, Julia, or Go
- Model container is deployed to the platform
- Experiment Manager discovers the model through its OpenAPI schema
- Model is registered and becomes available in the workflow builder
- Model can be used as a task in any workflow
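The registration step can be pictured as an in-memory registry that containers announce themselves to and the workflow builder lists from. The class and method names here are invented for illustration:

```python
class ModelRegistry:
    """Sketch of the registration step: model containers announce
    themselves with their OpenAPI spec; the builder lists what exists."""

    def __init__(self):
        self._models = {}

    def register(self, url: str, openapi_spec: dict):
        """Record a model under its (name, version) key."""
        info = openapi_spec.get("info", {})
        key = (info.get("title"), info.get("version"))
        self._models[key] = {"url": url, "schema": openapi_spec}
        return key

    def available(self):
        """Model names the workflow builder can offer as tasks."""
        return sorted({name for name, _version in self._models})

registry = ModelRegistry()
registry.register("http://flow-model:8000",
                  {"info": {"title": "flow-model", "version": "1.0"}})
```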
Data Flow Between Tasks
- Task A completes and produces output
- Output is written to the appropriate DAL location (file system, cache, or database)
- WEM identifies which downstream tasks depend on Task A's output
- Input mappings are resolved by matching output field names to input field names
- Task B reads the output through the DAL
- Execution continues through the dependency chain
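The name-matching step in the flow above reduces to a dictionary intersection: fields Task B declares that Task A's output actually contains get wired through, and unmatched fields are left for other sources or defaults. A minimal sketch:

```python
def resolve_input_mapping(upstream_output: dict, input_fields) -> dict:
    """Match a downstream task's declared input fields against an
    upstream task's output by field name (simplified)."""
    return {field: upstream_output[field]
            for field in input_fields
            if field in upstream_output}

mapped = resolve_input_mapping({"price": 3, "qty": 2}, ["qty", "discount"])
```

Here `qty` matches and is wired through, while `discount` is absent from the upstream output and stays unresolved.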
Design Principles
Schema-First Design
The Experiment Manager is generated from declarative GraphQL schemas. This means the data model is defined first, and the server code is generated automatically. Changes to the schema trigger regeneration, keeping the API consistent with the data model.
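Schema-first generation starts by reading the declarative schema itself. As a toy illustration of that first step (a real generator would use a proper GraphQL parser, and the `Trial` fields shown are assumptions), this extracts one object type's fields from an SDL string:

```python
import re

SCHEMA = """
type Trial {
  id: ID!
  status: String
  startedAt: String
}
"""

def parse_fields(schema: str, type_name: str) -> dict:
    """Read the field names and types of one object type out of a
    GraphQL SDL string — the raw material for code generation."""
    block = re.search(rf"type {type_name} {{(.*?)}}", schema, re.S)
    fields = {}
    for line in block.group(1).strip().splitlines():
        name, gql_type = line.strip().split(":")
        fields[name.strip()] = gql_type.strip()
    return fields
```

From a field map like this, a generator can emit storage models and resolvers, which is why editing the schema and regenerating keeps the API and data model in lockstep.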
Microservice Orchestration
Unlike traditional workflow engines designed for homogeneous compute environments, the MDK is built for heterogeneous microservice orchestration. Models can be written in any language, run in separate containers, and communicate through a standardized HTTP contract.
Federation for Unified Access
The federation layer provides a single GraphQL endpoint that spans multiple data sources. This architectural choice enables complex frontend applications that need to query both workflow data (experiments, trials, results) and domain data (rail networks, logistics fleets, etc.) without managing multiple API connections.
Separation of Concerns
- Experiment Manager handles data persistence and API serving
- WEM handles execution orchestration and scheduling
- DAL handles cross-language data transfer
- Model containers handle domain-specific computation
- PyRunner handles custom Python operator execution in isolation
This separation means each component can evolve independently, and failures in one area don't cascade through the system.
Deployment Model
The MDK is deployed as a collection of services:
- Experiment Manager — Core API server
- Database — Persistent storage for workflows, experiments, trials, and models
- Cache — Fast storage for DAL data transfer and result caching
- Federation Gateway — Unified GraphQL endpoint
- PyRunner — Python operator execution service
- Model Collections — Individual model containers deployed on demand
All services run in a container orchestration environment, making the MDK portable across different infrastructure providers.
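As a hedged sketch of what that service collection might look like in a Compose-style manifest (every service name, image name, and dependency edge below is an assumption for illustration, not the actual deployment configuration):

```yaml
# Illustrative only — names and images are assumptions.
services:
  experiment-manager:
    image: mdk/experiment-manager   # core API server
    depends_on: [database, cache]
  federation-gateway:
    image: mdk/federation-gateway   # unified GraphQL endpoint
    depends_on: [experiment-manager]
  pyrunner:
    image: mdk/pyrunner             # isolated Python operator execution
  database:
    image: mdk/database             # workflows, experiments, trials, models
  cache:
    image: mdk/cache                # DAL transfer and result caching
```

Model collections are deployed on demand as additional containers rather than being listed statically here.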
See Also
- Workflow Execution Manager — Deep dive into how workflows are orchestrated
- Data Flow & Transformations — How the DAL enables cross-language data communication
- Component Types — The four task execution modes (MODEL, OPERATOR, IMPORT_DDK, EXPORT_DDK)
- Data Schema — The entity hierarchy (Study → Experiment → Trial) and data model
