MDK Data Schema
Overview
The Experiment Manager's data schema is defined using GraphQL SDL with DDK directives and is split across three schema files. The schema drives both the PostgreSQL database structure (via GORM auto-migration) and the GraphQL API surface.
- ORM Schema: `server/graph/schema/orm.graphqls` — Types mapped to database tables
- DTO Schema: `server/graph/schema/dto.graphqls` — Data transfer objects for custom resolvers
- Custom Schema: `server/graph/schema/custom.graphqls` — Custom queries, mutations, and subscriptions
Schema Layers
ORM (Object-Relational Mapping)
Types annotated with table:"true" are mapped directly to PostgreSQL tables by GORM. These form the persistent data model:
```
Workflow > WorkflowTask > ModelConfig > Model > ModelConfigInput + Field
                                              > ModelConfigOutput + Field
                                              > ModelRegistryVersion > ModelRegistryKey
```

DDK directives used:

- `@required(type: "CREATE,READ,UPDATE,DELETE", table: "true")` — Generates CRUD resolvers and creates a DB table
- `@constraint(type: "primarykey")` — Marks the primary key column
- `@constraint(type: "unique")` — Adds a unique constraint
- `@mapping(type: "one2many", ...)` — Defines one-to-many relationships
- `@mapping(type: "backRef", ...)` — Defines back-reference (belongs-to) relationships
DTO (Data Transfer Objects)
Types annotated with table: "false" are not persisted — they exist only as GraphQL types returned by custom resolvers:
- `WorkflowExecuteResponse` — Response from workflow/experiment execution
- `WorkflowState` / `WorkflowTaskState` — Runtime execution state
- `WorkflowTaskOutput` — Task output data and visualisation metadata
- `CompatibilityResult` — Input/output compatibility check results
- `FunctionTemplate` — Generated code for operators and data connections
- `WorkflowAlert` / `ModelAlert` — Validation alerts for workflows and models
Custom (Resolvers)
The Query, Mutation, and Subscription types are extended with the @resolver(type:"CUSTOM") directive. These contain all business logic beyond the generated CRUD operations — workflow execution, DAG generation, operator management, import/export, versioning, and deployment.
OR Types System
All OR model services communicate using a standardised HTTP contract. Understanding this contract is essential for working with the MDK schema.
Core Request/Response Types
ORHttpRequest
Every request to an OR model follows this structure:
```python
# Python
class ORHttpRequest(BaseModel, Generic[ModelInput, ModelParameters]):
    version: Union[None, str]
    input: Union[ModelInput, List[DataInput]]
    params: ModelParameters
    output_path: OROperatorOutputPath
```

```julia
# Julia
struct ORHttpRequest{I,P}
    version::Union{Nothing,String}
    input::I
    params::P
    output_path::OROperatorOutputPath
end
```

| Field | Type | Description |
|---|---|---|
| version | string (optional) | API version for compatibility tracking |
| input | ModelInput or List[DataInput] | The primary data the model processes |
| params | ModelParameters | Configuration/tuning parameters |
| output_path | OROperatorOutputPath | Where to store results (workflow ID, storage backend) |
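Putting the request fields together, a minimal sketch of constructing an `ORHttpRequest` — using stdlib dataclasses rather than Pydantic to stay dependency-free, and with a hypothetical `OROperatorOutputPath` shape (workflow ID plus storage backend, as the table above suggests):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class HelloWorldInputs:          # input type from the HelloWorld example below
    name: str = "John Doe"
    age: int = 30

@dataclass
class HelloWorldParameters:      # params type from the HelloWorld example below
    statement: str = "Hello there"
    exclamation: bool = True

@dataclass
class OROperatorOutputPath:      # hypothetical shape — fields assumed from the table above
    workflow_id: str
    storage_backend: str = "redis"

@dataclass
class ORHttpRequest:
    input: HelloWorldInputs
    params: HelloWorldParameters
    output_path: OROperatorOutputPath
    version: Optional[str] = None    # optional API version for compatibility tracking

# Input carries the per-request data; params lean on their defaults.
req = ORHttpRequest(
    input=HelloWorldInputs(name="Jane", age=25),
    params=HelloWorldParameters(),
    output_path=OROperatorOutputPath(workflow_id="wf-123"),
)
```

In the real services these are Pydantic generic models, which additionally give JSON (de)serialisation and validation for free.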
ORHttpResponse
Every response from an OR model follows this structure:
```python
# Python
class ORHttpResponse(BaseModel, Generic[ModelOutput]):
    output: Union[ModelOutput, List[DataInput]]
    metadata: Optional[Dict[str, Union[str, float]]] = None
    logs: Optional[List[str]] = None
```

| Field | Type | Description |
|---|---|---|
| output | ModelOutput or List[DataInput] | Results from the model |
| metadata | dict (optional) | Runtime metrics (execution time, version, etc.) |
| logs | list of strings (optional) | Debug/info logs from execution |
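A sketch of a model handler producing this response shape, again with dataclasses instead of Pydantic; the metadata keys (`exec_time_s`, `version`) are illustrative examples of runtime metrics, not a fixed contract:

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

@dataclass
class HelloWorldResponse:        # output type from the HelloWorld example below
    greeting: str
    when: int

@dataclass
class ORHttpResponse:
    output: HelloWorldResponse
    metadata: Optional[Dict[str, Union[str, float]]] = None  # runtime metrics
    logs: Optional[List[str]] = None                         # debug/info logs

def run_model(name: str, age: int, statement: str) -> ORHttpResponse:
    start = time.perf_counter()
    greeting = f"{statement} {name}! You are {age} years old"
    return ORHttpResponse(
        output=HelloWorldResponse(greeting=greeting, when=1),
        metadata={"exec_time_s": time.perf_counter() - start, "version": "1.0.0"},
        logs=["model executed"],
    )

resp = run_model("John Doe", 30, "Hello there")
```

Downstream consumers read `resp.output` for results, and can surface `resp.metadata` and `resp.logs` in the UI without inspecting the payload.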
The Three-Type Pattern: Input, Params, Output
Every OR model endpoint defines three custom types:
Input (I) + Parameters (P) → [Model Logic] → Output (O)

Input Types
Purpose: The data your model processes — domain objects your algorithm operates on.
Two forms:
- Direct Input — Plain structs with data fields (for testing, simple workflows, same-language services)
- DAL Input —
List[DataInput]for cross-language communication via the Data Abstraction Layer
```python
class HelloWorldInputs(BaseModel):
    name: str = "John Doe"
    age: int = 30
```

Params Types
Purpose: Configuration that controls how the model behaves — tuning knobs, not the data being processed.
Key distinction: Inputs = "what to process" (changes every request), Params = "how to process it" (configuration, often stable).
```python
class HelloWorldParameters(BaseModel):
    statement: str = "Hello there"
    exclamation: bool = True
```

Best practice: Always provide sensible defaults so params are optional.
Output Types
Purpose: Results the model produces, consumed by downstream tasks or users.
Compatibility: If the output type is equivalent to an input type (matching field names and types), it can be passed directly as an output dependency to the matching model/task.
```python
class HelloWorldResponse(BaseModel):
    greeting: str = "Hello there John Doe! You are 30 years old"
    when: int = 1
```

Params vs Input in Swagger
The Swagger/OpenAPI schema must clearly separate data inputs from params so the MDK can introspect and consume them for workflow use:
```julia
struct RoadSolveRequest
    params::RoadSolveBlobParams                   # Configuration parameters
    inputs::Union{RoadSolveBlobInput, FileInput}  # Data inputs (user or DAL)
    outputPath::String                            # BE only, ignored by UI
    tempPath::String                              # BE only, ignored by UI
end
```

The `inputs` field is a union: it can be a user-provided swagger type (for manual/default input via the UI) or a `FileInput` type (when data comes from a preceding task via the DAL).
Summary
The OR HTTP contract standardises model communication through:
- ORHttpRequest — Wraps `input` (data), `params` (config), and `output_path` (storage)
- ORHttpResponse — Wraps `output` (results), `metadata` (metrics), and `logs` (debug)
- Three-Type Pattern — Every endpoint defines Input, Params, and Output types
- DAL Integration — Seamless cross-language data exchange via `List[DataInput]`
Schema Hierarchy
The MDK schema follows a hierarchical structure:
```
Study
└── Experiment(s)
    ├── ExperimentTaskConfig(s)      ← per-task config overrides
    ├── Trial(s)                     ← individual execution runs
    │   ├── TrialTaskValue(s)        ← snapshotted config per task
    │   └── TrialTaskOutput(s)       ← output data per task
    └── Workflow
        └── WorkflowTask(s)
            ├── ModelConfig
            │   └── Model
            │       ├── ModelConfigInput(s)
            │       │   └── ModelConfigInputField(s)
            │       └── ModelConfigOutput(s)
            │           └── ModelConfigOutputField(s)
            ├── Function (optional)
            └── WorkflowTaskDependency(ies)
```

This hierarchy enables:
- Experimentation — Studies contain multiple experiments with different configurations
- Reusability — Workflows can be versioned and reused across experiments, or executed in isolation
- Modularity — Tasks are independently defined with clear inputs/outputs
- Traceability — Dependencies track data flow between tasks
Core Schema Components
1. Study and Experiment
Study
A container for related experiments that share a common research goal.
```graphql
type Study @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  id: ID! @constraint(type: "primarykey")
  name: String!
  description: String
  createdBy: String!
  createdAt: Int!
  experiments: [Experiment]! @mapping(type:"one2many", foreignKey:"studyId", foreignKeyReference:"id")
}
```

| Field | Type | Description |
|---|---|---|
| id | ID | Unique identifier |
| name | String | Human-readable name |
| createdBy | String | Creator identifier |
| experiments | [Experiment] | One-to-many relationship to experiments |
Purpose: Organise related experiments under a common research goal.
Experiment
A configured experiment that references a workflow and contains trials (individual execution runs). Experiments are read-only via auto-generated resolvers — creation and management is handled through custom resolvers.
```graphql
type Experiment @required(type:"READ", table:"true") {
  id: ID! @constraint(type: "primarykey")
  name: String!
  description: String
  properties: JSON
  studyId: String!
  workflowId: String!
  createdAt: Int!
  study: Study! @mapping(type:"backRef", foreignKey:"id", foreignKeyReference:"studyId")
  workflow: Workflow! @mapping(type:"backRef", foreignKey:"id", foreignKeyReference:"workflowId")
  trials: [Trial]! @mapping(type:"one2many", foreignKey:"experimentId", foreignKeyReference:"id")
  taskConfigs: [ExperimentTaskConfig]! @mapping(type:"one2many", foreignKey:"experimentId", foreignKeyReference:"id")
}
```

| Field | Type | Description |
|---|---|---|
| properties | JSON | Experiment configuration properties (e.g., experimentType: MANUAL) |
| studyId | String | References the parent study |
| workflowId | String | References the workflow to execute |
| trials | [Trial] | Individual execution runs under this experiment |
| taskConfigs | [ExperimentTaskConfig] | Per-task configuration overrides for this experiment |
Note
The @required(type:"READ") directive means only auto-generated read resolvers exist. Experiment creation, duplication, and execution are handled by custom resolvers (executeExperiment, duplicateExperiment).
ExperimentTaskConfig
Per-experiment overrides for individual task configurations. These define the experiment-level configValue that takes precedence over the task and model defaults.
```graphql
type ExperimentTaskConfig @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  experimentId: ID! @constraint(type: "primarykey")
  workflowTaskId: ID! @constraint(type: "primarykey")
  configValue: JSON
  experiment: Experiment! @mapping(type:"backRef", foreignKey:"id", foreignKeyReference:"experimentId")
  workflowTask: WorkflowTask! @mapping(type:"backRef", foreignKey:"id", foreignKeyReference:"workflowTaskId")
}
```

| Field | Type | Description |
|---|---|---|
| experimentId | ID | Part of composite primary key — references the experiment |
| workflowTaskId | ID | Part of composite primary key — references the workflow task |
| configValue | JSON | Experiment-specific configuration override for this task |
Configuration Precedence (highest to lowest):
1. `ExperimentTaskConfig.configValue` — Experiment-specific override
2. `WorkflowTask.configValue` — Task-level configuration
3. `ModelConfig.configValue` — Default model configuration
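The precedence chain above can be sketched as a first-non-null lookup. This assumes whole-value precedence (the resolver picks the first config that exists rather than merging fields — a simplification of whatever merge semantics the real resolver uses):

```python
from typing import Optional

def resolve_config(
    experiment_override: Optional[dict],  # ExperimentTaskConfig.configValue
    task_config: Optional[dict],          # WorkflowTask.configValue
    model_default: Optional[dict],        # ModelConfig.configValue
) -> Optional[dict]:
    """Return the first non-None config, checking highest precedence first."""
    for candidate in (experiment_override, task_config, model_default):
        if candidate is not None:
            return candidate
    return None

# Experiment override wins over both task config and model default.
resolved = resolve_config({"n": 9}, {"n": 5}, {"n": 1})
```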
Trial
An individual execution run within an experiment. Each trial captures a snapshot of the configuration at execution time and tracks its own status and outputs independently.
```graphql
enum TrialStatus {
  PENDING
  RUNNING
  COMPLETED
  FAILED
  EXPIRED
  PROMOTED
}

type Trial @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  id: ID! @constraint(type: "primarykey")
  experimentId: String!
  name: String
  status: TrialStatus
  outputSummary: JSON
  promoted: Boolean!
  createdAt: Int!
  executedAt: Int
  completedAt: Int
  experiment: Experiment! @mapping(type:"backRef", foreignKey:"id", foreignKeyReference:"experimentId")
  trialTaskValues: [TrialTaskValue]! @mapping(type:"one2many", foreignKey:"trialId", foreignKeyReference:"id")
  trialOutputs: [TrialTaskOutput]! @mapping(type:"one2many", foreignKey:"trialId", foreignKeyReference:"id")
}
```

| Field | Type | Description |
|---|---|---|
| experimentId | String | References the parent experiment |
| status | TrialStatus | PENDING, RUNNING, COMPLETED, FAILED, EXPIRED, or PROMOTED |
| outputSummary | JSON | Aggregated summary of trial outputs |
| promoted | Boolean | Whether this trial has been promoted (selected as best result) |
| executedAt | Int | Unix timestamp when execution began |
| completedAt | Int | Unix timestamp when execution finished |
| trialTaskValues | [TrialTaskValue] | Snapshotted config values used for this trial run |
| trialOutputs | [TrialTaskOutput] | Per-task output data from this trial |
Trial Lifecycle:
- Trial created with `PENDING` status (auto-created by `executeExperiment` or pre-created)
- On execution: status → `RUNNING`, `executedAt` set
- Config values snapshotted into `TrialTaskValue` records (immutable after creation)
- On completion: status → `COMPLETED` or `FAILED`, `completedAt` set
- Optionally promoted via the `promoted` flag
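The lifecycle above implies a small state machine. A sketch of a transition guard — note that only `PENDING → RUNNING → COMPLETED/FAILED` is stated in this document; when exactly `EXPIRED` and `PROMOTED` apply is an assumption here:

```python
# Assumed transition table: PENDING→RUNNING→COMPLETED/FAILED comes from the
# docs; the EXPIRED and PROMOTED edges are guesses for illustration only.
VALID_TRANSITIONS = {
    "PENDING": {"RUNNING", "EXPIRED"},
    "RUNNING": {"COMPLETED", "FAILED"},
    "COMPLETED": {"PROMOTED"},
}

def transition(current: str, new: str) -> str:
    """Return the new status, or raise if the transition is not allowed."""
    if new not in VALID_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal trial transition {current} -> {new}")
    return new
```

A guard like this keeps trials from, say, jumping straight from `PENDING` to `COMPLETED` without an execution record.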
TrialTaskValue
A snapshot of the configuration used for a specific task in a specific trial. Created at trial execution time and immutable afterward — this ensures reproducibility even if the experiment or task configs change later.
```graphql
type TrialTaskValue @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  workflowTaskId: ID! @constraint(type: "primarykey")
  trialId: ID! @constraint(type: "primarykey")
  configValue: JSON
  trial: Trial! @mapping(type:"backRef", foreignKey:"id", foreignKeyReference:"trialId")
  workflowTask: WorkflowTask! @mapping(type:"backRef", foreignKey:"id", foreignKeyReference:"workflowTaskId")
}
```

Snapshot Logic (from the executeExperiment resolver):

```
For each WorkflowTask in the workflow:
1. Check ExperimentTaskConfig override → highest priority
2. Fall back to WorkflowTask.configValue
3. Fall back to ModelConfig.configValue
→ Persist the resolved value as a TrialTaskValue
```

TrialTaskOutput
Stores the output data from a task execution within a specific trial.
```graphql
type TrialTaskOutput @required(type:"CREATE,UPDATE,DELETE", table:"true") {
  workflowTaskId: ID! @constraint(type: "primarykey")
  trialId: ID! @constraint(type: "primarykey")
  outputType: WorkflowTaskOutputType!
  visualisationType: JSON
  size: Int
  data: String
  logs: String
  state: ExecutionState
  execTime: Float
  trial: Trial! @mapping(type:"backRef", foreignKey:"id", foreignKeyReference:"trialId")
  workflowTask: WorkflowTask! @mapping(type:"backRef", foreignKey:"id", foreignKeyReference:"workflowTaskId")
}
```

| Field | Type | Description |
|---|---|---|
| outputType | WorkflowTaskOutputType | JSON, CSV, or OTHER |
| visualisationType | JSON | How to render the data (GeoJSON, charts, etc.) |
| size | Int | Size of the output data |
| data | String | Raw output data from the task |
| logs | String | Execution logs from the task |
| state | ExecutionState | Final execution state of this task in this trial |
| execTime | Float | Execution duration in seconds |
2. Workflow and WorkflowTask
Workflow
A directed acyclic graph defining the execution flow of tasks.
```graphql
type Workflow @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  id: ID! @constraint(type: "primarykey")
  name: String! @constraint(type: "unique")
  description: String
  lastState: ExecutionState!
  parentWorkflowId: String
  isLatest: Boolean!
  version: String!
  versionDescription: String
  lastDagRunId: String
  variables: JSON
  createdAt: Int!
  updatedAt: Int!
  experiments: [Experiment]! @mapping(...)
  workflowTasks: [WorkflowTask]! @mapping(...)
  workflowVersions: [Workflow]! @mapping(...)
  workflowDeployments: [WorkflowDeployment]! @mapping(...)
}
```

| Field | Type | Description |
|---|---|---|
| lastState | ExecutionState | Current state: SUCCESSFUL, FAILED, IN_PROGRESS, PENDING, SKIPPED |
| parentWorkflowId | String | References parent workflow for versioning chain |
| isLatest | Boolean | Whether this is the latest/production version |
| version | String | Semantic version string |
| lastDagRunId | String | ID of the most recent execution run |
| variables | JSON | Default variable values for the workflow |
Versioning Strategy:
- `parentWorkflowId` creates a version chain between workflow revisions
- `isLatest` flags the current production version
- The `saveWorkflowVersion` mutation creates a new version; `rollbackWorkflowVersion` restores a previous one
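The parentWorkflowId chain can be walked to recover a revision's full history. A minimal sketch over dict records (the workflow IDs and versions here are invented for illustration; it assumes each revision stores its parent's ID and the chain is acyclic):

```python
# Hypothetical in-memory stand-in for the workflows table.
workflows = {
    "wf-1": {"parentWorkflowId": None,   "version": "1.0.0", "isLatest": False},
    "wf-2": {"parentWorkflowId": "wf-1", "version": "1.1.0", "isLatest": False},
    "wf-3": {"parentWorkflowId": "wf-2", "version": "2.0.0", "isLatest": True},
}

def version_history(workflow_id: str) -> list:
    """Walk the parent chain from a revision back to the root, newest first."""
    chain = []
    current = workflow_id
    while current is not None:
        chain.append(workflows[current]["version"])
        current = workflows[current]["parentWorkflowId"]
    return chain

history = version_history("wf-3")
```

`isLatest` avoids the walk for the common "give me the production version" lookup; the chain is only needed for history views and rollback.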
WorkflowTask
An individual computational unit within a workflow.
```graphql
type WorkflowTask @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  id: ID! @constraint(type: "primarykey")
  name: String!
  description: String
  workflowId: String!
  modelConfigId: String!
  configValue: JSON
  functionId: String
  configVariables: JSON
  useCache: Boolean
  priority: Int
  function: Function @mapping(...)
  workflow: Workflow! @mapping(...)
  modelConfig: ModelConfig! @mapping(...)
  upstreamTaskDependencies: [WorkflowTaskDependency]! @mapping(...)
  downstreamTaskDependencies: [WorkflowTaskDependency]! @mapping(...)
  experimentTaskConfigs: [ExperimentTaskConfig]! @mapping(...)
  trialTaskValues: [TrialTaskValue]! @mapping(...)
  trialTaskOutputs: [TrialTaskOutput]! @mapping(...)
}
```

| Field | Type | Description |
|---|---|---|
| modelConfigId | String | References the model configuration to execute |
| configValue | JSON | Task-specific configuration data |
| configVariables | JSON | Variable key → path mapping for runtime substitution |
| functionId | String | Optional reference to a custom function |
| useCache | Boolean | Whether to cache task results (hash-based) |
| priority | Int | Cycle-breaking priority — higher priority tasks are chosen as loop entry points |
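The configVariables mapping can be applied with a small dot-notation substitution routine. A sketch (the helper name and the replace-in-place behaviour are assumptions, not the actual resolver code):

```python
import copy

def apply_variables(config_value: dict, config_variables: dict, variables: dict) -> dict:
    """Substitute experiment variables into configValue via dot-notation paths.

    config_variables maps a variable key (e.g. "$user_name") to a path inside
    config_value (e.g. "address.street"). Unset variables leave the original
    value untouched.
    """
    result = copy.deepcopy(config_value)  # never mutate the stored config
    for var_key, path in config_variables.items():
        if var_key not in variables:
            continue
        *parents, leaf = path.split(".")
        node = result
        for part in parents:
            node = node[part]
        node[leaf] = variables[var_key]
    return result

cfg = {"name": "John", "age": 30,
       "address": {"street": "123 Main St", "city": "Boston"}}
vars_map = {"$user_name": "name", "$user_age": "age", "$street": "address.street"}
out = apply_variables(cfg, vars_map, {"$user_name": "Jane"})
```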
Configuration Pattern (variable substitution):
```json
// configValue
{
  "name": "John",
  "age": 30,
  "address": { "street": "123 Main St", "city": "Boston" }
}

// configVariables — maps variable keys to dot-notation paths in configValue
{
  "$user_name": "name",
  "$user_age": "age",
  "$street": "address.street"
}

// If experiment variables = {"$user_name": "Jane"}
// Result: {"name": "Jane", "age": 30, "address": {"street": "123 Main St", "city": "Boston"}}
```

3. WorkflowTaskDependency
Defines the relationship between two tasks in the workflow DAG.
```graphql
type WorkflowTaskDependency @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  WorkflowTaskID: ID! @constraint(type: "primarykey")
  DownstreamTaskDependencyID: ID! @constraint(type: "primarykey")
  OutputDependency: Boolean
  paramName: String
  config: JSON
  upstreamWorkflowTask: WorkflowTask! @mapping(...)
  downstreamWorkflowTask: WorkflowTask! @mapping(...)
}
```

| Field | Type | Description |
|---|---|---|
| WorkflowTaskID | ID | The upstream task ID (composite PK) |
| DownstreamTaskDependencyID | ID | The downstream task ID (composite PK) |
| OutputDependency | Boolean | true = data dependency, false = sequencing only |
| paramName | String | For operators, which parameter receives the upstream output |
| config | JSON | Loop/condition configuration (max iterations, continue/exit conditions) |
Dependency Types:
- Output Dependency (`OutputDependency = true`) — Downstream task consumes the upstream task's output data. Data is transferred via the file system or Redis; `paramName` specifies the target input parameter.
- Sequential Dependency (`OutputDependency = false`) — Downstream task waits for upstream completion. No data transfer — just execution ordering.
Loop Configuration (config field):
```graphql
input WorkflowTaskDependencyConfig {
  maxIterations: Int
  succeedAfterMaxIterations: Boolean
  continueCondition: ConditionInput
  succeedAfterContinueCondition: Boolean
}
```

Loop permutations:
- `maxIterations` reached + `succeedAfterMaxIterations = true` → Skip downstream, continue workflow
- `maxIterations` reached + `succeedAfterMaxIterations = false` → Fail workflow
- `continueCondition` met + `succeedAfterContinueCondition = true` → Continue loop
- `continueCondition` met + `succeedAfterContinueCondition = false` → Fail workflow
- `continueCondition` not met → Skip downstream, continue workflow
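The five permutations above reduce to a small decision function. A sketch (the function name and the string return values are invented for illustration; the WEM presumably encodes this differently):

```python
def loop_decision(iteration: int, max_iterations: int,
                  succeed_after_max: bool,
                  condition_met: bool, succeed_after_condition: bool) -> str:
    """Encode the loop permutations: returns 'continue_loop',
    'skip_downstream' (loop exits, workflow continues), or 'fail'."""
    if iteration >= max_iterations:
        # maxIterations reached: succeed flag decides skip vs fail
        return "skip_downstream" if succeed_after_max else "fail"
    if condition_met:
        # continueCondition met: succeed flag decides loop vs fail
        return "continue_loop" if succeed_after_condition else "fail"
    # condition not met: exit the loop and continue the workflow
    return "skip_downstream"
```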
Condition System:
```graphql
input ConditionInput {
  conditionType: ConditionType!  # CONDITION, AND, OR
  fieldName: String              # Dot-notation path (e.g., "data.status")
  operator: ConditionOperator    # EQ, NEQ, GT, GTE, LT, LTE, CONTAINS, etc.
  fieldValue: JSON               # Value to compare against
  conditions: [ConditionInput!]  # Nested conditions for AND/OR
}
```

Supported operators: EQ, NEQ, GT, GTE, LT, LTE, CONTAINS, STARTS_WITH, ENDS_WITH, MATCHES, IN, NOT_IN
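A minimal evaluator for the nested ConditionInput structure, covering a subset of the operators and the dot-notation field lookup — a sketch of the semantics, not the actual WEM implementation:

```python
def get_field(data: dict, path: str):
    """Resolve a dot-notation path like 'data.status' against a dict."""
    for part in path.split("."):
        data = data[part]
    return data

OPERATORS = {  # subset of the supported operators listed above
    "EQ": lambda a, b: a == b,
    "NEQ": lambda a, b: a != b,
    "GT": lambda a, b: a > b,
    "GTE": lambda a, b: a >= b,
    "LT": lambda a, b: a < b,
    "LTE": lambda a, b: a <= b,
    "CONTAINS": lambda a, b: b in a,
    "IN": lambda a, b: a in b,
}

def evaluate(condition: dict, data: dict) -> bool:
    """Recursively evaluate a ConditionInput-shaped dict against task data."""
    ctype = condition["conditionType"]
    if ctype == "AND":
        return all(evaluate(c, data) for c in condition["conditions"])
    if ctype == "OR":
        return any(evaluate(c, data) for c in condition["conditions"])
    value = get_field(data, condition["fieldName"])
    return OPERATORS[condition["operator"]](value, condition["fieldValue"])

cond = {
    "conditionType": "AND",
    "conditions": [
        {"conditionType": "CONDITION", "fieldName": "data.status",
         "operator": "EQ", "fieldValue": "done"},
        {"conditionType": "CONDITION", "fieldName": "data.score",
         "operator": "GTE", "fieldValue": 0.9},
    ],
}
```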
4. ModelConfig and Model
ModelConfig
A configured instance of a model ready for use in a workflow task.
```graphql
type ModelConfig @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  id: ID! @constraint(type: "primarykey")
  modelId: String!
  isValid: Boolean
  name: String!
  description: String
  configValue: JSON
  functionId: String
  function: Function @mapping(...)
  workflowTasks: [WorkflowTask]! @mapping(...)
  model: Model! @mapping(...)
}
```

| Field | Type | Description |
|---|---|---|
| modelId | String | References the base model |
| configValue | JSON | Default configuration for this model instance |
| functionId | String | Optional custom function for operators/data connectors |
| isValid | Boolean | Whether this configuration is validated |
Purpose: Provide reusable, pre-configured model instances. Multiple ModelConfigs can reference the same Model with different default configurations.
Model
The blueprint of a computational component.
```graphql
type Model @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  id: ID! @constraint(type: "primarykey")
  modelType: String!
  usecase: String!
  path: String!
  summary: String!
  description: String!
  componentType: ComponentType!
  functionId: String
  modelRegistryVersionId: String
  createdAt: Int!
  function: Function @mapping(...)
  modelConfigs: [ModelConfig]! @mapping(...)
  modelRegistryVersion: ModelRegistryVersion @mapping(...)
  modelConfigInputs: [ModelConfigInput]! @mapping(...)
  modelConfigOutputs: [ModelConfigOutput]! @mapping(...)
  previousModelVersions: [ModelVersionLink]! @mapping(...)
  nextModelVersions: [ModelVersionLink]! @mapping(...)
}
```

| Field | Type | Description |
|---|---|---|
| modelType | String | Type categorisation (e.g., "AI_TOOL", "TRANSFORMER") |
| componentType | ComponentType | MODEL, OPERATOR, IMPORT_DDK, or EXPORT_DDK |
| path | String | API endpoint path (e.g., "/predict", "/merge_operator") |
| usecase | String | Functional use case description |
| functionId | String | For operators/DDKs, references the executable function |
| modelRegistryVersionId | String | For MODEL types, references the container version |
Component Types (ComponentType enum):
- `MODEL` — ML models or microservices deployed as Docker containers, called via HTTP
- `OPERATOR` — Custom Python/Julia functions for data transformation, executed by PyRunner
- `IMPORT_DDK` — Data connector importing data from external GraphQL APIs
- `EXPORT_DDK` — Data connector exporting data to external systems
Model Versioning: The ModelVersionLink join table tracks version chains between models, enabling version conflict detection and upgrade paths.
5. ModelConfigInput and ModelConfigInputField
ModelConfigInput
Defines a parameter or input group for a model.
```graphql
type ModelConfigInput @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  id: ID! @constraint(type: "primarykey")
  modelId: String!
  name: String!
  description: String
  required: Boolean!
  defaultConfigValue: JSON
  inputType: InputType  # INPUT or PARAM
  paramName: String
  inputConfig: JSON
  nativeFieldType: String!
  model: Model! @mapping(...)
  modelConfigInputFields: [ModelConfigInputField]! @mapping(...)
  Compatibility: [Compatibility]! @mapping(...)
}
```

| Field | Type | Description |
|---|---|---|
| inputType | InputType | INPUT (data) or PARAM (configuration parameter) |
| paramName | String | Parameter name used in the model's API |
| nativeFieldType | String | Type in the model's native language (e.g., "List[City]", "int") |
| inputConfig | JSON | For IMPORT_DDK, contains GraphQL query configuration |
| defaultConfigValue | JSON | Default value for this input |
Input Types:
- `INPUT` — Actual data consumed by the model. Can be files, objects, or primitives; may come from upstream task outputs.
- `PARAM` — Configuration parameters. Typically static values set in `configValue`; can be overridden by variables.
ModelConfigInputField
Defines individual fields within an input's data structure, supporting nested hierarchies.
```graphql
type ModelConfigInputField @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  id: ID! @constraint(type: "primarykey")
  modelConfigInputId: String!
  fieldName: String!
  fieldDescription: String
  fieldType: FieldType!
  nativeFieldType: String!
  parentFieldId: String
  defaultValue: String
  required: Boolean!
  modelConfigInput: ModelConfigInput! @mapping(...)
  childrenFields: [ModelConfigInputField]! @mapping(...)
}
```

Hierarchical Structure Example:

```
ModelConfigInput: "customer_data"
├── ModelConfigInputField: "name" (STRING)
├── ModelConfigInputField: "age" (INTEGER)
└── ModelConfigInputField: "address" (OBJECT)
    ├── ModelConfigInputField: "street" (STRING, parentFieldId = address.id)
    └── ModelConfigInputField: "city" (STRING, parentFieldId = address.id)
```

FieldType Enum — Comprehensive type system for schema validation:
| Category | Types |
|---|---|
| Scalar | STRING, NUMBER, NUMBER_FLOAT, NUMBER_DOUBLE, INTEGER, INTEGER_32, INTEGER_64, BOOLEAN, OBJECT |
| Array | ARRAY, ARRAY_STRING, ARRAY_NUMBER, ARRAY_NUMBER_FLOAT, ARRAY_NUMBER_DOUBLE, ARRAY_INTEGER, ARRAY_INTEGER_32, ARRAY_INTEGER_64, ARRAY_BOOLEAN, ARRAY_OBJECT |
| 2D Array | ARRAY2D_STRING, ARRAY2D_NUMBER, ARRAY2D_NUMBER_FLOAT, ARRAY2D_NUMBER_DOUBLE, ARRAY2D_INTEGER, ARRAY2D_INTEGER_32, ARRAY2D_INTEGER_64, ARRAY2D_BOOLEAN, ARRAY2D_OBJECT |
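Deriving a FieldType from a model's nativeFieldType can be sketched as string dispatch on the annotation. The mapping below is illustrative (the real introspection logic is not shown in this document); unknown inner types fall back to OBJECT, matching the `List[City]` → object-array case:

```python
# Hypothetical mapping from Python native annotations to FieldType values.
NATIVE_TO_FIELDTYPE = {
    "str": "STRING",
    "int": "INTEGER",
    "float": "NUMBER",
    "bool": "BOOLEAN",
}

def field_type_for(native: str) -> str:
    """Map a nativeFieldType string (e.g. 'List[City]', 'int') to a FieldType."""
    native = native.strip()
    if native.startswith("List[List[") and native.endswith("]]"):
        inner = NATIVE_TO_FIELDTYPE.get(native[10:-2], "OBJECT")
        return f"ARRAY2D_{inner}"
    if native.startswith("List[") and native.endswith("]"):
        inner = NATIVE_TO_FIELDTYPE.get(native[5:-1], "OBJECT")
        return f"ARRAY_{inner}"
    return NATIVE_TO_FIELDTYPE.get(native, "OBJECT")
```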
6. ModelConfigOutput and ModelConfigOutputField
Mirror the input structure but define what a model produces. Same field pattern as ModelConfigInput/ModelConfigInputField with outputConfig replacing inputConfig.
| Field | Description |
|---|---|
| paramName | Parameter name in the model's response |
| outputConfig | For EXPORT_DDK, contains external API configuration (resolver URL, mutations) |
| modelConfigOutputFields | Structured field definitions for output data |
7. Function
Custom source code executed by the PyRunner executor service.
```graphql
type Function @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  id: ID! @constraint(type: "primarykey")
  functionId: String!
  name: String!
  description: String
  language: Language!  # GOLANG, PYTHON, JULIA
  functionContents: String!
  dtTimestamp: Int!
  isValid: Boolean
  models: [Model]! @mapping(...)
  modelConfigs: [ModelConfig]! @mapping(...)
  workflowTasks: [WorkflowTask]! @mapping(...)
}
```

| Field | Type | Description |
|---|---|---|
| functionId | String | Secondary identifier for the function |
| language | Language | PYTHON, GOLANG, or JULIA |
| functionContents | String | The actual source code |
| dtTimestamp | Int | Creation timestamp — used for versioned file naming |
| isValid | Boolean | Whether the function passes validation |
Usage Pattern:
- The function is stored in the database
- During DAG generation, the function is written to a `.py` file named `{name}_{dtTimestamp}.py`
- The file path and function name are passed to the Python executor (PyRunner)
- The executor loads and executes the function
Versioning via Timestamp: The dtTimestamp field ensures immutable function versions. Multiple experiments can use different function versions concurrently.
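The versioned-file step of the usage pattern can be sketched as follows — the `{name}_{dtTimestamp}.py` naming comes from the docs, while the helper name and directory handling are illustrative:

```python
import os
import tempfile

def write_function_file(name: str, dt_timestamp: int, contents: str, out_dir: str) -> str:
    """Write a Function's source to its versioned file, {name}_{dtTimestamp}.py,
    and return the path handed to the executor (PyRunner)."""
    path = os.path.join(out_dir, f"{name}_{dt_timestamp}.py")
    with open(path, "w") as fh:
        fh.write(contents)
    return path

with tempfile.TemporaryDirectory() as d:
    path = write_function_file(
        "merge_operator", 1700000000, "def run(x):\n    return x\n", d
    )
    filename = os.path.basename(path)
```

Because the timestamp is baked into the filename, two experiments pinned to different function versions never overwrite each other's files.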
8. ModelRegistry and ModelRegistryVersion
ModelRegistry
Registry information for containerised model collections (microservices).
```graphql
type ModelRegistry @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  id: ID! @constraint(type: "primarykey")
  ecrRegistry: String!
  ecrRepository: String!
  appPort: String!
  language: Language!
  modelRegistryVersions: [ModelRegistryVersion]! @mapping(...)
}
```

| Field | Description |
|---|---|
| ecrRegistry | AWS ECR registry URL |
| ecrRepository | Repository name within the registry |
| appPort | Port the containerised model listens on |
| language | Programming language of the model |
ModelRegistryVersion
```graphql
type ModelRegistryVersion @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  id: ID! @constraint(type: "primarykey")
  modelRegistryId: String!
  ecrTag: String!
  bLatest: Boolean!
  modelRegistry: ModelRegistry! @mapping(...)
  models: [Model]! @mapping(...)
}
```

| Field | Description |
|---|---|
| ecrTag | Docker image tag (e.g., "v1.2.3", "latest") |
| bLatest | Whether this is the latest deployed version |
Purpose: Connect models to their Docker container deployments and enable version control for deployed model collections.
9. Compatibility
Defines which outputs can connect to which inputs — a join table for type-safe task connections.
```graphql
type Compatibility @required(type:"READ,UPDATE,DELETE", table:"true") {
  modelConfigInputId: String! @constraint(type: "primarykey")
  modelConfigOutputId: String! @constraint(type: "primarykey")
  modelConfigInput: ModelConfigInput! @mapping(...)
  modelConfigOutput: ModelConfigOutput! @mapping(...)
}
```

Validation Logic:
- Only allows connections between type-compatible inputs and outputs
- Prevents runtime type errors in workflow dependencies
- Enforced during workflow design in the UI
- The `modifyCompatibilityTable` mutation recalculates compatibility entries
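The compatibility rule stated earlier — an output can feed an input when field names and types match — can be sketched as a set comparison over field records. The dict shape below is a simplified stand-in for ModelConfigOutputField/ModelConfigInputField rows (nesting via parentFieldId is ignored here):

```python
def compatible(output_fields: list, input_fields: list) -> bool:
    """True when output and input field definitions match by name and type.
    Field order is irrelevant; only the name→type mapping is compared."""
    outs = {f["fieldName"]: f["fieldType"] for f in output_fields}
    ins = {f["fieldName"]: f["fieldType"] for f in input_fields}
    return outs == ins

# HelloWorldResponse-shaped output against a matching downstream input.
out_fields = [{"fieldName": "greeting", "fieldType": "STRING"},
              {"fieldName": "when", "fieldType": "INTEGER"}]
in_fields = [{"fieldName": "when", "fieldType": "INTEGER"},
             {"fieldName": "greeting", "fieldType": "STRING"}]
```

A routine like this is what a `modifyCompatibilityTable`-style recalculation would run for every (output, input) pair, persisting the pairs that pass.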
10. Workflow Test Execution
A lightweight execution mode for validating workflow logic and dependencies without running the full experiment lifecycle. Workflows have one latest state stored in Redis, updated on each test execution. This allows for rapid iteration during development without creating multiple trials in the database.
The resolvers to retrieve workflow execution output/state/logs are always based on the latest execution, so the architecture is the same as the full experiment execution model, just without trial/task output persistence in the database. This means the UI can switch seamlessly between test execution and full experiment execution modes.
Test execution results are temporary and never stored in the database. Multiple concurrent test executions of the same workflow are technically possible, since each run has a unique DAG run ID, but for simplicity the UI only ever renders the latest execution's results.
Execution Flow Summary:
- The user triggers test execution via the `testExecuteWorkflow` mutation.
- A DAG is generated from the workflow definition, and a unique DAG run ID is created.
- The WEM executes the workflow, storing execution state, outputs, and logs in Redis under keys namespaced by the DAG run ID.
- The UI polls for execution results using the workflow ID, rendering the latest state, outputs, and logs from Redis/WEM state.
- Results expire from Redis after a configurable TTL (e.g., 1 hour) to prevent stale data.
11. Experiment Execution (Trial System)
The experiment execution model uses a Trial-based approach. See Study and Experiment above for the full Experiment, ExperimentTaskConfig, Trial, TrialTaskValue, and TrialTaskOutput type definitions.
Execution Flow Summary:
- `ExperimentTaskConfig` records define per-experiment config overrides for each task
- When `executeExperiment` is called, a `Trial` is created (or an existing `PENDING` trial is used)
- Config values are resolved (ExperimentTaskConfig → WorkflowTask → ModelConfig) and snapshotted into `TrialTaskValue` records
- The WEM executes the workflow using the snapshotted config, recording outputs/states/logs in Redis
- Redis data is temporary until the user calls `saveTrial`, which persists it to the database as `TrialTaskOutput` records
- Trial status transitions: `PENDING` → `RUNNING` → `COMPLETED`/`FAILED`
- Trials can be promoted via `promoteTrial`, persisting the configuration and outputs as a new version of the workflow
Supporting Entities
WorkflowDeployment
Deployment configuration for scheduled or event-driven workflow execution.
```graphql
type WorkflowDeployment @required(type:"READ", table:"true") {
  id: ID! @constraint(type: "primarykey")
  name: String!
  description: String
  workflowId: String!
  enabled: Boolean!
  deploymentType: DeploymentType!
  deploymentConfig: JSON
  variables: JSON
  createdAt: Int!
  workflow: Workflow! @mapping(...)
}
```

Deployment Types:
| Type | Description |
|---|---|
| ONCE_OFF_SCHEDULE | Run at a specific Unix timestamp |
| CRON_SCHEDULE | Run on a recurring cron expression |
| DDK_RESOLVER | Triggered by default DDK resolver events |
| DDK_RESOLVER_CRUD | Triggered by DDK CRUD events (CREATE/UPDATE/DELETE) with variable mappings |
| DDK_RESOLVER_CUSTOM | Triggered by custom DDK resolver (not yet implemented) |
| STREAMING_REDIS_STREAMS | Run on Redis Streams data (not yet implemented) |
| ON_STARTUP | Run once when the Experiment Manager starts |
WorkflowTemplate
Reusable workflow blueprints that can be instantiated as new workflows:
```graphql
type WorkflowTemplate @required(type:"READ,DELETE", table:"true") {
  id: ID! @constraint(type: "primarykey")
  name: String! @constraint(type: "unique")
  description: String
  templateValue: JSON
  createdAt: Int!
  updatedAt: Int!
}
```

ModelVersionLink
Join table tracking version chains between models:
```graphql
type ModelVersionLink @required(type:"CREATE,READ,UPDATE,DELETE", table:"true") {
  ModelID: ID! @constraint(type: "primarykey")
  NextVersionModelID: ID! @constraint(type: "primarykey")
}
```

Used for detecting model version conflicts in workflows and enabling the listWorkflowTaskUpgradeOptions query.
See Also
- Component Types — how the component type on WorkflowTask determines execution behaviour
- DDK Schema Guide — the DDK directives used to define this schema and generate the CRUD layer underneath it
