Deployment (Nexus)
Nexus is the infrastructure management interface for the OR platform. It brings Kubernetes deployment, live monitoring, and container registry access into the OR UI — so teams can deploy model containers, watch services become healthy, and debug running applications without leaving the environment where they are building.
Who is this for?
- Platform engineers managing OR service deployments across local KinD and AWS EKS clusters
- Modelling teams who need to deploy a model container to the cluster before testing it in an MDK workflow
- Any OR developer who needs to monitor a running service, read live logs, or configure a deployment's environment variables and replica settings
- Infrastructure engineers responsible for multi-environment OR installations (local development → staging → production)
Platform Context
Why This System Exists
The Optimal Reality platform runs many services simultaneously: MDK model microservices, DDK data servers, FDK frontend applications, AI agents, and platform infrastructure. Getting any of these services running — and keeping them running — requires pulling container images from a registry, configuring environment variables and replica counts, monitoring pod health, reading logs, and responding to Kubernetes events.
Without a unified interface, this work happens outside the platform: developers context-switch to kubectl, Helm CLI, and cloud consoles for every deployment and debugging session. For teams whose primary goal is building and running models or applications, this is friction that interrupts the flow of development.
Nexus brings the deployment and operations layer into the OR platform UI. Teams can deploy a new model container, configure its environment, check whether it is healthy, and stream its logs — all without leaving the environment where they are building the workflow that will call that model.
The decision to build Nexus rather than integrating with an existing Kubernetes management tool warrants explanation. ArgoCD, Rancher, K9s, and Lens are all capable Kubernetes management tools. They are also built for platform engineers: people whose primary job is managing Kubernetes infrastructure, who are comfortable with Helm values, kubectl contexts, and Kubernetes resource YAML. OR's domain teams — the people building MDK workflows and FDK applications — are not, in general, platform engineers. For them, the question is not "what is the current replica count for this deployment?" but "is this model running so I can test my workflow?" Nexus frames Kubernetes operations around the OR domain: deploying a model before running a workflow, watching a container become healthy before triggering an experiment, reading live logs to debug a model that is failing in the pipeline. The same operations exist in ArgoCD; the context is different. Keeping deployment operations in the same environment as model development reduces the number of tools, contexts, and mental models a domain team needs to maintain.
What this enables in practice:
A modelling team building an MDK workflow that needs a new model deployed can trigger the deployment from within the platform, watch the container become healthy in Nexus, and immediately test the workflow — without switching tools.
A platform operator monitoring a running OR installation can see the full service topology, current CPU and memory usage across all deployments, Kubernetes events, and live logs from any container — in one place.
A developer configuring a deployment across environments (local KinD for development, AWS EKS for staging or production) can manage per-environment settings — image versions, replica counts, environment variables, ingress configuration — through Nexus without editing configuration files directly.
Core Concepts and Principles
Every service lives within a structured deployment hierarchy
Nexus organises all deployable services within a four-level hierarchy: Project → Environment → Namespace → Application. Each level holds its own configuration, and settings at lower levels override those above. The same application can run in a local KinD cluster for development and an AWS EKS cluster for production — with different image tags, replica counts, and environment variables at each level — all managed through a single interface. This structure is what makes consistent multi-environment deployment manageable without per-environment scripts or manual Helm editing.
The four-level hierarchy is not the only way to organise multi-environment configuration — Helm values files, Kustomize overlays, and environment-specific YAML are all viable approaches. We chose a structured hierarchy stored in the OR platform because the deployment configuration then becomes a first-class platform entity, not a collection of files in a git repository that only platform engineers can interpret. The hierarchy also maps naturally to how OR projects are organised: a project contains multiple environments (local, staging, production), each environment contains namespaces by concern (MDK, DDK, FDK, infrastructure), and each namespace contains the applications that run there. A team member who understands the project structure can navigate to any application's deployment configuration through the same mental model they use to navigate the platform itself.
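The override semantics described above can be illustrated with a short sketch. This is not Nexus's actual data model — the level names and keys are invented for illustration — but it shows the resolution rule: settings merge from Project down to Application, and the lowest level that defines a key wins.

```python
# Hypothetical sketch of settings resolving through the
# Project -> Environment -> Namespace -> Application hierarchy.
# Lower levels override higher ones; keys and values are illustrative.

def resolve_config(*levels: dict) -> dict:
    """Merge config dicts in hierarchy order; later (lower) levels win."""
    resolved: dict = {}
    for level in levels:
        resolved.update(level)
    return resolved

project     = {"registry": "aws-ecr", "replicas": 1}
environment = {"replicas": 2, "ingress": "internal"}   # environment override
application = {"image_tag": "v1.4.2", "replicas": 3}   # application override

config = resolve_config(project, environment, application)
# replicas resolves to 3: the application level overrides both levels above it
```

The same rule applied per environment is what lets one application carry different image tags and replica counts in local KinD and in EKS without duplicating the shared settings.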
One interface for both local and remote Kubernetes
Nexus supports both local KinD (Kubernetes in Docker) clusters used during development and remote Kubernetes clusters (AWS EKS) used for staging and production. The connection to each cluster is handled automatically based on the environment context — developers using Nexus do not need to manage kubeconfig files or know which cluster type they are targeting. The same deployment, monitoring, and log streaming operations work identically in both environments.
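To make the automatic cluster selection concrete, here is an illustrative sketch of mapping an environment to a kubeconfig context. The context names are assumptions (KinD does name contexts `kind-<cluster>`, but the cluster and EKS context names here are invented), and this is not Nexus's actual resolution logic.

```python
# Illustrative sketch: resolve a Kubernetes context from the OR
# environment type so callers never handle kubeconfig files directly.
# Cluster/context names below are hypothetical examples.

def kube_context_for(environment: str) -> str:
    """Map an OR environment name to a Kubernetes context name."""
    if environment == "local":
        return "kind-or-local"          # KinD contexts are named kind-<cluster>
    if environment in ("staging", "production"):
        return f"or-eks-{environment}"  # hypothetical EKS context naming
    raise ValueError(f"unknown environment: {environment}")
```

Because deployment, monitoring, and log streaming all go through this single resolution step, the operations behave identically whichever cluster type the environment points at.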
Live cluster state, visible in real time
Nexus polls the Kubernetes API continuously to surface real-time deployment state: whether pods are ready, how many replicas are running, current CPU and memory utilisation per container, and Kubernetes events as they occur. Developers can watch a deployment become healthy in real time within the platform, rather than running repeated terminal commands. Live log streaming adds the ability to tail container output directly from the UI.
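The kind of reduction a polling loop performs can be sketched as follows: raw per-pod status records (shaped like what the Kubernetes API reports, though the records here are hand-rolled, not live API data) are summarised into the ready/total/healthy view a deployment card would show.

```python
# Minimal sketch: summarise polled pod statuses into a deployment
# readiness view. Field names echo the Kubernetes pod status shape;
# the data below is invented for illustration.

def summarise(pods: list[dict]) -> dict:
    """Count ready vs total pods and flag the deployment healthy or not."""
    ready = sum(1 for p in pods if p["phase"] == "Running" and p["ready"])
    return {
        "ready": ready,
        "total": len(pods),
        "healthy": ready == len(pods) and len(pods) > 0,
    }

pods = [
    {"name": "model-a-7f9c-x1", "phase": "Running", "ready": True},
    {"name": "model-a-7f9c-x2", "phase": "Pending", "ready": False},
]
status = summarise(pods)  # one replica still pending, so not yet healthy
```

Re-running this summary on each poll is what turns repeated API responses into the "watch it become healthy" experience in the UI.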
The service dependency graph as the central view
Nexus renders the full service topology as an interactive dependency graph — each service as a node, each dependency as an edge, with live health state overlaid. At a glance it is clear which services are healthy, which are degraded, and how failures might propagate through dependent services. This makes the operational state of the platform immediately understandable.
Making the dependency graph the primary view rather than a list of deployments is a deliberate choice. A list of pods with CPU and memory metrics is the right view for a platform engineer doing capacity planning. For a domain team who wants to know "is the model my workflow depends on actually running, and if not, what else is affected?" the graph view answers the question immediately. In OR, services have semantic relationships: a workflow that calls Model A depends on Model A's container being healthy; if that container depends on a shared data service, that service's health matters too. The topology makes these relationships visible in a way that a list cannot.
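The "what else is affected?" question the graph answers is a transitive-dependant traversal. A hedged sketch, with an invented topology (the service names and edges are illustrative, not part of the OR platform):

```python
# Sketch: given service dependencies (edges point from a service to
# what it depends on) and one unhealthy service, find every service
# that transitively depends on it. Topology below is invented.
from collections import defaultdict

def affected_by(deps: dict[str, list[str]], failed: str) -> set[str]:
    """Return all services that transitively depend on `failed`."""
    dependants = defaultdict(list)  # invert edges: dependency -> dependants
    for svc, targets in deps.items():
        for t in targets:
            dependants[t].append(svc)
    seen: set[str] = set()
    stack = [failed]
    while stack:
        for svc in dependants[stack.pop()]:
            if svc not in seen:
                seen.add(svc)
                stack.append(svc)
    return seen

deps = {
    "workflow-engine": ["model-a"],
    "model-a": ["shared-data"],
    "dashboard": ["workflow-engine"],
}
impacted = affected_by(deps, "shared-data")
```

Here a failure in the shared data service reaches the model, the workflow engine, and the dashboard — exactly the blast radius the graph view makes visible without any traversal by the reader.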
Role in the Optimal Reality Platform
Nexus occupies the infrastructure management layer of the Optimal Reality platform. It is the bridge between the platform's application layer and the underlying Kubernetes compute infrastructure.
Nexus is responsible for:
- Container lifecycle management — deploying, scaling, and monitoring containerised services in KinD and remote Kubernetes clusters
- Container registry access — browsing available image versions from local Docker and remote AWS ECR
- Live cluster observability — real-time CPU, memory, replica, and pod status across all deployed services
- Deployment configuration — managing per-namespace application settings: image versions, replica counts, environment variables, and ingress rules
Nexus interacts with:
- MDK — when users run MDK workflows, Nexus handles the deployment of model containers the WEM needs to invoke
- FDK — FDK-built applications are containerised and managed through Nexus
- DDK — DDK-generated data servers are deployable units visible and configurable in Nexus
- SCDK — CI/CD pipelines monitored in SCDK produce the container images that Nexus deploys
How It Fits in the End-to-End Platform
Applications (MDK model microservices, FDK apps, DDK servers, system services)
↓
Nexus (deployment, configuration, live monitoring)
↓
Kubernetes (KinD for local development / EKS for staging and production)
↓
Container Images (AWS ECR + local Docker registry)
↑
CI/CD pipelines (monitored in SCDK) produce images that Nexus deploys

Nexus sits below the application layer and above the raw infrastructure. It is the operational control plane for the running OR platform — the interface through which the abstract service topology becomes concrete running containers, and through which the health of a live deployment is visible and manageable.
Documentation
- Remote Environments (EKS) — EKS connectivity via SSM, AWS account resolution, and troubleshooting
Next Steps
Understand Nexus in the platform context:
- MDK — model containers must be deployed to the cluster via Nexus before MDK workflows can invoke them
- DDK — DDK-generated data servers are deployable units visible and configurable in Nexus
- SCDK — CI/CD pipelines monitored in SCDK produce the container images that Nexus deploys
- SDK API Nexus Component — how the SDK API exposes Nexus capabilities
