Data Archiver

[Figure: OR platform architecture diagram. Layers, top to bottom: Platform Users (engineers and low-code ops users); OR Platform (ORA, ADK, FDK, DDK, MDK, Nexus, SCDK); SDK API (GraphQL Federation Gateway); Microservices (domain IP services); Deployed OR Applications (deployed via Nexus); Application Users (operations teams).]

Overview

Data Archiver manages the final stage of the OR platform's data lifecycle — moving aged operational data out of PostgreSQL and into long-term cold storage on S3. As the Data Recorder continuously writes live data snapshots to PostgreSQL, the database would grow indefinitely without active management. Data Archiver prevents this buildup by identifying data older than a configured threshold, exporting it as CSV files to S3, and then removing it from the database.

This autonomous lifecycle management ensures that PostgreSQL queries remain fast and responsive for the real-time operations that depend on them, while no data is lost — it is simply moved to a more cost-effective storage tier. The archived CSV files on S3 remain available for historical analysis, compliance, and audit purposes.

Data Archiver operates independently of most other microservices. It requires only the Experiment Manager (GraphQL) for session triggers, plus access to the database.

Architecture

The service runs as a dedicated archival process. It listens for session updates and, on each cycle, archives aged data from every configured table.

Key Components

  • Session-triggered archiving — Subscribes to session updates from the central orchestration service. Archiving runs on a 24-hour session cycle.
  • Configurable age threshold — Data older than a specified duration (e.g. one week) is identified for archiving.
  • Cloud storage export — Aged data is exported and stored in cloud storage before being removed from the primary database.
  • Table-by-table processing — Iterates through all configured data tables, processing each independently.
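The age threshold can be illustrated with a small sketch. The function names, the one-week default, and the strict "older than" comparison are assumptions made for illustration; the source only gives one week as an example value and does not say how the service treats rows that fall exactly on the cutoff.

```python
from datetime import datetime, timedelta

# Hypothetical default: the documentation only cites "one week" as an example.
DEFAULT_AGE_THRESHOLD = timedelta(weeks=1)


def archive_cutoff(now: datetime, threshold: timedelta = DEFAULT_AGE_THRESHOLD) -> datetime:
    """Return the timestamp before which data counts as aged."""
    return now - threshold


def is_aged(recorded_at: datetime, now: datetime,
            threshold: timedelta = DEFAULT_AGE_THRESHOLD) -> bool:
    """True when a record is strictly older than the threshold (assumed semantics)."""
    return recorded_at < archive_cutoff(now, threshold)
```

Computing the cutoff once per cycle and reusing it for every table keeps the export and the delete aligned on the same boundary.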

Connections

| Direction | Service | Purpose |
| --- | --- | --- |
| In | Central Orchestration Service | Session subscription triggers |
| Out | Central Orchestration Service | Queries and completion notices |
| Out | Primary Database | Query and delete aged data |
| Out | Cloud Storage | Store archived data files |

Data Flow

The Session Manager triggers the Data Archiver on a 24-hour cycle through the central orchestration service (the Experiment Manager). For each configured data table, the archiver queries for data older than the threshold, exports it to cloud storage (S3), and then removes it from the primary database (PostgreSQL). When every table has been processed, it posts a completion notice to the orchestration service, which triggers downstream processes such as baseline creation.
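The per-table flow described above (query, export, remove) can be sketched as follows. This is a minimal illustration, not the service's implementation: it uses Python's built-in sqlite3 in place of PostgreSQL and a plain dict in place of S3 so that it runs anywhere, and the `recorded_at` column, the object key layout, and the function name `archive_table` are all hypothetical.

```python
import csv
import io
import sqlite3


def archive_table(conn: sqlite3.Connection, table: str, cutoff_iso: str, store: dict) -> int:
    """Export rows older than cutoff_iso as CSV, store the file, then delete the rows.

    `table` is interpolated into SQL, so it must come from trusted configuration.
    Returns the number of rows archived.
    """
    cur = conn.execute(
        f'SELECT * FROM "{table}" WHERE recorded_at < ? ORDER BY recorded_at',
        (cutoff_iso,),
    )
    rows = cur.fetchall()
    if not rows:
        return 0  # nothing aged in this table

    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([col[0] for col in cur.description])  # header row from column names
    writer.writerows(rows)

    # Store the export first; delete only after the file has been written.
    store[f"{table}/{cutoff_iso}.csv"] = buffer.getvalue()  # hypothetical key layout
    conn.execute(f'DELETE FROM "{table}" WHERE recorded_at < ?', (cutoff_iso,))
    conn.commit()
    return len(rows)
```

Ordering the steps so that the delete runs only after the export has been stored means a failed upload leaves the rows in the database for the next cycle, which matches the stated goal that no data is lost.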

Sequence of Operations

  1. Service starts — Initializes health monitoring and session subscription
  2. Subscription begins — Subscribes to updates for the session named in configuration
  3. Session tick (24 hours) — For each configured table:
    • Query database for aged data
    • Export data and upload to cloud storage
    • Remove archived data from primary database
  4. Completion notice — Posts archive completion (triggers downstream processes like baseline creation)
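Steps 3 and 4 above can be sketched as a single cycle function. The names `run_archive_cycle`, `archive_one`, and `post_completion` are hypothetical, and the choice to continue past a failing table and report it in the notice is an assumption based on the statement that each table is processed independently; the source does not describe the service's error handling.

```python
from typing import Callable, Dict, Iterable, List


def run_archive_cycle(
    tables: Iterable[str],
    archive_one: Callable[[str], int],
    post_completion: Callable[[dict], None],
) -> dict:
    """Run one session tick: archive every table, then post one completion notice."""
    archived: Dict[str, int] = {}
    failed: List[str] = []
    for table in tables:
        try:
            archived[table] = archive_one(table)  # rows archived for this table
        except Exception:
            # Assumption: a failure in one table should not block the others.
            failed.append(table)
    summary = {"archived": archived, "failed": failed}
    post_completion(summary)  # sent once, after all tables have been attempted
    return summary
```

Passing the per-table archiver and the notifier in as callables keeps the loop independent of the database, storage, and GraphQL clients, so it can be exercised with simple stand-ins.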

Archived Data Types

Data Archiver processes the following operational data:

| Data Type | Content |
| --- | --- |
| Segment metrics | Segment-level metric snapshots |
| Way metrics | Way-level speed, volume, flow snapshots |
| Events | Incident and event records |
| Camera status | Camera status snapshots |
| Intersection signals | Traffic signal data |
| Intersection groups | Grouped intersection data |
| Sensor readings | Sensor measurement data |
| Sign displays | Variable message sign display states |

Configuration

Archiving behaviour is controlled through configuration:

  • Cloud storage destination — Target location for archived data files
  • Session subscription — Which session to listen to for triggering the archive cycle
  • Age threshold — How old data must be before it qualifies for archiving

Related Services

  • Data Recorder — Upstream service that writes the live data snapshots that Data Archiver eventually archives
  • Data Loader — Loads the reference data for tables that Data Archiver manages
  • Experiment Manager — Central coordination service providing session triggers and completion signalling
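The three settings listed under Configuration (cloud storage destination, session subscription, age threshold) could be modelled as a small typed object. This is a sketch only: the key names `storage_destination`, `session_name`, and `age_threshold_days`, and the choice of days as the unit, are hypothetical and not taken from the actual configuration schema.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Mapping


@dataclass(frozen=True)
class ArchiverConfig:
    storage_destination: str   # target location for archived files
    session_name: str          # session whose updates trigger the archive cycle
    age_threshold: timedelta   # how old data must be before it is archived


def load_config(raw: Mapping[str, str]) -> ArchiverConfig:
    """Build a validated config from string settings (hypothetical key names)."""
    days = float(raw["age_threshold_days"])
    if days <= 0:
        raise ValueError("age_threshold_days must be positive")
    return ArchiverConfig(
        storage_destination=raw["storage_destination"],
        session_name=raw["session_name"],
        age_threshold=timedelta(days=days),
    )
```

Rejecting a non-positive threshold at load time guards against a misconfiguration that would otherwise mark live data as eligible for archiving.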

User documentation for Optimal Reality