Data Stream Ingestion
Overview
Data Stream Ingestion handles event-driven, real-time data feeds that cannot be served effectively through polling. Where Data Ingestion periodically requests data from HTTP endpoints, Data Stream Ingestion subscribes to streaming sources — receiving messages as they are published with minimal latency. This is essential for the OR platform's real-time situational awareness capability, ensuring that operators see traffic conditions as they happen rather than on a polling delay.
The service connects to message queuing topics on the client's enterprise integration platform, subscribing to data from traffic management systems and forwarding it directly to the Data Transformer. By receiving published messages rather than polling, the platform achieves very low latency on event-driven data and avoids placing unnecessary load on upstream endpoints.
Data Stream Ingestion is designed to be extensible: it can support additional pub-sub and data stream formats from internal, third-party, or external services, which makes it a flexible component in OR's ingestion layer.
Architecture
The Data Stream Ingestion service operates as a dedicated message subscription system that connects to enterprise integration platforms and forwards streaming data to the platform's transformation layer. It uses encrypted message protocol connections for secure data transfer.
Key Capabilities
- Message connection handler — Handles connections to message queuing topics with performance monitoring instrumentation.
- Per-topic processes — Each topic connection runs in its own process, isolating data sources and preventing one slow topic from blocking others.
- Configuration-driven — A configuration file defines the message protocol endpoint and the set of topic connections to subscribe to. Configuration files are environment-specific.
- Health monitoring — Provides health check endpoints for deployment orchestration and monitoring systems.
- Secure authentication — Uses certificate-based authentication with encrypted connections for secure data transfer.
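The per-topic isolation listed above can be sketched with Python's standard `multiprocessing` module. This is a minimal illustration under stated assumptions, not the service's actual implementation: the `run_topic` worker, the `build_topic_processes` helper, the topic names, and the endpoint value are all hypothetical, and the real subscription call is left as a placeholder.

```python
import multiprocessing as mp
import time


def run_topic(topic: str, endpoint: str) -> None:
    """Hypothetical worker: one long-lived subscription per process."""
    while True:
        # Placeholder for the real work: receive a message published on
        # `topic` at `endpoint` and forward it to the Data Transformer.
        time.sleep(1.0)


def build_topic_processes(endpoint: str, topics: list[str]) -> list[mp.Process]:
    """Create (but do not start) one process per topic.

    Each process gets its own worker, so a slow or failing topic
    cannot block the others.
    """
    return [
        mp.Process(
            target=run_topic,
            args=(topic, endpoint),
            name=f"topic-{topic}",
            daemon=True,  # workers are terminated when the parent process exits
        )
        for topic in topics
    ]
```

Calling `start()` on each returned process launches the workers. Because each worker owns its own connection, a stalled topic does not hold up the rest.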
Data Flow
The client enterprise integration platform publishes messages to various topics. Data Stream Ingestion subscribes to multiple topics (such as traffic detector data, speed readings, and device status updates), and each topic subscription runs in its own isolated process. Received messages are forwarded to the Data Transformer for normalization and spatial tagging.
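As a rough sketch of this flow, the function below passes each received message downstream unchanged. The names `forward_messages` and `send_to_transformer` are hypothetical; the transport to the Data Transformer is injected as a callable because the source does not say how that hand-off is made.

```python
from typing import Callable, Iterable


def forward_messages(
    topic: str,
    messages: Iterable[bytes],
    send_to_transformer: Callable[[str, bytes], None],
) -> int:
    """Forward every message from one topic as-is; return how many were sent."""
    count = 0
    for payload in messages:
        # No parsing or normalization happens here; that is the
        # Data Transformer's job.
        send_to_transformer(topic, payload)
        count += 1
    return count
```

Keeping the forwarding step free of parsing logic matches the division of responsibilities described above: ingestion moves bytes, and transformation interprets them.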
Sources
Data Stream Ingestion subscribes to multiple message topics from the client enterprise integration platform:
| Source | Data Provided | Notes |
|---|---|---|
| Traffic Signal Detector Counts | Devices, Ways (traffic volume, flow) | Volume updated every 5 minutes per detector, standardized to 20-second intervals |
| Traffic Monitoring - Speed and Volume | Ways (speed, occupancy, volume, flow, productivity) | Detectors mapped to closest road; updated every 20 seconds |
| Traffic Monitoring - Travel Time | Ways (speed, travel time) | Mapping shared with speed and volume; instantaneous measures |
| Traffic Information System | Environment Objects (variable message sign data) | Variable message sign content |
| Freeway Control System | Environment Objects (variable message sign data) | Freeway control system signs |
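To make the interval arithmetic in the first row concrete: a 5-minute report covers 300 seconds, which is 15 intervals of 20 seconds. The source does not say how the counts are standardized or which service does it, so the even split below is purely an assumption for illustration, and `spread_volume` is a hypothetical helper.

```python
def spread_volume(
    total_count: float,
    report_seconds: int = 300,
    interval_seconds: int = 20,
) -> list[float]:
    """Split one reported count evenly across fixed-length intervals.

    Assumption: traffic is treated as uniform within the reporting period.
    """
    if report_seconds % interval_seconds != 0:
        raise ValueError("report period must be a whole number of intervals")
    intervals = report_seconds // interval_seconds  # 300 // 20 == 15
    return [total_count / intervals] * intervals
```

Under this assumption, a detector reporting 45 vehicles over 5 minutes would contribute 3 vehicles to each of the 15 intervals.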
Configuration
The service is configured through a file that specifies:
- Message protocol endpoint — The connection string for the enterprise integration platform
- Topic connections — A list of topic subscriptions, each running as an independent process
Each topic handler subscribes to its data source and passes received messages through to the Data Transformer as raw or batched data.
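The sketch below shows one way such a configuration could be loaded and how a handler might group messages into batches. The JSON layout, the `endpoint`, `topics`, `name`, and `batch_size` keys, and the helper names are all assumptions made for illustration; the source does not specify the file format or how batching is controlled.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class TopicConfig:
    name: str
    batch_size: int = 1  # assumed convention: 1 means forward each message raw


@dataclass(frozen=True)
class StreamConfig:
    endpoint: str
    topics: tuple[TopicConfig, ...]


def parse_config(text: str) -> StreamConfig:
    """Parse a hypothetical JSON configuration document."""
    raw = json.loads(text)
    topics = tuple(
        TopicConfig(name=t["name"], batch_size=int(t.get("batch_size", 1)))
        for t in raw["topics"]
    )
    return StreamConfig(endpoint=raw["endpoint"], topics=topics)


def batched(messages: Iterable[bytes], size: int) -> Iterator[list[bytes]]:
    """Group messages into lists of at most `size`; the last batch may be short."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: list[bytes] = []
    for message in messages:
        batch.append(message)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch
```

For example, `parse_config('{"endpoint": "mq.example.internal", "topics": [{"name": "travel-time"}]}')` yields a single topic with the default `batch_size` of 1, meaning each message is passed through raw.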
Related Services
- Data Transformer — Downstream consumer that normalizes streamed data
- Data Ingestion — Sibling ingestion service for HTTP/FTP polling sources
- Data Redis Ingestion — Sibling ingestion service for real-time data sources
- Data Fusion — Fuses transformed data from multiple sources
- Experiment Manager — Central orchestration service
