Datasources
Datasources are the ingestion layer of SolutionEngine.
They connect external systems to workflows and convert incoming events into executable workflow input.
Without a datasource, production workflows have no live input path.
In production, treat every datasource as an operational service. It must be configured, monitored, and validated like any other runtime component.
Role in the Execution Pipeline
A datasource is responsible for four things:
- Listening to an external source continuously or at intervals
- Validating or normalizing incoming payloads
- Emitting events to the workflow runtime
- Preserving project isolation for every emitted event
End-to-end flow:
External Source -> Datasource -> Workflow Trigger Node -> Processing Graph -> Output
Scope and Ownership
Datasource resources are project-scoped.
Key rules:
- A datasource belongs to one project
- It can trigger one or more workflows within that same project
- It cannot trigger workflows in other projects
- Its runtime status is tracked independently
Typical statuses:
- inactive
- active
- error
Datasource Configuration Model
All datasource types follow the same design contract:
- Connection details: endpoint, topic, URL, path, broker
- Access details: auth, tokens, credentials, secrets
- Input behavior: polling interval, subscription mode, frame/sample rate
- Recovery behavior: reconnect delay, retries, timeout handling
Even when protocol details differ, operational behavior remains consistent: receive event, create workflow input, trigger execution.
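The shared contract above can be sketched as a single configuration shape. This is an illustrative model only; the field names are assumptions for the example, not the actual SolutionEngine schema.

```python
from dataclasses import dataclass

# Illustrative sketch of the shared datasource contract.
# Field names are assumptions, not the real SolutionEngine config schema.
@dataclass
class DatasourceConfig:
    # Connection details: endpoint, topic, URL, path, or broker
    endpoint: str
    # Access details: reference a managed secret, never the raw value
    credentials_ref: str
    # Input behavior
    poll_interval_s: float = 30.0
    # Recovery behavior
    reconnect_delay_s: float = 5.0
    max_retries: int = 3
    timeout_s: float = 10.0

config = DatasourceConfig(
    endpoint="https://api.example.com/events",
    credentials_ref="secrets/orders-api-token",
    poll_interval_s=60.0,
)
```

Whatever the protocol, the same four groups of settings appear; only their concrete values change per type.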
Core Datasource Types
HTTP Webhook
Push-based ingestion from external services.
Use when:
- Third-party systems send event callbacks
- You need near real-time API-to-workflow triggering
Common config:
- endpoint path
- authentication method
- payload format expectations
Operational notes:
- Validate request shape early
- Add signature/token validation for public endpoints
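Signature validation for a public webhook endpoint can be as small as an HMAC comparison. This is a generic sketch using Python's standard library, not a SolutionEngine API; the header and secret handling are assumptions.

```python
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Reject webhook calls whose HMAC-SHA256 signature does not match the body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking match position via timing
    return hmac.compare_digest(expected, signature_hex)
```

Run this check before any payload parsing, so unauthenticated requests never reach the workflow trigger.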
HTTP Polling (HTTP GET)
Interval-based pull from an API endpoint.
Use when:
- Source system cannot push events
- You need periodic synchronization
Common config:
- URL
- poll interval
- headers and auth
Operational notes:
- Avoid aggressive polling in production
- Include timeout and retry controls
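A polling loop with bounded retries and exponential backoff might look like the sketch below. The fetch callable and parameter names are illustrative; timeout handling belongs inside whatever HTTP client the fetch function wraps.

```python
import time
from typing import Callable

def poll_with_retries(fetch: Callable[[], bytes], max_retries: int = 3,
                      base_delay_s: float = 1.0) -> bytes:
    """Call fetch(); on transient failure, retry with exponential backoff
    (base, 2*base, 4*base, ...) up to max_retries extra attempts."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except OSError:
            if attempt == max_retries:
                raise
            time.sleep(base_delay_s * (2 ** attempt))
```

Backoff keeps a flaky upstream from being hammered at the raw poll interval, which is the practical meaning of "avoid aggressive polling".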
HTTP Event Stream (SSE)
Persistent event stream ingestion.
Use when:
- Source provides server-sent events
- You need long-lived low-latency event intake
Common config:
- stream URL
- auth mode
- reconnect delay
Operational notes:
- Always configure reconnect behavior
- Log stream disconnect reasons
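For reference, SSE frames are newline-delimited text: `data:` lines accumulate and a blank line terminates an event. A minimal parser for a received stream fragment, independent of any SolutionEngine internals:

```python
def parse_sse_events(raw: str) -> list[str]:
    """Extract event payloads from a server-sent-events stream fragment.
    'data:' lines accumulate; a blank line closes the current event."""
    events: list[str] = []
    buf: list[str] = []
    for line in raw.splitlines():
        if line.startswith("data:"):
            buf.append(line[5:].lstrip())
        elif line == "" and buf:
            events.append("\n".join(buf))
            buf = []
    return events
```

A trailing unterminated event stays buffered, which is why reconnect logic should track how much of the stream was actually consumed.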
MQTT Topic
Low-overhead pub/sub ingestion for IoT and telemetry.
Use when:
- Devices publish small, frequent payloads
- You need topic-based routing
Common config:
- broker URL
- topic
- QoS
- client credentials
Operational notes:
- Use deterministic topic naming
- Separate production and test topics
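Deterministic topic naming can be enforced with a small builder that rejects MQTT wildcard characters and keeps the environment as the leading segment. The segment layout here is an assumption for the example, not a mandated convention.

```python
def build_topic(env: str, site: str, device: str, channel: str) -> str:
    """Build a deterministic topic like 'prod/plant-a/cam-01/telemetry'.
    Rejects empty segments and MQTT wildcard/separator characters."""
    for part in (env, site, device, channel):
        if not part or any(c in part for c in "/+#"):
            raise ValueError(f"invalid topic segment: {part!r}")
    return f"{env}/{site}/{device}/{channel}"
```

Putting `env` first makes production/test separation a prefix filter rather than a convention people must remember.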
RTSP Camera Stream
Video ingestion from cameras and NVRs.
Use when:
- Building computer vision workflows
- Streaming live visual data
Common config:
- RTSP URL
- FPS sampling
- optional camera credentials
Operational notes:
- Start with lower FPS to control compute cost
- Persist critical frames to buckets for incident replay
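FPS sampling comes down to keeping a subset of source frames. A sketch of the index selection, assuming a fixed-rate source:

```python
def frames_to_keep(source_fps: float, target_fps: float, total_frames: int) -> list[int]:
    """Indices of frames to keep when downsampling source_fps to target_fps."""
    if target_fps >= source_fps:
        return list(range(total_frames))
    step = source_fps / target_fps  # keep roughly one frame per 'step'
    kept: list[int] = []
    next_keep = 0.0
    for i in range(total_frames):
        if i >= next_keep:
            kept.append(i)
            next_keep += step
    return kept
```

Downsampling a 30 FPS camera to 5 FPS keeps one frame in six, which cuts model invocations by the same factor.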
File Stream
Replays stored media as event input.
Use when:
- Testing with historical footage
- Repeatable model validation
Common config:
- bucket/path reference
- playback FPS
- looping options
Operational notes:
- Keep benchmark datasets immutable
- Use fixed playback settings for regression tests
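The replay behavior described above amounts to iterating stored frames in a fixed order, optionally looping. A minimal sketch with illustrative names:

```python
from typing import Iterator, Sequence

def replay(frames: Sequence[str], loop: bool = False, max_loops: int = 1) -> Iterator[str]:
    """Yield stored frames in order; with loop=True, repeat max_loops times.
    Fixed ordering is what makes regression runs comparable."""
    loops = max_loops if loop else 1
    for _ in range(loops):
        yield from frames
```

Because the frame order and loop count are fixed, two runs against the same immutable dataset produce identical workflow inputs.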
Selecting the Right Datasource
Use this decision guide:
- Need push events from apps: HTTP Webhook
- Need periodic sync from APIs: HTTP Polling
- Need real-time IoT feed: MQTT
- Need camera analytics: RTSP
- Need replay/testing: File Stream
- Need long-lived event feed: HTTP Event Stream
Runtime Reliability Guidelines
To run datasources safely in production:
- Define timeout values explicitly
- Configure retry/reconnect strategy
- Validate incoming payload schema near ingress
- Capture critical ingestion errors in logs
- Monitor datasource status and event throughput
Do not move malformed payload handling deep into the workflow. Reject or normalize as close to ingress as possible.
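Ingress-side validation can be a single normalize-or-reject step. The required fields below are assumptions for the example, not a SolutionEngine schema; the point is the shape of the check, not the fields.

```python
def validate_payload(payload: dict) -> dict:
    """Reject or normalize a payload at ingress, before the workflow runs.
    Required fields here are illustrative assumptions."""
    required = {"event_type": str, "timestamp": (int, float)}
    for name, types in required.items():
        if not isinstance(payload.get(name), types):
            raise ValueError(f"malformed payload: bad or missing {name!r}")
    # Normalize: pass only known keys downstream
    return {k: payload[k] for k in required}
```

Rejecting here means a bad payload produces one logged ingestion error instead of a half-executed workflow.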
Security Guidelines
- Never hardcode secrets in datasource config
- Prefer managed secrets or environment variables
- Restrict webhook exposure and apply auth validation
- Use least-privilege credentials for brokers and APIs
- Rotate credentials on a schedule
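Reading credentials from the environment rather than config might look like this; the variable names are illustrative, and a managed secrets service would replace the direct environment read in most deployments.

```python
import os

def load_broker_credentials() -> tuple[str, str]:
    """Read broker credentials from the environment instead of datasource config.
    Variable names are illustrative assumptions."""
    user = os.environ.get("MQTT_USERNAME")
    password = os.environ.get("MQTT_PASSWORD")
    if not user or not password:
        raise RuntimeError("broker credentials not configured in the environment")
    return user, password
```

Failing loudly at startup when credentials are absent is preferable to a datasource that silently falls into error status later.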
Performance and Cost Controls
- Use sampling for high-volume RTSP streams
- Filter early in trigger nodes to reduce unnecessary execution
- Avoid over-polling external APIs
- Use edge environments for latency-sensitive camera ingestion
- Benchmark throughput before scaling to full production traffic
Troubleshooting Checklist
If a datasource's status is error, validate in this order:
- Connectivity: host, port, DNS, firewall
- Auth: token validity, username/password, permissions
- Payload: valid JSON/object structure
- Rate and timeout settings
- Workflow trigger mapping and environment health
Common symptoms:
- No events: wrong endpoint/topic/path or auth failure
- Intermittent events: unstable network or reconnect misconfiguration
- High runtime load: over-sampling, over-polling, missing filtering
Production Pattern Example
RTSP Datasource -> Datasource Trigger -> Pre-Validation -> Run Model
-> Confidence Filter -> Save Media -> Alert API
Why this pattern is stable:
- Ingress is isolated
- Validation protects downstream nodes
- Filtering reduces false positives and cost
- Output path supports traceability and alerting
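The confidence-filter stage of this pattern reduces to a threshold over model outputs. A sketch with an assumed detection shape (a dict with a `confidence` field):

```python
def confidence_filter(detections: list[dict], threshold: float = 0.8) -> list[dict]:
    """Drop low-confidence model outputs before the save/alert stages.
    Missing confidence is treated as 0.0 and filtered out."""
    return [d for d in detections if d.get("confidence", 0.0) >= threshold]
```

Placing this filter before Save Media and Alert API is what keeps false positives from consuming storage and paging people.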
Related Pages
Datasource Reference Pages
Use these pages for field-by-field setup guidance:
