STUB DOCUMENT: This page is intentionally minimal and will be expanded with deeper technical details in a future update.
Alerts notify operators that action is needed. They are displayed prominently in the UI (environment view, twin detail, etc.). Alerts have a category:
  • technical (default): operational or hardware events (e.g. robot stuck, calibration needed).
  • business: business-process events (e.g. order delayed, SLA threshold exceeded).
Examples:
  • A robot needs calibration.
  • A robot got stuck and needs remote takeover.
  • A sensor reading is out of expected range.
  • An order has exceeded its SLA deadline.

Model

Every alert belongs to a workspace and must be attached to at least one of: twin, environment, or workflow.
Field | Type | Notes
name | string | Human-readable title
description | text | Optional details
alert_type | string | Machine-readable code (e.g. calibration_needed, robot_stuck)
severity | enum | info, warning, error, critical (default: warning)
status | enum | active, acknowledged, resolved, silenced (default: active)
source_type | enum | edge, simulation, cloud, workflow (default: edge)
category | enum | technical, business (default: technical)
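The field table and the anchoring rule can be sketched as a small Python model. This is an illustration only, not the backend's actual model: the class names and the workspace_uuid, twin_uuid, environment_uuid and workflow_uuid attribute names are assumptions, while the enum values and defaults come from the table above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Severity(str, Enum):
    INFO = "info"
    WARNING = "warning"
    ERROR = "error"
    CRITICAL = "critical"


class Status(str, Enum):
    ACTIVE = "active"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"
    SILENCED = "silenced"


class SourceType(str, Enum):
    EDGE = "edge"
    SIMULATION = "simulation"
    CLOUD = "cloud"
    WORKFLOW = "workflow"


class Category(str, Enum):
    TECHNICAL = "technical"
    BUSINESS = "business"


@dataclass
class Alert:
    # Attribute names for the anchors are hypothetical; the defaults
    # mirror the table above.
    workspace_uuid: str
    name: str
    alert_type: str
    description: str = ""
    severity: Severity = Severity.WARNING
    status: Status = Status.ACTIVE
    source_type: SourceType = SourceType.EDGE
    category: Category = Category.TECHNICAL
    twin_uuid: Optional[str] = None
    environment_uuid: Optional[str] = None
    workflow_uuid: Optional[str] = None

    def __post_init__(self) -> None:
        # Every alert needs at least one anchor: twin, environment, or workflow.
        if not (self.twin_uuid or self.environment_uuid or self.workflow_uuid):
            raise ValueError("alert needs a twin, environment, or workflow")
```

Constructing an Alert with only a workspace, name and alert_type raises ValueError, which is the "at least one of" rule expressed as a check.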

Lifecycle

  • Active: new alert, requires attention.
  • Acknowledged: an operator has seen it but the issue is not yet fixed.
  • Resolved: the root cause has been addressed (by edge device or operator).
  • Silenced: suppressed workspace-wide without resolving the root cause.

Idempotent resolve and silence

The POST /api/v1/alerts/{uuid}/resolve and POST /api/v1/alerts/{uuid}/silence endpoints are idempotent:
  • Resolving an already-resolved alert returns 200 with the alert body (no-op).
  • Resolving a silenced alert is allowed — it transitions the alert to resolved.
  • Silencing an already-silenced alert returns 200 with the alert body (no-op).
  • Silencing a resolved alert returns 400.
This means edge drivers and operator UIs can safely call resolve without checking current status first, avoiding race conditions when multiple actors act on the same alert concurrently.
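The four rules above amount to a small transition table. The sketch below models them as pure functions returning an (HTTP status, resulting alert status) pair. The function names are hypothetical, and the behaviour for active and acknowledged alerts (both transitions succeed) is an assumption, since only the resolved and silenced cases are spelled out here.

```python
from typing import Tuple


def resolve(current_status: str) -> Tuple[int, str]:
    """Model of POST /api/v1/alerts/{uuid}/resolve."""
    # Already resolved: 200 no-op. Silenced: allowed, moves to resolved.
    # Assumption: active / acknowledged alerts also move to resolved.
    return 200, "resolved"


def silence(current_status: str) -> Tuple[int, str]:
    """Model of POST /api/v1/alerts/{uuid}/silence."""
    if current_status == "resolved":
        # Silencing a resolved alert is rejected; the alert is unchanged.
        return 400, current_status
    # Already silenced: 200 no-op.
    # Assumption: active / acknowledged alerts move to silenced.
    return 200, "silenced"
```

Because resolve returns the same result however often it is called, two actors racing on the same alert both observe a 200 and the same final state.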

Workflow-scoped alerts

Alerts produced by a workflow’s send_alert node carry a workflow_uuid, and Alert.workflow is set on the backend. List alerts for a single workflow with:
GET /api/v1/alerts?workflow_uuid={workflow_uuid}
Generated edge workers call client.publish_alert(..., workflow_uuid=WORKFLOW_UUID, workflow_node_uuid=..., workflow_execution_uuid=...). The SDK forwards workflow_uuid as a top-level field, which sets the Alert.workflow foreign key, and merges workflow_node_uuid and workflow_execution_uuid into metadata for full provenance.
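A minimal sketch of that split, assuming a plain dict payload. build_alert_payload is a hypothetical helper, not the SDK's publish_alert, and the choice to let caller-supplied metadata keys win on conflict is an assumption.

```python
from typing import Any, Dict, Optional


def build_alert_payload(
    name: str,
    *,
    workflow_uuid: Optional[str] = None,
    workflow_node_uuid: Optional[str] = None,
    workflow_execution_uuid: Optional[str] = None,
    metadata: Optional[Dict[str, Any]] = None,
    **fields: Any,
) -> Dict[str, Any]:
    """Hypothetical helper showing where each workflow identifier ends up."""
    payload: Dict[str, Any] = {"name": name, **fields}
    merged = dict(metadata or {})  # copy so the caller's dict is not mutated

    # workflow_uuid travels as a top-level field (it sets the FK).
    if workflow_uuid is not None:
        payload["workflow_uuid"] = workflow_uuid

    # Node and execution identifiers are provenance, so they go into metadata.
    # Assumption: an existing caller-supplied key is left untouched.
    for key, value in (
        ("workflow_node_uuid", workflow_node_uuid),
        ("workflow_execution_uuid", workflow_execution_uuid),
    ):
        if value is not None:
            merged.setdefault(key, value)

    payload["metadata"] = merged
    return payload
```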

Source attribution: metadata.source_chain

STUB: Behavior is implemented; this section will be expanded with screenshots and the full kind catalogue.
Every alert raised by a workflow’s send_alert node carries an ordered source_chain under metadata describing each upstream node that fed into the decision. The chain is built generically — any node emitter that opts in by parking a _source_summary on its output dict contributes an entry, so the mechanism works equally well for camera-perception, audio-track, alert-triggered, manual, scheduled, or any future trigger source. Each entry includes:
  • kind — short identifier (camera_frame, audio_track, alert_trigger, manual_trigger, schedule_trigger, call_model, detection_event_gate, conditional).
  • node_uuid — the workflow node that contributed the entry.
  • Kind-specific fields. Examples: twin_uuid + sensor for camera/audio triggers; model_uuid + model_name + a capped detections_sample for call_model; mode + matched_classes + cooldown_seconds for detection_event_gate.
Example payload for a camera_frame → call_model → send_alert workflow:
{
  "metadata": {
    "workflow_uuid": "...",
    "workflow_node_uuid": "...",
    "workflow_execution_uuid": "...",
    "source_chain": [
      {
        "kind": "camera_frame",
        "node_uuid": "...",
        "twin_uuid": "...",
        "sensor": "front",
        "frame_ts": 1746651234.5
      },
      {
        "kind": "call_model",
        "node_uuid": "...",
        "model_uuid": "...",
        "model_name": "YOLOv8n",
        "modality": "image",
        "output_format": "detections",
        "detections_total": 2,
        "detections_sample": [
          { "label": "person", "confidence": 0.91, "bbox_pixels": [12, 34, 56, 78] }
        ]
      }
    ]
  }
}
Notes:
  • The chain is purely additive. Adding _source_summary does not change dedupe_hash (computed over name + description + alert_type + severity + status + twin_uuid), so dedupe behavior is unchanged.
  • detections_sample is capped to keep alert metadata bounded; the original detections_total is preserved.
  • User-supplied static metadata keys always win on conflict — the source chain only fills in metadata['source_chain'] when not already provided.
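The opt-in mechanism and the three notes above can be sketched as follows. This is not the workflow runtime's code: the function names and the cap of 5 are hypothetical, and each upstream node output is assumed to be a plain dict.

```python
from typing import Any, Dict, List

DETECTIONS_SAMPLE_CAP = 5  # hypothetical value; the real cap is not stated here


def build_source_chain(upstream_outputs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Collect one entry per upstream node that parked a _source_summary."""
    chain: List[Dict[str, Any]] = []
    for output in upstream_outputs:
        summary = output.get("_source_summary")
        if summary is None:
            continue  # node did not opt in, so it contributes nothing
        entry = dict(summary)
        if "detections_sample" in entry:
            sample = list(entry["detections_sample"])
            # Preserve the original count before truncating the sample.
            entry.setdefault("detections_total", len(sample))
            entry["detections_sample"] = sample[:DETECTIONS_SAMPLE_CAP]
        chain.append(entry)
    return chain


def attach_source_chain(
    user_metadata: Dict[str, Any], upstream_outputs: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Fill metadata['source_chain'] only when the user did not provide one."""
    merged = dict(user_metadata)
    if "source_chain" not in merged:
        merged["source_chain"] = build_source_chain(upstream_outputs)
    return merged
```

Since the chain lives only under metadata, none of the fields that feed dedupe_hash are touched.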

Edge Core system alerts

STUB: This section will be expanded with the full alert type catalogue.
Edge Core automatically raises technical alerts for operational issues. These are category: technical, source_type: edge unless noted otherwise.
alert_type | Severity | Source | Trigger
driver_start_failure | error | edge | Driver container cannot reach a stable running state
driver_restart_loop | error | edge | Driver restarts too frequently (circuit-breaker tripped)
driver_health | warning | edge | Driver container stopped unexpectedly
model_download_failure | warning | edge | Required ML model could not be downloaded
worker_start_failure | warning | edge | Worker container failed to start
edge_core_restart | info | cloud | Backend has accepted an edge-core restart request and is tracking the lifecycle (see below)

edge_core_restart lifecycle alert

POST /api/v1/edges/{uuid}/restart-core creates a single edge_core_restart alert that tracks the restart end-to-end. The alert is scoped to the environment of the first bound twin (alerts always need a twin / environment / workflow anchor), with source_type: cloud because the request originates from the backend, not from the edge. The current phase is carried on metadata.phase:
Phase | Written by | Meaning
requested | Backend, on accepting the API call | MQTT restart command queued
in_progress | Edge-core, when it picks the command off MQTT | Restart actually running
completed | Edge-core, after a successful relaunch | Happy path; alert is resolved
failed | Edge-core, if _perform_edge_core_restart raises | Restart attempt failed; alert resolved with phase: failed
timed_out | Backend reaper (reap_stuck_edge_core_restart_alerts) | Alert stuck in requested / in_progress past 5 min; typically the MQTT command never landed or edge-core crashed mid-restart
The same alert_uuid is returned in the API response and included in the MQTT command payload, so edge-core can transition the alert without a lookup.
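The timed_out row can be illustrated with a small in-memory reaper. This is a sketch under stated assumptions, not the backend's reap_stuck_edge_core_restart_alerts: alerts are plain dicts, age is measured from a hypothetical created_at field, and the alert's status is left untouched because this page does not say what the reaper does with it.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List

STUCK_PHASES = {"requested", "in_progress"}
RESTART_TIMEOUT = timedelta(minutes=5)


def reap_stuck_restarts(alerts: List[Dict[str, Any]], now: datetime) -> List[str]:
    """Mark edge_core_restart alerts stuck past the timeout as timed_out."""
    reaped: List[str] = []
    for alert in alerts:
        if alert["alert_type"] != "edge_core_restart":
            continue
        meta = alert["metadata"]
        if meta.get("phase") not in STUCK_PHASES:
            continue  # completed / failed / already timed out
        # Assumption: the clock starts at created_at.
        if now - alert["created_at"] <= RESTART_TIMEOUT:
            continue
        meta["previous_phase"] = meta["phase"]
        meta["phase"] = "timed_out"
        meta["timed_out_at"] = now.isoformat()
        reaped.append(alert["uuid"])
    return reaped
```

A periodic job could call it as reap_stuck_restarts(alerts, datetime.now(timezone.utc)).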

Restart-driven pre-resolution

When restart-core is accepted, the backend also pre-resolves any active alerts on the requesting edge’s bound twins whose root cause a clean container relaunch genuinely fixes:
  • driver_start_failure
  • driver_restart_loop
  • worker_start_failure
This keeps the workbench from carrying stale failure noise that the operator’s restart just made irrelevant. Each pre-resolved alert is annotated with:
{
  "metadata": {
    "resolved_by_restart_request_id": "<request_id>",
    "resolved_by_restart_alert_uuid": "<edge_core_restart uuid>"
  }
}
and the edge_core_restart alert’s own metadata gets pre_resolved_alert_uuids: [...] so the audit trail links both ways. Other alert types are deliberately not pre-resolved (a restart doesn’t actually fix them, and silently closing them would lie to the operator). The authoritative allow-list and excluded set live on EDGE_CORE_RESTART_RESOLVABLE_ALERT_TYPES; change the constant and this section together.
The response schema is EdgeCoreRestartResponseSchema. Its alert_uuid is null when no environment can be resolved for the edge (typically because no twin is bound yet); the restart still happens, but it is untracked.
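A minimal sketch of the pre-resolution pass, assuming alerts are plain dicts. The frozenset below is a stand-in for the real constant (whose actual shape is not shown on this page), and pre_resolve_for_restart is a hypothetical name.

```python
from typing import Any, Dict, List, Set

# Stand-in for the backend constant; the three types come from the list above.
EDGE_CORE_RESTART_RESOLVABLE_ALERT_TYPES = frozenset(
    {"driver_start_failure", "driver_restart_loop", "worker_start_failure"}
)


def pre_resolve_for_restart(
    alerts: List[Dict[str, Any]],
    bound_twin_uuids: Set[str],
    request_id: str,
    restart_alert: Dict[str, Any],
) -> List[str]:
    """Resolve active, restart-fixable alerts and link them both ways."""
    resolved: List[str] = []
    for alert in alerts:
        if alert["status"] != "active":
            continue
        if alert["twin_uuid"] not in bound_twin_uuids:
            continue
        if alert["alert_type"] not in EDGE_CORE_RESTART_RESOLVABLE_ALERT_TYPES:
            continue  # a restart does not fix this type, so leave it open
        alert["status"] = "resolved"
        alert.setdefault("metadata", {}).update(
            {
                "resolved_by_restart_request_id": request_id,
                "resolved_by_restart_alert_uuid": restart_alert["uuid"],
            }
        )
        resolved.append(alert["uuid"])
    # Back-link from the restart alert to everything it closed.
    restart_alert.setdefault("metadata", {})["pre_resolved_alert_uuids"] = resolved
    return resolved
```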

Frontend display contract

Read this before adding a new alert_type. Every alert that the backend or an edge driver can raise must either render correctly with the generic alert card, or have a documented specialised renderer linked from this section. Skipping this step is how we end up with overlapping alert types that nobody can untangle six months later.
The workbench renders every alert through a single AlertCard (cyberwave-frontend/components/alerts/alert.tsx). The generic path uses only these fields and needs nothing else from the producer:
Field | Display
severity | Left border colour + leading icon (info / warning / error / critical; see alert-display.ts)
status | Status pill (Active / Acknowledged / Resolved / Silenced). resolved and silenced also dim the card to 60% opacity.
source_type | Small outline badge
alert_type | Small mono-font outline badge. This is what tells the operator which producer raised the alert, so make it specific and stable.
name | Card title
description | Body text. URLs are auto-linked.
media | Inline image (.png/.gif) or autoplaying muted video (.mp4).
created_at | Relative timestamp with absolute on hover.
metadata.buttons | Generic action buttons that publish back on cyberwave/twin/{twin_uuid}/command as a button command; see Metadata buttons.
The new edge_core_restart alert deliberately uses only the generic path. The lifecycle is encoded in metadata.phase, so any UI that wants to surface “Restart in progress vs. completed vs. failed” can read that field without a dedicated component, and the audit trail (request_id, pre_resolved_alert_uuids, previous_phase, timed_out_at) is plain JSON.
Specialised renderers exist for a handful of historical alert types that need bespoke interactions; each one is a known cost, not a pattern to copy:
alert_type | Specialised behaviour
robot_setup | Spinner replaces the severity icon while status is active / acknowledged
robot_setup_done | Green success panel with a check mark and amber/green accent
calibration_needed | Inline Next / Complete / Restart calibration actions that publish to cyberwave/twin/{twin_uuid}/command instead of using metadata buttons
camera_default_device | Default is fine action that writes the chosen device into twin metadata and resolves the alert
driver_starting | Phase-aware spinner with rewritten copy (Downloading driver image… / Installing driver image… / Starting driver container…) driven by metadata.phase, plus a byte/percent suffix when the pull is mid-flight; see driver_starting progress metadata

Adding a new alert_type

Before you introduce a new alert_type:
  1. Check this page. If an existing type fits — even loosely — extend its metadata instead of forking a new code.
  2. Default to the generic path. Pick a sensible severity, write a clear name + description, and put any structured state on metadata. The generic card will render it correctly.
  3. Only add a specialised renderer when the interaction itself is novel (e.g. a calibration step that needs custom MQTT commands). Generic metadata.buttons cover most operator-confirms-something flows without new code.
  4. Document the type in this page before merging — both the catalogue row above and, if specialised, the table in this section. A new alert_type constant that does not have a row here should not pass review.

driver_starting progress metadata

Edge-core writes byte-aggregated docker pull progress directly onto the active driver_starting alert’s metadata, so the workbench renders a live "Downloading driver image (cyberwaveos/ugv-driver:dev) — 745 MB of 1.55 GB (47%)" line without any extra round-trip. The fields are:
Field | Type | Meaning
phase | string | Lifecycle marker. Pull walks pull_started → downloading → installing → pull_complete → pull_stream_finished, then container_starting → driver_running once the container is up. Failure phases are pull_spawn_failed / pull_timed_out / pull_exit_error.
image | string | The driver image being pulled.
progress_percent | int (0–100) | Integer percent of downloaded_bytes / total_bytes. 0 until the first docker pull byte-bar lands; 100 at pull_complete.
downloaded_bytes / total_bytes | int | Raw byte totals aggregated across all layers.
downloaded_human / total_human | string or null | Same values rendered in SI units ("745 MB", "1.55 GB"). null before docker emits any byte-bar (e.g. when every layer is Already exists).
layers_total / layers_complete | int | Number of layers seen so far and number with Pull complete. Only used to caption the brief installing phase between the last byte landing and Status: Downloaded ….
last_docker_pull_line | string | Most recent raw docker pull line (truncated to 500 chars), kept for debugging.
The frontend renderer is in cyberwave-frontend/components/alerts/alert.tsx — search for getDriverStartingDisplayText.
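How the byte fields relate to each other can be sketched as below. This is not edge-core's code: human_si and progress_fields are hypothetical names, rounding the percentage down is an assumption, and a total of zero is used as a stand-in for "docker has not emitted a byte-bar yet".

```python
from typing import Any, Dict

_UNITS = ("B", "kB", "MB", "GB", "TB")


def human_si(num_bytes: int) -> str:
    """Render a byte count in SI (powers of 1000) units with about 3 digits."""
    value = float(num_bytes)
    unit = _UNITS[0]
    for unit in _UNITS:
        if value < 1000 or unit == _UNITS[-1]:
            break
        value /= 1000
    if unit == "B":
        return f"{int(value)} B"
    digits = 0 if value >= 100 else 1 if value >= 10 else 2
    return f"{value:.{digits}f} {unit}"


def progress_fields(downloaded_bytes: int, total_bytes: int) -> Dict[str, Any]:
    """Build the byte-related metadata fields for a driver_starting alert."""
    if total_bytes <= 0:
        # No byte-bar yet: percent stays 0 and the human strings are null.
        return {
            "progress_percent": 0,
            "downloaded_bytes": downloaded_bytes,
            "total_bytes": total_bytes,
            "downloaded_human": None,
            "total_human": None,
        }
    # Assumption: integer percent is floored and clamped to 100.
    percent = min(100, downloaded_bytes * 100 // total_bytes)
    return {
        "progress_percent": percent,
        "downloaded_bytes": downloaded_bytes,
        "total_bytes": total_bytes,
        "downloaded_human": human_si(downloaded_bytes),
        "total_human": human_si(total_bytes),
    }
```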

MQTT

Edge devices create and resolve alerts via MQTT. Topic pattern:
cyberwave/twin/{twin_uuid}/alert
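A small sketch of filling in the topic pattern and serialising an alert for publishing. Only the topic pattern comes from this page; the JSON payload shape is an assumption that mirrors the Model table, and alert_topic and encode_alert are hypothetical helper names. The wire schema itself is not specified here.

```python
import json

ALERT_TOPIC_TEMPLATE = "cyberwave/twin/{twin_uuid}/alert"


def alert_topic(twin_uuid: str) -> str:
    """Fill the twin UUID into the alert topic pattern."""
    return ALERT_TOPIC_TEMPLATE.format(twin_uuid=twin_uuid)


def encode_alert(name: str, alert_type: str, severity: str = "warning") -> str:
    """Serialise an alert as JSON (assumed payload shape, not the real schema)."""
    return json.dumps(
        {"name": name, "alert_type": alert_type, "severity": severity},
        sort_keys=True,
    )
```

An MQTT client of your choice would then publish encode_alert(...) to alert_topic(twin_uuid).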