Overview

The Model Playground exposes the following endpoints for working with model inference:

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/v1/mlmodels/{uuid}/run | Run the model synchronously or queue an async workload |
| POST | /api/v1/mlmodels/{uuid}/evaluate | Start a benchmark evaluation run |
| GET | /api/v1/mlmodels/{uuid}/weights | Get a signed URL for the model’s checkpoint weights |
| GET | /api/v1/mlmodels/structured-actions | List available structured task IDs (public) |
| GET | /api/v1/mlmodels/edge-runtimes | List well-known edge runtime options (public) |

POST /api/v1/mlmodels/{uuid}/run

Run a model interactively. This endpoint backs the in-app playground and can be called directly from the SDK.

- Sync path (200): direct provider call (Google GenAI, OpenAI, or a custom HTTP endpoint).
- Async path (202): used when a cloud-node workload is required (e.g. im2mesh, audio). Returns a workload_uuid to poll.

Edge-only models (is_edge_compatible && !is_cloud_compatible) are rejected with 400 from this endpoint. Use the Python SDK (cw.models.load(entry).predict(...)) or the cyberwave model bind CLI instead.

Request body — MLModelRunSchema

At least one of prompt, image_base64, or image_url must be provided for most model types.

Core inputs

| Field | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt |
| image_base64 | string | Base64-encoded image (data URI or raw) |
| image_url | string | Publicly accessible image URL |
| audio_base64 | string | Base64-encoded audio (audio runs are async) |
| audio_url | string | Publicly accessible audio URL |
| language | string | Language hint for speech/translation tasks |
| task | string | Generic task hint passed to the provider |
| structured_task | string | Structured task ID from /mlmodels/structured-actions |
| twin_uuid | string | When set, the backend auto-pulls the twin’s latest state for VLA/IK models |
| params | object | Provider-specific parameters (e.g. temperature, max_tokens) |
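
As a rough illustration of how the core inputs combine, the sketch below assembles a request body and enforces the "at least one of prompt, image_base64, or image_url" rule on the client side. The helper name `build_run_payload` and its arguments are hypothetical, and the image bytes are a placeholder rather than a real image.

```python
import base64
import json


def build_run_payload(prompt=None, image_bytes=None, image_url=None,
                      structured_task=None, params=None):
    """Assemble a JSON-serializable body for POST /api/v1/mlmodels/{uuid}/run.

    Hypothetical helper: only the field names come from the schema table.
    """
    if prompt is None and image_bytes is None and image_url is None:
        # Client-side mirror of the documented "at least one of" rule.
        raise ValueError("provide at least one of prompt, image_bytes, image_url")

    payload = {}
    if prompt is not None:
        payload["prompt"] = prompt
    if image_bytes is not None:
        # Raw base64 without a data-URI prefix; the field accepts either form.
        payload["image_base64"] = base64.b64encode(image_bytes).decode("ascii")
    if image_url is not None:
        payload["image_url"] = image_url
    if structured_task is not None:
        payload["structured_task"] = structured_task
    if params:
        payload["params"] = params
    return payload


# Placeholder bytes stand in for a real image file read from disk.
body = build_run_payload(
    prompt="Detect all boxes",
    image_bytes=b"not-a-real-image",
    structured_task="detect_objects",
    params={"temperature": 0.2},
)
print(json.dumps(body, indent=2))
```

Optional fields are omitted rather than sent as null, which keeps the body minimal; whether the server treats null and absent fields identically is not stated here.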

Extended perception envelope (optional)

These fields let runners build richer spatial context without a second API revision:

| Field | Type | Description |
| --- | --- | --- |
| frames | MLModelFrameSchema[] | Ordered extra frames (wrist + overhead views, short clips). image_base64/image_url is frame 0. |
| depth_base64 | string | 16-bit PNG depth map aligned to the primary frame |
| camera_intrinsics | CameraIntrinsicsSchema | Pinhole intrinsics (fx, fy, cx, cy, width, height) |
| camera_pose | CameraPoseSchema | World pose of the primary camera (position + quaternion) |
| history | HistoryTurnSchema[] | Prior conversation/context turns for stateful perception |
| robot_state | RobotStateSchema | Robot joint state + gripper state. Server auto-fills from twin_uuid when omitted. |
| robot_context | RobotContextSchema | Embodiment context (schema, capabilities summary) |

MLModelFrameSchema

| Field | Type | Description |
| --- | --- | --- |
| image_base64 | string | Base64 image |
| image_url | string | Image URL |
| timestamp | int | Monotonic ms offset from first frame |
| camera_id | string | Camera label, e.g. "wrist", "overhead" |
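
The following sketch shows one way a multi-view request could be put together from the frame fields above. The helpers `make_frame` and `build_multiview_payload` are hypothetical, the URLs are placeholders, and sorting by ascending timestamp is an assumption about what "ordered" means.

```python
def make_frame(camera_id, timestamp_ms, image_url=None, image_base64=None):
    """Build one dict shaped like MLModelFrameSchema (hypothetical helper)."""
    if image_url is None and image_base64 is None:
        raise ValueError("a frame needs image_url or image_base64")
    frame = {"camera_id": camera_id, "timestamp": int(timestamp_ms)}
    if image_url is not None:
        frame["image_url"] = image_url
    if image_base64 is not None:
        frame["image_base64"] = image_base64
    return frame


def build_multiview_payload(prompt, primary_image_url, extra_frames):
    """Primary image is frame 0; extra views travel in the frames list."""
    # Assumption: "ordered" means ascending timestamp offset.
    frames = sorted(extra_frames, key=lambda f: f["timestamp"])
    return {"prompt": prompt, "image_url": primary_image_url, "frames": frames}


payload = build_multiview_payload(
    prompt="Point to the red cube",
    primary_image_url="https://example.com/scene.jpg",
    extra_frames=[
        make_frame("overhead", 66, image_url="https://example.com/overhead.jpg"),
        make_frame("wrist", 33, image_url="https://example.com/wrist.jpg"),
    ],
)
print([f["camera_id"] for f in payload["frames"]])
```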

Response — 200 OK (sync)

{
  "status": "completed",
  "output_format": "json",
  "output": [
    { "label": "box", "confidence": 0.94, "bbox": [120, 45, 380, 290] }
  ],
  "raw": "...",
  "actions": null
}

MLModelRunResultSchema

| Field | Type | Description |
| --- | --- | --- |
| status | string | Always "completed" |
| output_format | string | See Output formats below |
| output | any | Parsed model output; type depends on output_format |
| raw | string \| null | Raw provider response (for debugging) |
| actions | MotionEpisodeSchema \| null | Server-resolved motion episode when the runner computed it directly (VLA / spatial reasoners). null for bare perception responses. |

Response — 202 Accepted (async / cloud-node)

{
  "status": "queued",
  "workload_uuid": "w1b2c3d4-...",
  "poll_url": "/api/v1/cloud-node-workloads/w1b2c3d4-..."
}
Poll GET /api/v1/cloud-node-workloads/{workload_uuid} until status == "completed". The result payload lands in workload.command_params.result.
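
A minimal polling loop for the async path might look like the sketch below. It assumes the poll response is a JSON object with a top-level status and a command_params.result entry, as the sentence above suggests. The `fetch_workload` callable is a hypothetical stand-in for the HTTP GET, and the fake responses (including the result content) are invented so the example runs offline. Failure statuses are not described here, so the sketch relies on a timeout alone.

```python
import time


def poll_workload(fetch_workload, interval_s=2.0, timeout_s=300.0, sleep=time.sleep):
    """Call fetch_workload() until status == "completed", then return the result."""
    deadline = time.monotonic() + timeout_s
    while True:
        workload = fetch_workload()
        if workload.get("status") == "completed":
            return workload["command_params"]["result"]
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"workload still {workload.get('status')!r} after {timeout_s}s"
            )
        sleep(interval_s)


# Stand-in for real HTTP calls: two "queued" responses, then a completed one.
_fake_responses = iter([
    {"status": "queued"},
    {"status": "queued"},
    {"status": "completed", "command_params": {"result": {"note": "placeholder"}}},
])
result = poll_workload(lambda: next(_fake_responses), sleep=lambda _s: None)
print(result)
```

Passing the fetch function and the sleep function as arguments keeps the loop testable without a network or real delays.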

Response codes

| Code | Meaning |
| --- | --- |
| 200 | Inference completed; response body is MLModelRunResultSchema |
| 202 | Workload queued; poll poll_url |
| 400 | Edge-only model, bad payload, or provider error |
| 403 | Model not accessible or feature flag not enabled |
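
A client could branch on these codes roughly as follows. The function name `interpret_run_response` and the tagged-tuple return shape are illustrative choices, and because the error body format is not specified here, the sketch simply echoes it in the exception message.

```python
def interpret_run_response(status_code, body):
    """Map a /run HTTP status and decoded JSON body to a tagged tuple."""
    if status_code == 200:
        # Sync path: the body is an MLModelRunResultSchema.
        return ("completed", body["output_format"], body["output"])
    if status_code == 202:
        # Async path: the caller should poll poll_url.
        return ("queued", body["workload_uuid"], body["poll_url"])
    if status_code == 400:
        raise ValueError(f"edge-only model, bad payload, or provider error: {body}")
    if status_code == 403:
        raise PermissionError(f"model not accessible or feature flag off: {body}")
    raise RuntimeError(f"undocumented status {status_code}: {body}")


sync = interpret_run_response(
    200, {"status": "completed", "output_format": "text", "output": "ok"}
)
queued = interpret_run_response(
    202,
    {
        "status": "queued",
        "workload_uuid": "w1b2c3d4-...",
        "poll_url": "/api/v1/cloud-node-workloads/w1b2c3d4-...",
    },
)
print(sync[0], queued[0])
```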

Output formats

The output_format field in the run result tells you how to interpret output:

| output_format | output type | Description |
| --- | --- | --- |
| text | string | Plain text response |
| json | object | Arbitrary structured JSON |
| image | string | Data URL (data:image/...;base64,...) |
| points | [[y, x], …] | Normalized 0-1000 point coordinates (Gemini ER / spatial tasks) |
| boxes | [[y1,x1,y2,x2], …] | Normalized 0-1000 bounding boxes |
| masks | object[] | Segmentation masks |
| detections_3d | object[] | 3D bounding boxes with position + orientation |
| grasps | object[] | Grasp candidates (position, orientation, width) |
| trajectory | object[] | Planned end-effector trajectory |
| plan_steps | object[] | High-level plan steps |
| relations | object[] | Detected spatial relations between objects |
| mesh | string | GLB URL (for im2mesh models) |
| motion_episode | object | Recorded or generated motion episode |
| raw | string | Unprocessed provider output |

Example: VLM text response

{
  "status": "completed",
  "output_format": "text",
  "output": "The robot arm is above the red cube."
}

Example: Spatial reasoner — points

{
  "status": "completed",
  "output_format": "points",
  "output": [
    [312, 450],
    [680, 210]
  ]
}
Points are [y, x] normalized to 0-1000, where [0, 0] is the top-left corner.
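
To draw these points on the source image they need to be rescaled to pixels. The sketch below does this for a hypothetical 1280x720 frame and applies the same mapping to the [y1, x1, y2, x2] layout of the boxes format. It assumes 1000 maps to the full image width or height, and rounding to the nearest integer is a choice made for the example.

```python
def point_to_pixels(point, width, height):
    """Convert a normalized [y, x] point (0-1000) to integer (x_px, y_px)."""
    y, x = point
    return (round(x * width / 1000), round(y * height / 1000))


def box_to_pixels(box, width, height):
    """Convert a normalized [y1, x1, y2, x2] box to (left, top, right, bottom)."""
    y1, x1, y2, x2 = box
    left, top = point_to_pixels((y1, x1), width, height)
    right, bottom = point_to_pixels((y2, x2), width, height)
    return (left, top, right, bottom)


# Points from the example response above, on an assumed 1280x720 image.
points = [[312, 450], [680, 210]]
pixels = [point_to_pixels(p, width=1280, height=720) for p in points]
print(pixels)  # [(576, 225), (269, 490)]
```

Note the axis swap: the API lists y first, while most drawing libraries expect (x, y).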

Example: Detections (structured_task = "detect_objects")

{
  "status": "completed",
  "output_format": "json",
  "output": [
    { "label": "bottle", "confidence": 0.92, "bbox": [50, 120, 200, 480] },
    { "label": "cup", "confidence": 0.88, "bbox": [310, 95, 420, 360] }
  ]
}
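
One way to consume a detection response like this is to keep only confident detections, sorted by score. The helper `confident_labels` and the 0.9 threshold are illustrative. The coordinate order of bbox is not specified here, so the sketch leaves it untouched.

```python
def confident_labels(run_result, min_confidence=0.9):
    """Return (label, confidence) pairs at or above the threshold, best first."""
    kept = [d for d in run_result["output"] if d["confidence"] >= min_confidence]
    kept.sort(key=lambda d: d["confidence"], reverse=True)
    return [(d["label"], d["confidence"]) for d in kept]


# The example response above, as a Python dict.
run_result = {
    "status": "completed",
    "output_format": "json",
    "output": [
        {"label": "bottle", "confidence": 0.92, "bbox": [50, 120, 200, 480]},
        {"label": "cup", "confidence": 0.88, "bbox": [310, 95, 420, 360]},
    ],
}
print(confident_labels(run_result))  # [('bottle', 0.92)]
```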

POST /api/v1/mlmodels/{uuid}/evaluate

Start an asynchronous benchmark evaluation. Returns 202 Accepted immediately with a poll URL.

Request body — MLModelEvaluateSchema

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| benchmark_slug | string | "perception-smoke-v1" | Benchmark suite to run against |

Response — 202 Accepted

{
  "status": "queued",
  "workload_uuid": "e1f2a3b4-...",
  "poll_url": "/api/v1/cloud-node-workloads/e1f2a3b4-..."
}
Poll GET /api/v1/cloud-node-workloads/{workload_uuid} until status == "completed". Results are in workload.command_params.result and include the aggregate pass rate and a per-case breakdown.

To list available benchmark suites, call GET /api/v1/mlmodels/benchmark-suites (public, no auth).

GET /api/v1/mlmodels/{uuid}/weights

Get a signed download URL for the model’s checkpoint weights (tar archive containing config.json + adapter files from training).

Response — 200 OK

{
  "signed_url": "https://storage.googleapis.com/...?Expires=...",
  "expires_at": "2026-05-14T22:00:00+00:00",
  "checkpoint_path": "models/a1b2c3d4/.../checkpoint.tar"
}
The signed URL is valid for 2 hours.
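
Because the URL expires, a client that caches the response may want to check expires_at before starting a long download. The sketch below parses the timestamp with the standard library. The helper names and the 300-second safety margin are assumptions made for illustration.

```python
from datetime import datetime, timezone


def seconds_until_expiry(weights_response, now=None):
    """Seconds left before the signed URL expires (negative if already expired)."""
    expires_at = datetime.fromisoformat(weights_response["expires_at"])
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires_at - now).total_seconds()


def needs_refresh(weights_response, margin_s=300, now=None):
    """True when fewer than margin_s seconds remain; request a fresh URL then."""
    return seconds_until_expiry(weights_response, now) < margin_s


# The example response above, with a fixed "now" so the result is reproducible.
weights_response = {
    "signed_url": "https://storage.googleapis.com/...?Expires=...",
    "expires_at": "2026-05-14T22:00:00+00:00",
    "checkpoint_path": "models/a1b2c3d4/.../checkpoint.tar",
}
issued_at = datetime(2026, 5, 14, 20, 0, tzinfo=timezone.utc)
print(seconds_until_expiry(weights_response, now=issued_at))  # 7200.0
```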

Response codes

| Code | Meaning |
| --- | --- |
| 200 | Signed URL generated |
| 404 | Model has no checkpoint, or checkpoint not found in storage |

GET /api/v1/mlmodels/structured-actions

Return the canonical catalog of structured_task values understood by the playground. No authentication required. Consumed by the frontend and Python SDK.
{
  "detect_objects": {
    "label": "Detect objects",
    "description": "...",
    "output_format": "json"
  },
  "detect_points": { "..." },
  "caption": { "..." }
}
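
Since the catalog is keyed by task ID and each entry carries an output_format, a client can index it by format to pick a suitable task. In the sketch below, only the detect_objects entry mirrors the response above; example_task and its values are hypothetical, as is the helper name.

```python
def tasks_by_output_format(catalog):
    """Group structured task IDs by the output_format they produce."""
    grouped = {}
    for task_id, info in catalog.items():
        grouped.setdefault(info["output_format"], []).append(task_id)
    return grouped


# Sample catalog; "example_task" is a made-up entry for illustration only.
catalog = {
    "detect_objects": {
        "label": "Detect objects",
        "description": "...",
        "output_format": "json",
    },
    "example_task": {
        "label": "Example task",
        "description": "...",
        "output_format": "text",
    },
}
print(tasks_by_output_format(catalog))
```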

Python SDK

Two patterns map to this API.

1. Playground handle (POST /api/v1/mlmodels/{uuid}/run with full request control):

from cyberwave import Cyberwave

cw = Cyberwave()
handle = cw.models.playground("acme/models/gemini-robotics-er")

# Returns MLModelRunResultSchema or MLModelRunQueuedSchema (not PredictionResult)
result = handle.run(
    image="scene.jpg",
    prompt="Point to the red cube",
    structured_task="detect_points",
)

2. Unified load() + predict() (routes cloud slugs through the playground-backed inference path where applicable; edge weights run locally):

from cyberwave import Cyberwave
from PIL import Image

cw = Cyberwave()
img = Image.open("frame.jpg").convert("RGB")

entry = cw.models.list(deployment="cloud")[0]
model = cw.models.load(entry)  # or cw.models.load("acme/models/…")
pred = model.predict(img, confidence=0.25)
print(pred.describe())
For edge-only weights, load().predict() runs on-device; playground(...).run() is for the HTTP playground contract (prompts, structured tasks, async workloads).

cURL examples

Text-only VLM run

curl -X POST "$CYBERWAVE_API_URL/api/v1/mlmodels/$MODEL_UUID/run" \
  -H "Authorization: Bearer $CYBERWAVE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What objects do you see on the workbench?"}'

Vision run with image URL

curl -X POST "$CYBERWAVE_API_URL/api/v1/mlmodels/$MODEL_UUID/run" \
  -H "Authorization: Bearer $CYBERWAVE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Detect all boxes",
    "image_url": "https://example.com/workspace_frame.jpg",
    "structured_task": "detect_objects"
  }'

Spatial reasoning run (points)

curl -X POST "$CYBERWAVE_API_URL/api/v1/mlmodels/$MODEL_UUID/run" \
  -H "Authorization: Bearer $CYBERWAVE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Point to the red cube",
    "image_url": "https://example.com/scene.jpg",
    "structured_task": "detect_points"
  }'
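
For callers who prefer Python without the SDK, the text-only request above can be reproduced with the standard library, as in this sketch. It reuses the environment variable names from the cURL examples. The fallback values are placeholders, and the helper names `build_run_request` and `send` are hypothetical.

```python
import json
import os
import urllib.request


def build_run_request(api_url, api_key, model_uuid, payload):
    """Build (but do not send) the POST request for the /run endpoint."""
    url = f"{api_url.rstrip('/')}/api/v1/mlmodels/{model_uuid}/run"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return (status_code, decoded JSON body).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request) as response:
        return response.status, json.loads(response.read().decode("utf-8"))


request = build_run_request(
    api_url=os.environ.get("CYBERWAVE_API_URL", "https://api.example.invalid"),
    api_key=os.environ.get("CYBERWAVE_API_KEY", "placeholder-key"),
    model_uuid=os.environ.get("MODEL_UUID", "00000000-0000-0000-0000-000000000000"),
    payload={"prompt": "What objects do you see on the workbench?"},
)
print(request.get_method(), request.full_url)
# status, body = send(request)  # uncomment to perform the network call
```

A 202 response is returned normally by urlopen, so the caller can inspect the status code to decide whether to poll.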

- Model Catalog API: Browse and manage model records
- Python SDK (ML Models): SDK reference for catalog + runtime
- Structured Actions: All available structured task IDs
- Edge Workers: Run edge models on hardware