# Speech to Text

> Transcribe robot audio with Whisper MLModels in workflows, REST calls, or edge workers

Cyberwave exposes speech-to-text as an `MLModel`, not as a ROS intelligence layer. Audio goes in and a JSON transcript comes out. The same model can be invoked through `POST /api/v1/mlmodels/{uuid}/run` or from a workflow `CALL_MODEL` node.

The default catalog model is Whisper Large v3. It takes an input payload like this:

```json
{
  "audio_url": "https://example.com/audio.wav",
  "language": "auto",
  "task": "transcribe"
}
```
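As an illustration, the payload above could be sent to the run endpoint with a short Python sketch like the one below. The base URL, the bearer-token `Authorization` header, and the helper names `build_run_request` and `run_model` are assumptions made for this example and are not confirmed by this page. The sketch also assumes the payload is posted as the top-level JSON body.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the API host for your deployment.
BASE_URL = "https://api.example.com"


def build_run_request(
    model_uuid: str,
    audio_url: str,
    token: str,
    language: str = "auto",
    task: str = "transcribe",
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the model run endpoint."""
    payload = {"audio_url": audio_url, "language": language, "task": task}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/mlmodels/{model_uuid}/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the real scheme may differ.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_model(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without network access.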

The Whisper Cloud Node returns:

```json
{
  "text": "...",
  "segments": [],
  "language": "en"
}
```
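A downstream step might parse that response into a typed object before handing it on. The following is a minimal sketch: the `Transcript` class and `parse_transcript` helper are hypothetical names, and because the shape of each entry in `segments` is not specified here, segments are kept as raw values.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Transcript:
    text: str
    language: str
    # Segment structure is not documented on this page, so entries stay untyped.
    segments: List[Any] = field(default_factory=list)


def parse_transcript(body: str) -> Transcript:
    """Parse a JSON response body with text, segments, and language keys."""
    data = json.loads(body)
    return Transcript(
        text=data["text"],
        language=data["language"],
        # Tolerate a missing segments key by falling back to an empty list.
        segments=list(data.get("segments", [])),
    )
```

For example, `parse_transcript('{"text": "stop", "segments": [], "language": "en"}').text` evaluates to `"stop"`.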

Use this when a microphone driver uploads audio and a workflow needs to pass the transcript into a downstream controller policy, planner, or human-in-the-loop step.