Call Model (call_model) nodes that select an edge or hybrid speech-to-text (STT) catalog model run inference on-device via cw.models.load(...).predict(...). Cloud-only STT uses client.mlmodels.run(...) and does not require the edge STT packages.
Prompting for STT models is optional biasing — see Prompting Call Model.
Catalog models and dependencies
Each STT seed sets one SDK extra in metadata.edge_package. Edge-sync surfaces it in model_requirements.
| Model | Runtime | Edge extra | Install |
|---|---|---|---|
| Whisper Tiny EN Q5_1 | whisper_cpp | ml-stt | pip install 'cyberwave[ml-stt]' |
| Whisper Base EN Q5_1 | whisper_cpp | ml-stt | same |
| Whisper Small EN Q5_1 | whisper_cpp | ml-stt | same |
| Whisper Tiny Multilingual Q5_1 | whisper_cpp | ml-stt | same (+ cloud fallback) |
| Whisper Base Multilingual Q5_1 | whisper_cpp | ml-stt | same (+ cloud fallback) |
| Faster Whisper Tiny EN | faster_whisper | ml-stt-faster | pip install 'cyberwave[ml-stt-faster]' |
| Faster Whisper Base EN | faster_whisper | ml-stt-faster | same |
| Faster Whisper Small EN | faster_whisper | ml-stt-faster | same |
ml-all. Pick the extra for the runtime you use.
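The runtime-to-extra mapping in the table can be sketched as a small lookup. This is only an illustration: `EDGE_EXTRAS` and `install_command` are hypothetical names, not part of the cyberwave SDK; the runtime names, extras, and pip commands are copied from the table above.

```python
# Runtime -> SDK extra, copied from the catalog table above.
# EDGE_EXTRAS and install_command are hypothetical helpers, not SDK APIs.
EDGE_EXTRAS = {
    "whisper_cpp": "ml-stt",
    "faster_whisper": "ml-stt-faster",
}


def install_command(runtime: str) -> str:
    """Return the pip command that installs the edge extra for a runtime."""
    try:
        extra = EDGE_EXTRAS[runtime]
    except KeyError:
        raise ValueError(f"unknown STT runtime: {runtime!r}") from None
    return f"pip install 'cyberwave[{extra}]'"


if __name__ == "__main__":
    for name in EDGE_EXTRAS:
        print(name, "->", install_command(name))
```

Keeping the mapping in one place makes it easy to fail early on an unrecognized runtime instead of installing the wrong extra.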
Weight paths (edge_model_path)
| Model family | Cache / load path | Notes |
|---|---|---|
| whisper.cpp | File path to GGML binary, e.g. models/whisper/ggml-tiny.en-q5_1.bin | download_url in metadata; first run fetches into edge model cache |
| faster-whisper | Cache key = model_external_id (e.g. tiny.en); metadata.edge_model_path is documentation only | Edge Core pre-downloads via catalog model_external_id; CTranslate2 cache under /app/models/tiny.en/ |
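As a rough sketch of the two lookup rules in the table: whisper.cpp resolves to a file path, while faster-whisper resolves to a cache directory keyed by model_external_id. The function name `weight_location` and the `cache_root` parameter are assumptions made for illustration; the `/app/models` default is taken from the example in the table, and whether Edge Core lets you change it is not stated here.

```python
from pathlib import PurePosixPath
from typing import Optional


def weight_location(
    runtime: str,
    *,
    edge_model_path: Optional[str] = None,
    model_external_id: Optional[str] = None,
    cache_root: str = "/app/models",  # assumption: default taken from the table example
) -> PurePosixPath:
    """Return where weights are expected for a runtime (hypothetical helper)."""
    if runtime == "whisper_cpp":
        # whisper.cpp loads a single GGML file, so the path itself is the key.
        if not edge_model_path:
            raise ValueError("whisper_cpp needs edge_model_path")
        return PurePosixPath(edge_model_path)
    if runtime == "faster_whisper":
        # faster-whisper is keyed by model_external_id; edge_model_path is
        # documentation only and is deliberately ignored here.
        if not model_external_id:
            raise ValueError("faster_whisper needs model_external_id")
        return PurePosixPath(cache_root) / model_external_id
    raise ValueError(f"unknown STT runtime: {runtime!r}")


if __name__ == "__main__":
    print(weight_location("whisper_cpp",
                          edge_model_path="models/whisper/ggml-tiny.en-q5_1.bin"))
    print(weight_location("faster_whisper", model_external_id="tiny.en"))
```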
Compile server
When a workflow has run_on_edge: true and a Call Model node references an on-device STT model, compile verifies imports:
| Runtime | Import checked | base.txt package |
|---|---|---|
| whisper_cpp | pywhispercpp.model.Model | pywhispercpp==1.4.1 |
| faster_whisper | faster_whisper.WhisperModel | faster-whisper>=1.1.1 |
If an import is missing, compile fails with a message such as:
This workflow uses Call Model with faster-whisper but faster-whisper is not installed on the compile server. Install faster-whisper (or cyberwave[ml-stt-faster]) on Django before compiling for edge.
Rebuild Django after updating requirements/base.txt; see Edge workflow dependencies.
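To check a compile server or edge worker by hand, the import test can be approximated as below. This is a minimal sketch, not the compile server's actual code: `can_import` and `missing_runtimes` are hypothetical helpers, and only the module and class names come from the table above.

```python
import importlib

# Runtime -> (module, attribute) that must be importable, from the table above.
REQUIRED_IMPORTS = {
    "whisper_cpp": ("pywhispercpp.model", "Model"),
    "faster_whisper": ("faster_whisper", "WhisperModel"),
}


def can_import(module_name: str, attr: str) -> bool:
    """Return True if module_name imports cleanly and exposes attr."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError too (it subclasses ImportError).
        return False
    return hasattr(module, attr)


def missing_runtimes(runtimes):
    """Return the runtimes whose required import is unavailable."""
    return [r for r in runtimes if not can_import(*REQUIRED_IMPORTS[r])]


if __name__ == "__main__":
    missing = missing_runtimes(REQUIRED_IMPORTS)
    print("missing STT runtimes:", missing or "none")
```

Running it in the same environment as Django (or the edge worker) shows which extras still need installing before a compile is attempted.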
Edge worker
- Install the catalog model’s extra (ml-stt or ml-stt-faster).
- Sync the workflow: edge-sync lists model_requirements with edge_model_path.
- First inference downloads weights (or uses pre-staged files under ~/.cyberwave/models/).
sample_rate_hz, channels, and optional language / task / vad_filter (Faster Whisper built-in VAD when enabled).
Choosing whisper.cpp vs Faster Whisper
| | whisper.cpp (ml-stt) | Faster Whisper (ml-stt-faster) |
|---|---|---|
| Hardware | Raspberry Pi 4 class | Pi 4+; Jetson/GPU for Base/Small |
| Weights | Quantized GGML (.bin) | CTranslate2 cache dir |
| Latency | Good for Tiny on CPU | Tiny EN optimized for real-time |
| Built-in VAD | No (use Audio Assistant upstream) | Yes (builtin_vad_filter on catalog) |
Typical wiring
audio key. Wire result (text) into Fuzzy Matcher’s Uncertain String (query).
Related
- Edge dependencies: Full matrix (all nodes + compile server)
- Audio in Workflows: PCM format and pipelines
- Fuzzy Matcher: Map STT text to commands