What It Provides
The native microphone driver runs on the edge device that has the microphone attached. It captures audio through sounddevice / PortAudio, sends a WebRTC audio stream to the twin, and publishes raw chunks to the local Zenoh data bus on audio/default by default.
It also subscribes to:
{"command": "start_audio", "source_type": "tele"} and {"command": "stop_audio", "source_type": "tele"} to start and stop recording for the active WebRTC stream. The driver also accepts start_recording and stop_recording as direct aliases on the same topic.
When WebRTC is already connected, these commands publish minimal media-service commands on cyberwave/twin/{twin_uuid}/webrtc-command and keep the WebRTC connection alive. CYBERWAVE_METADATA_AUTO_RECORDING_AUDIO=false only disables startup recording; manual commands can still start recording when CYBERWAVE_METADATA_ENABLE_RECORDING=true.
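As a minimal sketch of the command handling described above, the following Python maps an incoming JSON payload to a recording action. The function name `parse_recording_command` and the returned action strings are hypothetical; only the `command` field and the four command names come from the text above, and the actual driver may handle these payloads differently.

```python
import json

# Command names taken from the text above; grouping them into sets is an assumption.
_START = {"start_audio", "start_recording"}
_STOP = {"stop_audio", "stop_recording"}


def parse_recording_command(raw: str, enable_recording: bool = True) -> str:
    """Return "start", "stop", "rejected", or "ignored" for a JSON command payload."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return "ignored"
    if not isinstance(payload, dict):
        return "ignored"
    command = payload.get("command")
    if command not in _START | _STOP:
        return "ignored"
    if not enable_recording:
        # Mirrors CYBERWAVE_METADATA_ENABLE_RECORDING=false: recording commands are rejected.
        return "rejected"
    return "start" if command in _START else "stop"
```

Treating the aliases as members of the same set keeps the start and stop paths identical regardless of which command name the sender uses.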
Setup
- Create or select a microphone twin in your Cyberwave environment.
- Pair the edge device through Cyberwave Edge so the driver receives CYBERWAVE_API_KEY and CYBERWAVE_TWIN_UUID.
- Attach a USB or built-in microphone to the edge device.
- Start the driver with Docker or Edge Core. On Linux Docker hosts, pass the audio device into the container by mapping /dev/snd. Some hosts may also need --privileged for ALSA device access; use the narrower /dev/snd mapping first.
Configuration
| Variable | Default | Purpose |
|---|---|---|
| CYBERWAVE_METADATA_AUDIO_DEVICE | default | Select an input by index, name fragment, or default. |
| CYBERWAVE_METADATA_ENABLE_AUDIO | true | Enables WebRTC startup and reconnect. If false, no WebRTC audio or recording starts. Maps from enable_audio in the twin JSON. |
| CYBERWAVE_METADATA_ENABLE_RECORDING | true | Enables recording commands. If false, start_audio / stop_audio recording commands are rejected, while WebRTC audio can still run. Maps from enable_recording in the twin JSON. |
| CYBERWAVE_METADATA_AUTO_RECORDING_AUDIO | false | Starts recording with the initial WebRTC offer only when audio and recording are enabled. start_audio can still start recording later when this is false. Maps from auto_recording_audio in the twin JSON. |
| CYBERWAVE_METADATA_AUDIO_CHANNEL | audio/default | Zenoh channel for raw audio chunks. |
| CYBERWAVE_METADATA_AUDIO_MIC_NAME | audio | WebRTC sensor identifier. |
| CYBERWAVE_METADATA_AUDIO_SAMPLE_RATE | OS default | Capture sample rate. |
| CYBERWAVE_METADATA_AUDIO_CHANNELS | 1 | Capture channels; auto-detection can upgrade to stereo. |
audio/live/default.
To run multiple microphones on one device, run multiple driver instances. Give each instance a different CYBERWAVE_TWIN_UUID and set CYBERWAVE_METADATA_AUDIO_DEVICE to the desired input.
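The interaction between the three enable and recording flags in the configuration table can be sketched as a small pure function. This is an illustration of one reading of the table, not the driver's code: `RecordingPolicy`, `resolve_policy`, and the accepted boolean spellings are hypothetical, while the variable names and defaults come from the table.

```python
from dataclasses import dataclass
from typing import Mapping


def _flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Parse a boolean environment value; the accepted spellings are an assumption."""
    value = env.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class RecordingPolicy:
    webrtc_enabled: bool            # WebRTC startup and reconnect allowed
    manual_recording_allowed: bool  # start_audio / stop_audio can start recording
    record_on_initial_offer: bool   # recording starts with the first WebRTC offer


def resolve_policy(env: Mapping[str, str]) -> RecordingPolicy:
    audio = _flag(env, "CYBERWAVE_METADATA_ENABLE_AUDIO", True)
    recording = _flag(env, "CYBERWAVE_METADATA_ENABLE_RECORDING", True)
    auto = _flag(env, "CYBERWAVE_METADATA_AUTO_RECORDING_AUDIO", False)
    return RecordingPolicy(
        webrtc_enabled=audio,
        # With audio disabled, no recording starts at all, so manual commands are moot.
        manual_recording_allowed=audio and recording,
        record_on_initial_offer=audio and recording and auto,
    )
```

In a running process the same function could be called as `resolve_policy(os.environ)`; passing a plain mapping keeps it easy to check each flag combination in isolation.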
Linux Audio Notes
The driver image installs libportaudio2, which gives sounddevice access to PortAudio’s ALSA backend. If the host routes audio through PulseAudio or PipeWire, also make the relevant Pulse/PipeWire socket and client libraries available to the container.
The driver logs all input devices at startup and publishes selected-device metadata to the twin. Set CYBERWAVE_LOG_LEVEL=DEBUG to see raw environment values and the resolved microphone configuration.
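To illustrate how a CYBERWAVE_METADATA_AUDIO_DEVICE value (index, name fragment, or default) could be resolved against the device list logged at startup, here is a hedged sketch. The function `select_input_device` and its matching rules are hypothetical; it operates on plain dictionaries with `name` and `max_input_channels` keys, which is the shape of the entries returned by `sounddevice.query_devices()`.

```python
from typing import Any, Mapping, Optional, Sequence


def select_input_device(devices: Sequence[Mapping[str, Any]], selector: str = "default",
                        default_index: Optional[int] = None) -> int:
    """Resolve a selector (index, name fragment, or "default") to a device index."""
    # Only devices with at least one input channel can be used for capture.
    inputs = [i for i, d in enumerate(devices) if d.get("max_input_channels", 0) > 0]
    if not inputs:
        raise RuntimeError("no audio input devices available")
    selector = selector.strip()
    if not selector or selector.lower() == "default":
        # Fall back to the first input when no usable default index is supplied.
        return default_index if default_index in inputs else inputs[0]
    if selector.isdigit():
        index = int(selector)
        if index in inputs:
            return index
        raise ValueError(f"device index {index} is not an input device")
    for i in inputs:
        if selector.lower() in str(devices[i].get("name", "")).lower():
            return i
    raise ValueError(f"no input device matches {selector!r}")
```

With sounddevice installed, `list(sounddevice.query_devices())` could supply `devices` and `sounddevice.default.device[0]` the `default_index`; whether the driver resolves selectors in this order is an assumption.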
macOS Notes
Bare-metal macOS capture uses CoreAudio through sounddevice. The terminal or process launcher must have microphone permission in System Settings before the driver can capture audio.
run-local.sh support is planned; until it exists, run the Python driver from the package environment with the same CYBERWAVE_* variables used by Docker.
Edge Cases
- No audio devices at startup: the driver enumerates devices and fails configuration if none are available. Publishing a microphone sensor-failure alert before retrying with backoff is the expected runtime behavior.
- Device disconnected mid-stream: the Linux/macOS device monitor detects add/remove events. The expected behavior is to stop WebRTC, publish a sensor-failure alert, and reconnect when the device returns.
- Docker access: prefer /dev/snd plus the audio group; use privileged mode only when host audio permissions require it.
- Cloud STT URL expiry: use signed URLs with enough TTL for queued workloads, or inline audio_base64 for small files.
- Large STT inputs: keep Whisper jobs below roughly 25 MB; oversized inputs should fail with a clear validation error.
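The "retrying with backoff" behavior mentioned for the no-device case is not specified further. A minimal sketch, assuming capped exponential backoff, could look like this; the function name `backoff_delays` and the base, factor, and cap values are all illustrative assumptions.

```python
import itertools
from typing import Iterator


def backoff_delays(base: float = 1.0, factor: float = 2.0, cap: float = 30.0) -> Iterator[float]:
    """Yield retry delays in seconds, growing by `factor` and never exceeding `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First six delays under the assumed defaults: [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
first_delays = list(itertools.islice(backoff_delays(), 6))
```

A retry loop would sleep for each yielded delay between device enumeration attempts and stop iterating once a device is found.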
Dual Audio Streaming Paths
The driver streams captured audio on two independent, parallel paths:
| Path | Output rate | Resampling | Metadata | Consumer |
|---|---|---|---|---|
| WebRTC (Opus) | 48 kHz (always) | Yes, if hardware ≠ 48 kHz | stream_attributes.sample_rate in MQTT offer | Frontend, media service |
| Zenoh (raw PCM) | Hardware native | Never | sample_rate_hz, channels, encoding, layout in wire header | @cw.on_audio workers |
WebRTC path
Audio is resampled to 48 kHz (the Opus codec’s internal rate) before entering the WebRTC queue. The stream_attributes field in the MQTT webrtc-offer payload includes the actual sample_rate used, so the media service and frontend can verify compliance. The media service router uses standard mediasoup Opus negotiation, so no custom validation is needed.
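To make the rate conversion concrete, the sketch below resamples a mono chunk to 48 kHz using linear interpolation. The driver's actual resampling algorithm is not stated here, so linear interpolation is a stand-in chosen for brevity, and `resample_linear` is a hypothetical name.

```python
from typing import List, Sequence

OPUS_RATE_HZ = 48_000  # WebRTC output rate stated above


def resample_linear(samples: Sequence[float], src_rate: int,
                    dst_rate: int = OPUS_RATE_HZ) -> List[float]:
    """Resample mono samples by linear interpolation (illustrative quality only)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    out_len = len(samples) * dst_rate // src_rate
    out = []
    for n in range(out_len):
        pos = n * src_rate / dst_rate  # fractional index into the source
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]  # clamp at the last sample
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out
```

For example, a 20 ms chunk captured at 32 kHz holds 640 samples and becomes 960 samples at 48 kHz. A production resampler would normally add low-pass filtering to limit aliasing.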
Zenoh path
Raw PCM chunks are published at the hardware’s native capture rate with no resampling. On the first publish, the Zenoh wire header carries metadata: sample_rate_hz, channels, encoding, and layout. @cw.on_audio workers receive this metadata so they can correctly interpret the raw audio bytes.
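As a hedged illustration of why a worker needs that metadata, the sketch below interprets a raw chunk using the header fields. The field names come from the table above; the encoding label `"pcm_s16le"`, the interleaved channel ordering, and both function names are assumptions, since the wire format is not detailed here.

```python
import array
import sys
from typing import Any, Dict, List


def split_pcm_s16le(payload: bytes, header: Dict[str, Any]) -> List[List[int]]:
    """Split interleaved 16-bit little-endian PCM into one sample list per channel."""
    if header.get("encoding") != "pcm_s16le":  # hypothetical encoding label
        raise ValueError(f"unsupported encoding: {header.get('encoding')!r}")
    channels = int(header["channels"])
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(payload)
    if sys.byteorder == "big":
        samples.byteswap()  # payload is assumed little-endian
    if len(samples) % channels:
        raise ValueError("payload does not contain whole frames")
    return [list(samples[c::channels]) for c in range(channels)]


def chunk_duration_s(payload: bytes, header: Dict[str, Any]) -> float:
    """Duration of a chunk in seconds, derived from the native sample rate."""
    frames = len(payload) // (2 * int(header["channels"]))
    return frames / int(header["sample_rate_hz"])
```

Without sample_rate_hz and channels, the same bytes could be read at the wrong speed or with channels mixed together, which is why the header travels with the stream.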
Parallelism
The PortAudio callback places raw audio into a zero-copy swap buffer for Zenoh (O(1)) and queues resampled audio for WebRTC. Three threads run in true parallel: PortAudio capture, Zenoh publisher, and WebRTC streamer.
Twin metadata
The driver publishes both rates to the twin metadata under audio_device:
| Field | Description |
|---|---|
| capture_sample_rate | Hardware native rate (e.g. 32000) |
| stream_sample_rate | WebRTC output rate (48000 when resampling is on) |
| channels | Channel count |
| layout | "mono" or "stereo" |
| software_resampling | Whether resampling is active |
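The fields in the table above can be assembled as in the following sketch. The function name `audio_device_metadata` is hypothetical, and deriving `software_resampling` from a comparison with 48 kHz is an assumption based on the WebRTC path description.

```python
from typing import Dict, Union

WEBRTC_RATE_HZ = 48_000  # WebRTC path always outputs 48 kHz


def audio_device_metadata(capture_rate_hz: int, channels: int) -> Dict[str, Union[int, str, bool]]:
    """Build the audio_device fields for a mono or stereo capture device."""
    if channels not in (1, 2):
        raise ValueError("expected 1 or 2 channels")
    return {
        "capture_sample_rate": capture_rate_hz,
        "stream_sample_rate": WEBRTC_RATE_HZ,
        "channels": channels,
        "layout": "mono" if channels == 1 else "stereo",
        # Resampling is only needed when the hardware rate differs from 48 kHz.
        "software_resampling": capture_rate_hz != WEBRTC_RATE_HZ,
    }
```

For a 32 kHz mono microphone this reports a capture rate of 32000, a stream rate of 48000, and software resampling enabled.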
Success Checks
- docker compose up with a USB microphone streams audio to the microphone twin.
- Frontend MQTT start_audio / stop_audio toggles recording while preserving an already-active WebRTC connection.
- Zenoh audio/default chunks are consumable by an on_audio worker hook at the hardware’s native sample rate.
- Startup logs list available devices; CYBERWAVE_METADATA_AUDIO_DEVICE selects a specific one.
- USB disconnect and reconnect transitions through alert, reconnect, and alert resolution.