SmolVLA Training - Cyberwave Docs

Overview

SmolVLA is a lightweight Vision-Language-Action model optimized for edge deployment. Cyberwave supports fine-tuning SmolVLA models on your custom datasets using the LeRobot v3 dataset format.

SmolVLA models use the LeRobot v3 dataset format for training, which differs from the TFDS format used by OpenVLA models. The platform handles format conversion automatically.

Model Selection

When starting a new training:

Navigate to AI → Training in your environment
Select SmolVLA from the available model architectures
The platform will automatically use the LeRobot training pipeline

Only models with is_trainable: true appear in the training model selection. SmolVLA is pre-configured as trainable.

Dataset Conversion

When you start training with a SmolVLA model, the platform:

Joins your episode parquet files into a single dataset
Converts the OpenVLA-format parquet to LeRobot v3 format
Handles camera role mapping (primary, wrist, secondary)
Encodes video frames using AV1 codec (configurable)

The conversion is cached — subsequent trainings with the same dataset and configuration skip the conversion step.

Training Parameters

SmolVLA training supports:

Parameter	Description	Default
`fps`	Target frames per second	30
`use_videos`	Store frames as MP4 videos	true
`vcodec`	Video codec	libsvtav1
`num_cameras`	Number of camera streams (1-3)	1

Camera Configuration

Camera roles are mapped to LeRobot conventions:

primary → observation.images.primary
wrist → observation.images.wrist
secondary → observation.images.secondary

Configure camera roles in the training wizard or via the API.

Training Workflow

Deployment

After training completes:

Deploy the trained model as a controller policy
Assign the VLA controller to your robot twin
Use natural language prompts to control the robot

SmolVLA models are optimized for edge inference, making them suitable for real-time robot control with lower latency than larger VLA models.

Train VLA Models — Complete training tutorial
ML Models Overview — Model capabilities and providers
Deploy Models — Deployment options

​Overview

​Model Selection

​Dataset Conversion

​Training Parameters

​Camera Configuration

​Training Workflow

​Deployment

​Related

Overview

Model Selection

Dataset Conversion

Training Parameters

Camera Configuration

Training Workflow

Deployment

Related