# Models & Datasets

> Run any AI model (open-source, proprietary, your own) on any twin, with the data plumbing already taken care of

AI models are how your [twins](/overview/features/digital-twin) stop being puppets and start being autonomous. Cyberwave is the substrate that gets them there: a growing [catalog of models](https://cyberwave.com/models), the infrastructure to run them on cloud or edge, and the [datasets](https://cyberwave.com/datasets) you need to train your own.

```python
from cyberwave import Cyberwave

cw = Cyberwave()
arm = cw.twin("acme/twins/arm-station-1")
arm.use_controller("acme/models/my-pick-and-place-vla")
```

That's it: the model is now driving the robot. Same call works in [simulation](/overview/features/simulation) (`cw.affect("simulation")`) and against live hardware.

***

## Bring your own model, or pick from the catalog

Pick from open-source and proprietary models: VLAs (SmolVLA, OpenVLA, Pi 0.5), VLMs (Gemini Robotics, GPT-5, Molmo, PaliGemma), detectors (YOLOv8, SAM2), and image-to-3D (Hunyuan3D, TripoSR).

Fine-tune [SmolVLA](/feature-reference/ml-models/smolvla-training) or [OpenVLA](/feature-reference/ml-models/openvla-oft-training) on the data you collected with your own robots, with no separate training stack.

Bring your own weights or endpoint: Hugging Face, an internal inference server, a custom ONNX file. Cyberwave treats it as a first-class model.

Every registered model gets a [Model Playground](/feature-reference/ml-models/playground) page where you can try it on an image, see overlays, copy CLI/SDK snippets, and wire it into a workflow, so experimenting with a new model is genuinely one click.

***

## Use them anywhere in your stack

Cyberwave models compose with the rest of the platform.
Same model, different roles:

| Where the model runs     | What it does                                                          | Reference                                                                            |
| ------------------------ | --------------------------------------------------------------------- | ------------------------------------------------------------------------------------ |
| **As a twin controller** | Drive a robot end-to-end (a VLA picks-and-places, an RL policy walks) | [Controllers](/overview/index#controllers)                                           |
| **Inside a workflow**    | One `call_model` node, runs cloud VLM or edge ML transparently        | [Workflow nodes](/feature-reference/workflows#workflow-components)                   |
| **From the SDK**         | Async cloud calls for VLM / LLM tasks                                 | [`vlm_generation` / `llm_generation`](/tools/python-sdk#workflows)                   |
| **In simulation**        | The same code drives a [MuJoCo twin](/overview/features/simulation)   | [`cw.affect("simulation")`](/tools/python-sdk#affect-simulation-vs-live-environment) |

The best automations usually combine more than one: an edge YOLO that's cheap to run on every frame **and** a cloud VLM that reasons about the rare interesting frame. See the [edge-to-cloud VLM tutorial](/tutorials/edge-to-cloud-vlm) for that pattern end-to-end.

***

## Edge + cloud, both first-class

Cyberwave runs models in both places, on purpose.

**Edge:** Local inference on your own hardware: fast, private, offline-capable. YOLO, SAM2, and ONNX/TensorRT detectors all run inside the [edge worker](/feature-reference/edge/workers/overview) generated from your workflow.

**Cloud:** Heavyweight VLMs and VLAs (Gemini Robotics, GPT-5, OpenVLA) run on a [Cloud Node](/tools/cloud-node) or [VLA Cloud Node](/tools/vla-cloud-node) with a GPU attached; Cyberwave handles the provisioning.

You don't manage any of the boring parts: weight downloads, GPU provisioning, picking the right edge-compatible variant, network reconnects, version pinning. Cyberwave routes the right model to the right runtime; see the [model deployment reference](/feature-reference/ml-models/deploy) for the wire-level details.
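
The edge-plus-cloud split is often used as a two-tier gate: a cheap detector sees every frame, and only the frames it flags are escalated to an expensive model. The sketch below illustrates that gating logic in plain Python. It does not use the Cyberwave SDK: `Detection`, `route_frame`, `fake_edge_detect`, `fake_cloud_reason`, and the `min_confidence` threshold are all hypothetical names chosen for illustration, and the two callables stand in for whatever edge detector and cloud VLM you actually run.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Detection:
    """One detector output (hypothetical shape, not a Cyberwave type)."""
    label: str
    confidence: float


def route_frame(
    frame: str,
    edge_detect: Callable[[str], List[Detection]],
    cloud_reason: Callable[[str, List[Detection]], str],
    watch_labels: Set[str],
    min_confidence: float = 0.5,
) -> Optional[str]:
    """Run the cheap detector on every frame; escalate only flagged frames."""
    detections = edge_detect(frame)
    hits = [
        d for d in detections
        if d.label in watch_labels and d.confidence >= min_confidence
    ]
    if not hits:
        return None  # nothing interesting, so no cloud call is made
    return cloud_reason(frame, hits)


# Stand-ins for a real edge detector and a real cloud model.
def fake_edge_detect(frame: str) -> List[Detection]:
    if "person" in frame:
        return [Detection("person", 0.9)]
    return [Detection("box", 0.8)]


cloud_calls: List[str] = []


def fake_cloud_reason(frame: str, hits: List[Detection]) -> str:
    cloud_calls.append(frame)  # record which frames reached the cloud tier
    return f"described {frame} ({hits[0].label})"


frames = ["frame-0", "frame-1-person", "frame-2"]
results = [
    route_frame(f, fake_edge_detect, fake_cloud_reason, {"person"})
    for f in frames
]
print(f"{len(cloud_calls)} cloud call(s) for {len(frames)} frames")
```

Keeping the gate as a pure function of two injected callables means the same logic can be exercised with fakes, as here, before either model is wired in.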
***

## Datasets: collect, import, export

Models need data. Cyberwave gives you the loop end-to-end: record on the edge → [replay in the browser](/overview/features/replay-and-historical-data) → slice into episodes → train.

**Collect:** Every recording in [Replay](/overview/features/replay-and-historical-data) can be turned into episodes and a dataset, ready to train on.

**Import:** LeRobot v3 / v2.1, RLDS, HDF5, Zarr, GR00T, MCAP, ROS bag, RoboDM, and Hugging Face datasets are supported; see the [full import matrix](/feature-reference/datasets/import).

**Export:** Convert any dataset to Cyberwave Parquet, LeRobot v3, RLDS, or OpenVLA TFDS via the [export tab or API](/feature-reference/datasets/format-conversion).

Browse the [public dataset catalog](https://cyberwave.com/datasets), pick one, and feed it straight into [SmolVLA](/feature-reference/ml-models/smolvla-training) or [OpenVLA](/feature-reference/ml-models/openvla-oft-training) training; the format conversion happens for you.

***

## What to read next

- Capabilities, providers, registration, inference, and the full VLA stack.
- [Model Playground](/feature-reference/ml-models/playground): the interactive page behind every model in the catalog.
- A community tutorial: collect data, fine-tune SmolVLA, run it on a real arm, all on Cyberwave.