
VMOS Edge CLI Official Guide

@vmosedge/cli is the official command-line tool for VMOS Edge Desktop. It can be used to manage cloud phone instances, query host and image status, run desktop automation, and execute tasks in batch through JSON or YAML workflows. For most first-time users, the recommended path is to start with AI Agent + the official Skill, then gradually move deeper into pure CLI usage, batch processing, and workflow orchestration.

Recommended Path

  • If you are new or want the fastest onboarding path, start with AI Agent + the official Skill.
  • If you need scripting, batch execution, workflow orchestration, or fine-grained debugging, switch to pure CLI afterward.
Path | Best For | Notes
AI Agent + Official Skill | First-time VMOS Edge CLI users and users who want to get started quickly | Lets the agent invoke CLI capabilities according to the official rules, with a lower learning curve
Pure CLI | Users who need to debug single commands, write batch JSON, or maintain YAML workflows | More flexible and better suited for advanced usage

Who Should Use AI Agent + the Official Skill

  • Users trying VMOS Edge CLI for the first time
  • Users who want to access VMOS Edge automation quickly
  • Users who want to connect VMOS Edge to Codex, Cursor, Claude Code, Gemini CLI, GitHub Copilot, OpenClaw, or similar agents
  • Users who do not want to memorize many commands, parameters, and workflow formats at the beginning

Install the Official Skill

List the available skills in the official skills repository:

bash
npx skills add https://github.com/vmos-dev/vmos-edge-skills --list

Install operate-vmos-edge-cli:

bash
npx skills add https://github.com/vmos-dev/vmos-edge-skills --skill operate-vmos-edge-cli

Prerequisites

Before installing the skill, make sure the local environment is ready:

  • Node.js 18 or later is installed
  • npm is available on the system
  • VMOS Edge Desktop is installed
  • The official CLI @vmosedge/cli is installed
  • After installation, vmos-edge-cli schema can be used to verify that the CLI is available

For the first setup, use the following order for installation and verification.

1. Check Node.js and the CLI Environment

bash
node --version
vmos-edge-cli --version

2. Install the Official CLI

bash
npm i -g @vmosedge/cli

3. Verify That the CLI Works

bash
vmos-edge-cli schema

4. Install the Official Skill

bash
npx skills add https://github.com/vmos-dev/vmos-edge-skills --skill operate-vmos-edge-cli

Installation Note

The officially recommended installation method is npm i -g @vmosedge/cli. Do not replace it with node dist/main.js, pnpm build, or pnpm link.

First-Time Usage Recommendation

For your first use, the recommended path is:

  1. Install @vmosedge/cli
  2. Run vmos-edge-cli schema to confirm the CLI is available
  3. Install the operate-vmos-edge-cli skill
  4. Use the skill from an AI agent to access VMOS Edge capabilities
  5. Move to pure CLI mode later when you need to debug single commands or write batch JSON or YAML workflows

This order follows the officially recommended workflow: complete the environment preflight, inspect the current state, run the action, then verify the result.

Pure CLI Installation and Validation

If you plan to use vmos-edge-cli directly, complete one full standard preflight first.

Standard Preflight Flow

bash
node --version
vmos-edge-cli --version
npm i -g @vmosedge/cli
vmos-edge-cli schema

The commands do the following, in order:

  • Check the Node.js version
  • Check whether the CLI is already installed
  • Install it if needed
  • Run schema as the final verification step
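
The four checks above can be sketched as a small POSIX shell script. This is a minimal sketch under stated assumptions, not an official installer: the function names node_version_ok and preflight and the VMOS_CLI override variable are hypothetical and introduced only for illustration. The node, npm, and vmos-edge-cli commands are the ones used in this guide.

```shell
#!/bin/sh
# Preflight sketch mirroring the four checks listed above.
# VMOS_CLI is a hypothetical override hook, not an official setting.
VMOS_CLI="${VMOS_CLI:-vmos-edge-cli}"

# Succeeds when a `node --version` string such as "v18.17.0"
# has a major version of 18 or later.
node_version_ok() {
  major="${1#v}"        # drop the leading "v"
  major="${major%%.*}"  # keep only the major component
  case "$major" in
    ''|*[!0-9]*) return 1 ;;  # not a number: treat as a failed check
  esac
  [ "$major" -ge 18 ]
}

preflight() {
  # 1. Node.js must be present and new enough.
  command -v node >/dev/null 2>&1 || {
    echo "node is missing: install Node.js 18+" >&2; return 1; }
  node_version_ok "$(node --version)" || {
    echo "Node.js 18 or later is required" >&2; return 1; }

  # 2-3. Install the CLI only when it is not already on PATH.
  if ! command -v "$VMOS_CLI" >/dev/null 2>&1; then
    npm i -g @vmosedge/cli || return 1
  fi

  # 4. schema is the final verification step.
  "$VMOS_CLI" schema >/dev/null || {
    echo "schema failed: fix the error before continuing" >&2; return 1; }
}
```

Calling preflight once before any other command gives a single pass/fail answer. The script only defines functions, so sourcing it has no side effects.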

What to Do If a Check Fails

  • node is missing: install Node.js 18+
  • npm is missing: the Node.js installation is probably incomplete, so reinstall Node.js
  • vmos-edge-cli is missing: run the install command and verify again
  • schema fails: fix the error before continuing

Desktop Client Path Configuration

If VMOS Edge Desktop is not installed in the default location, set app.bin-path before running app start.

Common default paths are:

Platform | Default Path
Windows | C:\Program Files\VMOS Edge 2.0\VMOS Edge 2.0.exe
macOS | /Applications/VMOS Edge 2.0.app
Linux | /opt/vmos-edge/vmos-edge

Set the path:

bash
vmos-edge-cli config set app.bin-path "your actual install path"

View the current config:

bash
vmos-edge-cli config show
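
To decide whether app.bin-path needs to be set at all, a script can look up the common default path for the current platform and check whether it exists. The helper below is a hedged sketch: default_bin_path is a hypothetical name. Mapping uname -s values (Darwin, Linux, MINGW*/MSYS*/CYGWIN*) to the platforms in the table is an assumption about the shell environment, not something the CLI defines.

```shell
# Maps a `uname -s` value to the common default install path listed above.
# Prints nothing and returns 1 for platforms the table does not cover.
default_bin_path() {
  case "$1" in
    Darwin)
      printf '%s\n' "/Applications/VMOS Edge 2.0.app" ;;
    Linux)
      printf '%s\n' "/opt/vmos-edge/vmos-edge" ;;
    MINGW*|MSYS*|CYGWIN*)
      # Single quotes keep the backslashes literal.
      printf '%s\n' 'C:\Program Files\VMOS Edge 2.0\VMOS Edge 2.0.exe' ;;
    *)
      return 1 ;;
  esac
}

# Example use (commented out so that sourcing this file has no side effects):
#   path="$(default_bin_path "$(uname -s)")"
#   [ -e "$path" ] || echo "Default path not found; set app.bin-path manually"
```

If the default path is missing, run the config set command shown above with your actual install path.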

First End-to-End Run

If you are using pure CLI for the first time, walk through the steps below once.

1. Start the Desktop Client

bash
vmos-edge-cli app status
vmos-edge-cli app start
vmos-edge-cli app wait-ready

2. Check the Host Status

bash
vmos-edge-cli host check 192.168.1.100
vmos-edge-cli host info 192.168.1.100

3. View Images and Devices

bash
vmos-edge-cli image list --host 192.168.1.100
vmos-edge-cli device list --host 192.168.1.100

4. Create a Cloud Device

bash
vmos-edge-cli device create --host 192.168.1.100 --image <image_id> --name demo --count 1

5. Start the Device and Check Details

bash
vmos-edge-cli device start --host 192.168.1.100 <device_id>
vmos-edge-cli device info --host 192.168.1.100 <device_id>

Usage Notes

  • Most device and image commands require --host <ip>
  • host commands usually take the IP as a positional argument
  • The desktop app stays running between commands, so you usually do not need to repeat app start
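
The read-only part of the walkthrough (steps 1 to 3) can be wrapped in one function so it is easy to repeat per host. This is a sketch under stated assumptions: inspect_host and the VMOS_CLI variable are hypothetical names, and it assumes each command returns a non-zero exit status on failure. The commands and their argument styles are taken from the steps above.

```shell
# VMOS_CLI is a hypothetical override hook, not an official setting.
VMOS_CLI="${VMOS_CLI:-vmos-edge-cli}"

# Runs steps 1 to 3 against one host and stops at the first failing command.
inspect_host() {
  host="$1"
  # Informational only: the exit status of `app status` is ignored here.
  "$VMOS_CLI" app status
  "$VMOS_CLI" app start &&
  "$VMOS_CLI" app wait-ready &&
  # host commands take the IP as a positional argument
  "$VMOS_CLI" host check "$host" &&
  "$VMOS_CLI" host info "$host" &&
  # image and device commands take --host <ip>
  "$VMOS_CLI" image list --host "$host" &&
  "$VMOS_CLI" device list --host "$host"
}
```

Device creation and start are left out on purpose: they need an image ID and a device ID that you should read from the list output first.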

Three Invocation Modes

The official CLI defines three invocation modes: Direct, Batch, and Run.

Direct

Use this when you want to execute one action, or when you need to inspect the result before deciding the next step.

bash
vmos-edge-cli device list --host 10.0.0.5

Batch

Use this for a group of actions that are independent of one another and safe to run in sequence.

bash
vmos-edge-cli batch '[
  {"action":"device.list","args":{"host":"10.0.0.5"}},
  {"action":"image.list","args":{"host":"10.0.0.5"}},
  {"action":"host.hardware","args":{"ip":"10.0.0.5"}}
]'
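
When the same batch has to run against several hosts, generating the JSON avoids copy-and-paste mistakes. The sketch below rebuilds exactly the three actions from the example above. The names readonly_batch_json, run_readonly_batch, and VMOS_CLI are hypothetical, and the action and field names are copied from the example rather than from the schema output.

```shell
# VMOS_CLI is a hypothetical override hook, not an official setting.
VMOS_CLI="${VMOS_CLI:-vmos-edge-cli}"

# Prints the three read-only actions from the example above for one host.
# Field names differ per action: device.list and image.list take "host",
# while host.hardware takes "ip".
readonly_batch_json() {
  printf '[{"action":"device.list","args":{"host":"%s"}},' "$1"
  printf '{"action":"image.list","args":{"host":"%s"}},' "$1"
  printf '{"action":"host.hardware","args":{"ip":"%s"}}]\n' "$1"
}

# Passes the generated JSON to batch as a single quoted argument.
run_readonly_batch() {
  "$VMOS_CLI" batch "$(readonly_batch_json "$1")"
}
```

Running readonly_batch_json 10.0.0.5 on its own prints the JSON, so you can review it before passing it to batch.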

Run

Use this when you want to save a full sequence of actions into a YAML file and run it repeatedly.

Core decision rules:

  • If each step can run without depending on the previous result, use Batch
  • If you must inspect a result before deciding the next step, use Direct
  • If you want a reusable workflow, use Run

Before Writing Batch or Workflows

Before writing batch JSON or YAML, run schema first rather than guessing action names and parameter names.

bash
vmos-edge-cli schema
vmos-edge-cli schema device
vmos-edge-cli schema ui

Keep these rules in mind:

  • device, host, and image parameters usually use snake_case
  • ui parameters usually use camelCase
  • Positional CLI parameters often become named fields in batch / YAML
  • sleep is only available in batch / run, while most other actions also have direct command forms

How to Use Desktop Automation

The recommended desktop automation flow is simple: read the current page state, perform the interaction, then read the state again after the page changes.

1. Read the Current Page State

bash
vmos-edge-cli ui state

If the page contains many elements and you only want interactive ones:

bash
vmos-edge-cli ui state --interactive-only

2. Click or Type

bash
vmos-edge-cli ui click 3
vmos-edge-cli ui type 5 "test text"

3. Re-read the State After Page Changes

After click, goto, back, or type, run ui state again. Do not reuse old element indices.
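
One way to make the re-read step hard to forget is to pair every interaction with an immediate ui state call. The wrappers below are a minimal sketch: click_and_refresh, type_and_refresh, and VMOS_CLI are hypothetical names, and only the ui click, ui type, and ui state commands shown above are used.

```shell
# VMOS_CLI is a hypothetical override hook, not an official setting.
VMOS_CLI="${VMOS_CLI:-vmos-edge-cli}"

# Clicks an element by its current index, then re-reads the page state
# so the caller always works from fresh element indices.
click_and_refresh() {
  "$VMOS_CLI" ui click "$1" && "$VMOS_CLI" ui state
}

# Same pattern for typing text into an element.
type_and_refresh() {
  "$VMOS_CLI" ui type "$1" "$2" && "$VMOS_CLI" ui state
}
```

For example, click_and_refresh 3 clicks element 3 and prints the new state. Take the index for the next interaction from that output, not from the earlier one.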

Desktop Automation Best Practices

Prefer ui state

For page inspection, element targeting, and state reading, prefer ui state. Use ui screenshot only when you explicitly need to save an image result.

Prefer ui click / ui type

For interactions, prefer official actions such as ui click, ui type, ui select, and ui native-type. Do not use ui eval to simulate clicks or input.

Check Whether Elements Are Off-Screen or Blocked

If the target element is outside the viewport, inspect hidden_interactive first and then use ui scroll-to to bring it into view. If a dialog blocks the element, close the dialog before continuing.

Use ui native-type for Chinese or CJK Input

For Chinese, input method, or other CJK scenarios, focus the input field first and then use ui native-type.

Command Results and Error Troubleshooting

Response Structure

CLI output usually follows a unified outer JSON structure.

A successful result usually looks like:

json
{"ok": true, "data": ...}

A failed result usually looks like:

json
{"ok": false, "error": "...", "code": "..."}

When troubleshooting, inspect code first and error second.
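
In scripts, the same order (code first, then error) can be applied with two small helpers. This is a rough sketch that uses grep and sed instead of a real JSON parser. It assumes the envelope begins with the "ok" field, as in the examples above, and that "code" is a plain string. The names result_ok and result_code are hypothetical. For anything beyond quick checks, a proper JSON parser is the safer choice.

```shell
# Succeeds when the JSON envelope starts with "ok": true.
# Assumption: "ok" is the first field, as in the examples above.
result_ok() {
  printf '%s' "$1" | tr -d '\n\r' |
    grep -Eq '^[[:space:]]*\{[[:space:]]*"ok"[[:space:]]*:[[:space:]]*true'
}

# Prints the value of the last "code" field in the output,
# or nothing when no such field is present.
result_code() {
  printf '%s' "$1" | tr -d '\n\r' |
    sed -n 's/.*"code"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

A typical use is to capture the output of a command in a variable, call result_ok on it, and branch on result_code only when that check fails.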

Common Errors

Error Code | Meaning | Recommended Action
HOST_NOT_SET | Missing host parameter | Add --host <ip>
INVALID_ARGS | Invalid parameter structure | Fix the parameters before retrying
DEVICE_NOT_FOUND | Invalid device ID | Run device list again and use a valid ID
IMAGE_NOT_FOUND | Invalid image ID | Run image list again and use a valid image ID
ELEMENT_NOT_FOUND | The target element is not on the current page | Run ui state again and use the new index
APP_NOT_RUNNING | The desktop client is not running | Run app start first
HOST_UNREACHABLE | The host is not reachable over the network | Check the IP and network connectivity
CDP_NOT_READY | The UI automation channel is not ready | Wait a moment and retry, and run app start first if needed
TIMEOUT / TRANSIENT | Temporary timeout or transient failure | Retry only if the scenario would normally complete on its own

Avoid repeatedly retrying the following errors without changing any conditions:

  • INVALID_ARGS
  • HOST_NOT_SET
  • DEVICE_NOT_FOUND
  • IMAGE_NOT_FOUND
  • ELEMENT_NOT_FOUND
  • ASSERTION_FAILED
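
The retry guidance above can be encoded as a small lookup so automation does not loop on errors that need a fix. This is a sketch under assumptions: should_retry is a hypothetical name, TIMEOUT and TRANSIENT are treated as two separate code values, and codes not mentioned in this guide are conservatively treated as non-retryable.

```shell
# Returns 0 when retrying without other changes may be reasonable,
# and 1 when the input or environment should be fixed first.
should_retry() {
  case "$1" in
    # May clear on its own after a short wait.
    TIMEOUT|TRANSIENT|CDP_NOT_READY)
      return 0 ;;
    # Listed above as errors that should not be retried unchanged.
    INVALID_ARGS|HOST_NOT_SET|DEVICE_NOT_FOUND|IMAGE_NOT_FOUND|ELEMENT_NOT_FOUND|ASSERTION_FAILED)
      return 1 ;;
    # Assumption: unknown or unlisted codes are not retried automatically.
    *)
      return 1 ;;
  esac
}
```

Keep any retry loop built on this bounded. The table's advice for TIMEOUT / TRANSIENT applies only to scenarios that would normally complete on their own.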

Shortest Onboarding Path for New Users

If this is your first time with VMOS Edge CLI, this is the shortest recommended path:

  1. Install Node.js 18+
  2. Run npm i -g @vmosedge/cli
  3. Run vmos-edge-cli schema
  4. Run npx skills add https://github.com/vmos-dev/vmos-edge-skills --skill operate-vmos-edge-cli
  5. Start using VMOS Edge capabilities from your AI agent
  6. Add pure CLI usage later when you need debugging or workflow orchestration

One-Sentence Summary

For most users, AI Agent + the official Skill is the best first path, while pure CLI is better for advanced debugging and workflow orchestration. Complete the Preflight before formal use, and in desktop automation always prefer ui state and re-read the state after the page changes.

Powered by VMOS Edge Team