Getting Started

Start from zero and learn to write basic automation flows.


File Structure

Each .yaml file consists of two sections, separated by ---:

yaml
# ═══ Part 1: Configuration (Header) ═══
appId: com.example.app         # Required: target App package name
name: My Flow                  # Optional: give your flow a name
tags: [test, login]            # Optional: tags for categorization
env:                           # Optional: variable definitions
  USERNAME: admin

# ═══ Separator ═══
---

# ═══ Part 2: Commands (Step List) ═══
- launchApp
- tapOn: "Login"
- inputText: "hello"

| Config Field | Required | Description |
| --- | --- | --- |
| appId | Required | Target App package name, e.g., com.android.settings |
| name | Optional | Display name of the flow |
| tags | Optional | Tag array for categorization |
| env | Optional | Environment variables, referenced via ${KEY} in commands |
| url | Optional | URL to open automatically when the flow starts |
| properties | Optional | Custom key-value pairs |
| onFlowStart | Optional | Commands to execute before the flow starts (see Advanced) |
| onFlowComplete | Optional | Commands to execute after the flow ends (see Advanced) |
| exceptionHandlers | Optional | Auto-handle popups (see Intermediate) |
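
As a small illustration of how the header and the commands connect, the sketch below defines one env value and references it via ${KEY}. The variable name SEARCH_TERM, the tag, and the "Search settings" label are illustrative assumptions, not names from this guide:

yaml
appId: com.android.settings
name: Search Settings
tags: [smoke]
env:
  SEARCH_TERM: wifi             # Hypothetical variable for this sketch
---
- launchApp
- tapOn: "Search settings"      # Placeholder label; focus the search field first
- inputText: "${SEARCH_TERM}"   # Resolved to "wifi" when the flow runs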

Your First Flow: Open Settings

Create a file my-first-flow.yaml:

yaml
appId: com.android.settings
name: My First Flow
---
- launchApp
- assertVisible: "Settings"

This flow does two things:

  1. Launches the Android Settings App
  2. Verifies that the word "Settings" appears on screen

If "Settings" appears, the flow succeeds; if it does not appear within 17 seconds, the flow fails with an error.


Tap an Element: tapOn

tapOn is the most commonly used command: it finds an element on screen and taps it.

Short Form: Text Directly

yaml
- tapOn: "Login"

The engine finds the element displaying "Login" on screen and taps it.

Object Form: More Control

yaml
- tapOn:
    text: "Login"

This has the exact same effect as the short form, but the object form allows additional parameters:

yaml
- tapOn:
    text: "Login"
    optional: true              # Don't throw an error if not found; skip and continue

yaml
- tapOn:
    id: "btn_login"             # Find by element ID (more precise than text)

yaml
- tapOn:
    text: "Submit"
    retryTapIfNoChange: true    # Automatically retry if the screen doesn't change after tap

yaml
- tapOn:
    text: "Loading Button"
    waitUntilVisible: true      # Wait for the element to appear before tapping
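
These parameters can presumably be combined in a single step. A minimal sketch, assuming the engine accepts several tapOn parameters together and that your app has an element with the hypothetical ID btn_skip_intro:

yaml
# Sketch: dismiss an onboarding button that may or may not be shown.
- tapOn:
    id: "btn_skip_intro"        # Hypothetical element ID, not from this guide
    optional: true              # Continue without error if the button is absent
    retryTapIfNoChange: true    # Retry if the tap does not change the screen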

Wait for Screen to Settle

yaml
- tapOn:
    text: "Login"
    waitToSettleTimeoutMs: 3000  # Wait for screen to settle after tap, up to 3 seconds

waitToSettleTimeoutMs is capped at 30000 milliseconds; larger values are automatically reduced to 30000.
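
To illustrate the limit, the step below asks for 60 seconds. Going by the rule above, the engine would wait at most 30000 ms:

yaml
- tapOn:
    text: "Login"
    waitToSettleTimeoutMs: 60000  # Above the limit, so the effective value is 30000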

Repeated Taps

yaml
- tapOn:
    text: "+"
    repeat: 5                   # Tap 5 times consecutively
    delay: 200                  # 200 ms between each tap (default: 100 ms)
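
If delay is omitted, the 100 ms default noted in the comment above should apply. A short sketch of that case:

yaml
- tapOn:
    text: "+"
    repeat: 3                   # Tap 3 times; no delay given, so the 100 ms default is used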

Input Text: inputText

Enters text into the currently focused input field.

yaml
- inputText: "hello world"

Typically used with tapOn to first tap the input field:

yaml
- tapOn: "Username"             # Tap the input field first
- inputText: "admin"            # Then enter text

Supports variables (see Intermediate):

yaml
- inputText: "${USERNAME}"

Numbers can also be entered directly:

yaml
- inputText: 123456
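
One caveat comes from YAML itself rather than from this engine: an unquoted value with leading zeros, such as 007123, is typically read by YAML parsers as a number, so the zeros may be dropped or the value reinterpreted. How this engine handles that case is not stated here, so the sketch below simply quotes the value to be safe:

yaml
- inputText: "007123"           # Quoted: a string, entered as the six characters 007123
- inputText: 007123             # Unquoted: may be parsed as a number and lose the zeros (avoid)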

Erase Text: eraseText

Deletes text from an input field.

yaml
- eraseText                     # Delete up to 50 characters (default)

Specify the number of characters to delete (two forms):

yaml
- eraseText: 10                 # Short form: delete 10 characters

- eraseText:
    charactersToErase: 10       # Object form: delete 10 characters
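
A common pattern is to clear a field before typing a new value. A minimal sketch using only the commands covered so far; the "Username" label and the assumption that the existing content is at most 50 characters are illustrative:

yaml
- tapOn: "Username"             # Focus the input field
- eraseText                     # Remove up to 50 existing characters (default)
- inputText: "new_user"         # Type the replacement value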

Swipe: swipe

Simulates a finger swipe gesture.

Directional Swipe

yaml
# Short form
- swipe: UP                     # Swipe up

# Object form
- swipe:
    direction: UP               # Swipe up (commonly used for browsing news or scrolling videos)

Four directions: UP, DOWN, LEFT, RIGHT.

Control Swipe Speed

yaml
- swipe:
    direction: UP
    duration: 400               # Swipe takes 400 ms (higher value = slower)

Random Speed

yaml
- swipe:
    direction: UP
    duration: [200, 500]        # Random duration between 200 and 500 ms each time

Swipe from an Element's Position

yaml
- swipe:
    direction: LEFT
    from:                       # Start swiping from the center of this element
      text: "Card Title"
    duration: 500

Don't Wait for Screen to Settle After Swipe

yaml
- swipe:
    direction: UP
    waitToSettleTimeoutMs: 0    # Don't wait for screen to settle after swipe (suitable for rapid consecutive swipes)
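
For example, a feed could be flicked through quickly by chaining swipes that skip the settle wait. A sketch, assuming three swipes are enough for your screen:

yaml
- swipe:
    direction: UP
    duration: 200
    waitToSettleTimeoutMs: 0    # Do not wait; the next swipe starts right away
- swipe:
    direction: UP
    duration: 200
    waitToSettleTimeoutMs: 0
- swipe:
    direction: UP
    duration: 200               # No override here, so the engine's normal settle wait applies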

Scroll: scroll

The simplest way to scroll: it swipes up once, which scrolls the page content down.

yaml
- scroll

Equivalent to swipe: { direction: UP }, but more concise.
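
A short sketch of a typical use: scroll once, then check that content further down the page is now visible. The text "Terms of Service" is a placeholder:

yaml
- scroll                        # Scroll the page once
- assertVisible: "Terms of Service"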


Back Button: back

Simulates pressing the Android back button.

yaml
- back

Can also be used to dismiss the keyboard.


Wait and Delay: sleep

Pauses execution for a specified duration.

Fixed Wait

yaml
- sleep: 2000                   # Wait 2 seconds (unit: milliseconds)

Random Wait

yaml
- sleep: [3000, 8000]           # Random wait between 3 and 8 seconds

Object Form

yaml
# Fixed wait
- sleep:
    duration: 2000

# Random range
- sleep:
    min: 3000
    max: 8000
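
Random waits are useful for pacing repeated actions irregularly. A sketch that combines sleep with the swipe command from the earlier section; the specific durations are arbitrary:

yaml
- swipe: UP
- sleep: [3000, 8000]           # Pause for a random 3 to 8 seconds
- swipe: UP
- sleep:
    min: 1000                   # Object form of a random range
    max: 2000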

Verify the Screen: assertVisible / assertNotVisible

Checks whether a certain element is present on screen. If the check fails, the flow stops with an error.

Assert Element Is Visible

yaml
- assertVisible: "Welcome back"  # "Welcome back" must appear on screen

By default, waits up to 17 seconds. If it appears within 17 seconds, the check passes; otherwise, an error is thrown.

Assert Element Is Not Visible

yaml
- assertNotVisible: "Loading..."  # "Loading..." must not appear on screen

The command checks continuously for 7 seconds. If the element never appears during that time, the check passes; if it does appear, an error is thrown.

Object Form

yaml
- assertVisible:
    text: "Login successful"
    id: "success_message"       # Match both text and ID simultaneously
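
The two assertions can be paired after an action: one confirms the expected result, the other guards against an unwanted message. A sketch in which "Invalid password" is a hypothetical error text, not one defined by this guide:

yaml
- tapOn: "Login"
- assertVisible: "Welcome back"         # Waits up to 17 seconds for the message
- assertNotVisible: "Invalid password"  # Fails if this text shows up during the 7-second check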

Launch and Stop Apps: launchApp / stopApp

Launch App

yaml
- launchApp                     # Launch the App specified by appId in the configuration

Launch a different App:

yaml
- launchApp: com.example.other  # Launch the App with the specified package name

Clear all App storage data on launch:

yaml
- launchApp:
    appId: com.example.app
    clearState: true            # Clear all App data

All launchApp Parameters

yaml
- launchApp:
    appId: com.example.app      # Specify package name (uses config appId if omitted)
    clearState: true            # Clear all App data
    stopApp: true               # Stop the running instance first (default: true)
    permissions:                # Grant permissions on launch
      android.permission.CAMERA: allow
    arguments:                  # Launch arguments
      key: "value"

Stop App

yaml
- stopApp                       # Stop the current App
- stopApp: com.example.app      # Stop the specified App

Force Kill App

yaml
- killApp: com.example.app

Clear App Data

yaml
- clearState                    # Clear current App data
- clearState: com.example.app   # Clear specified App data
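
Put together, these commands can reset an app before starting it. A sketch using only the commands above; com.example.app is the placeholder package name used throughout this guide:

yaml
- stopApp: com.example.app      # Stop the App if it is running
- clearState: com.example.app   # Clear its stored data
- launchApp: com.example.app    # Start it again from a clean state

The object form of launchApp with clearState: true, shown earlier, expresses the same intent in a single step.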

Hide Keyboard: hideKeyboard

yaml
- hideKeyboard

Can also be written as hide keyboard (with a space):

yaml
- hide keyboard
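
A typical use is to dismiss the keyboard when it covers the next element to tap. A sketch with placeholder labels and values:

yaml
- tapOn: "Password"
- inputText: "secret123"
- hideKeyboard                  # Dismiss the keyboard so the button below is reachable
- tapOn: "Login"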

Press Key: pressKey

Simulates pressing a physical or virtual key.

yaml
- pressKey: Home                # Go to home screen
- pressKey: Back                # Go back (equivalent to the back command)
- pressKey: Enter               # Enter/confirm
- pressKey: VolumeUp            # Volume up
- pressKey: VolumeDown          # Volume down

You can also pass an Android keyCode number directly:

yaml
- pressKey: 66                  # 66 = KEYCODE_ENTER

See the Cheat Sheet for the full list of key names.
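
For instance, Enter can submit a search field that has no visible button. A sketch; the "Search" label and the query text are placeholders:

yaml
- tapOn: "Search"
- inputText: "wifi"
- pressKey: Enter               # Submit the query (66 is KEYCODE_ENTER, per the note above)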


Getting Started Practice: Login Flow

Combine the commands you've learned to perform a real login operation:

yaml
appId: com.example.app
name: Login Flow
env:
  USERNAME: test_user
  PASSWORD: secret123
---
# 1. Launch the App
- launchApp

# 2. Enter username
- tapOn: "Username"
- inputText: "${USERNAME}"

# 3. Enter password
- tapOn: "Password"
- inputText: "${PASSWORD}"

# 4. Tap the login button (wait for the screen to settle before continuing)
- tapOn:
    text: "Login"
    waitToSettleTimeoutMs: 3000

# 5. Verify login success
- assertVisible: "Welcome"

Ready to write more flexible flows? → Intermediate

Powered by VMOS Edge Team