Getting Started
Start from zero and learn to write basic automation flows.
File Structure
Each .yaml file consists of two sections, separated by ---:
# ═══ Part 1: Configuration (Header) ═══
appId: com.example.app # Required: target App package name
name: My Flow # Optional: give your flow a name
tags: [test, login] # Optional: tags for categorization
env: # Optional: variable definitions
  USERNAME: admin
# ═══ Separator ═══
---
# ═══ Part 2: Commands (Step List) ═══
- launchApp
- tapOn: "Login"
- inputText: "hello"

| Config Field | Required | Description |
|---|---|---|
| appId | Required | Target App package name, e.g., com.android.settings |
| name | Optional | Display name of the flow |
| tags | Optional | Tag array for categorization |
| env | Optional | Environment variables, referenced via ${KEY} in commands |
| url | Optional | URL to open automatically when the flow starts |
| properties | Optional | Custom key-value pairs |
| onFlowStart | Optional | Commands to execute before the flow starts (see Advanced) |
| onFlowComplete | Optional | Commands to execute after the flow ends (see Advanced) |
| exceptionHandlers | Optional | Auto-handle popups (see Intermediate) |
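As a minimal sketch of how the header and the command list fit together, the flow below defines one variable under env and references it with ${KEY} in a command. The package name, the SEARCH_TERM variable, and the "Search" label are placeholders chosen for illustration, not part of any real app.

```yaml
# Header: configuration
appId: com.example.app          # placeholder package name
name: Env Demo                  # hypothetical flow name
env:
  SEARCH_TERM: weather          # hypothetical variable
---
# Commands: executed top to bottom
- launchApp
- tapOn: "Search"               # assumes the app shows an element labeled "Search"
- inputText: "${SEARCH_TERM}"   # references the env value defined in the header
```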
Your First Flow: Open Settings
Create a file my-first-flow.yaml:
appId: com.android.settings
name: My First Flow
---
- launchApp
- assertVisible: "Settings"
This flow does two things:
- Launches the Android Settings App
- Verifies that the word "Settings" appears on screen
If "Settings" appears, the flow succeeds; if it doesn't appear within 17 seconds, the flow throws an error.
Tap an Element: tapOn
tapOn is the most commonly used command: it finds an element on screen and taps it.
Short Form: Text Directly
- tapOn: "Login"
The engine finds the element displaying "Login" on screen and taps it.
Object Form: More Control
- tapOn:
    text: "Login"
This has exactly the same effect as the short form, but the object form allows additional parameters:
- tapOn:
    text: "Login"
    optional: true # Don't throw an error if not found; skip and continue

- tapOn:
    id: "btn_login" # Find by element ID (more precise than text)

- tapOn:
    text: "Submit"
    retryTapIfNoChange: true # Automatically retry if the screen doesn't change after tap

- tapOn:
    text: "Loading Button"
    waitUntilVisible: true # Wait for the element to appear before tapping
Wait for Screen to Settle
- tapOn:
    text: "Login"
    waitToSettleTimeoutMs: 3000 # Wait for screen to settle after tap, up to 3 seconds
waitToSettleTimeoutMs has an upper limit of 30000 milliseconds; values above that are automatically capped at 30000.
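To illustrate the cap, here is a sketch with a value above the limit. Under the rule just stated, the engine should treat it as 30000; the "Login" label is a placeholder.

```yaml
- tapOn:
    text: "Login"                  # placeholder label
    waitToSettleTimeoutMs: 45000   # above the limit, so expected to be capped at 30000
```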
Repeated Taps
- tapOn:
    text: "+"
    repeat: 5 # Tap 5 times consecutively
    delay: 200 # 200 ms between each tap (default: 100 ms)
Input Text: inputText
Enters text into the currently focused input field.
- inputText: "hello world"
Typically used with tapOn to first tap the input field:
- tapOn: "Username" # Tap the input field first
- inputText: "admin" # Then enter text
Supports variables (see Intermediate):
- inputText: "${USERNAME}"
Numbers can also be entered directly:
- inputText: 123456
Erase Text: eraseText
Deletes text from an input field.
- eraseText # Delete up to 50 characters (default)
Specify the number of characters to delete (two forms):
- eraseText: 10 # Short form: delete 10 characters
- eraseText:
    charactersToErase: 10 # Object form: delete 10 characters
Swipe: swipe
Simulates a finger swipe gesture.
Directional Swipe
# Short form
- swipe: UP # Swipe up
# Object form
- swipe:
    direction: UP # Swipe up (commonly used for browsing news or scrolling videos)
Four directions: UP, DOWN, LEFT, RIGHT.
Control Swipe Speed
- swipe:
    direction: UP
    duration: 400 # Swipe takes 400 ms (higher value = slower)
Random Speed
- swipe:
    direction: UP
    duration: [200, 500] # Random duration between 200 and 500 ms each time
Swipe from an Element's Position
- swipe:
    direction: LEFT
    from: # Start swiping from the center of this element
        text: "Card Title"
    duration: 500
Don't Wait for Screen to Settle After Swipe
- swipe:
    direction: UP
    waitToSettleTimeoutMs: 0 # Don't wait for screen to settle after swipe (suitable for rapid consecutive swipes)
Scroll: scroll
The simplest way to scroll: swipes up once, which moves the page content down.
- scroll
Equivalent to swipe: { direction: UP }, but more concise.
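A sketch of reaching content below the fold by repeating the command. It assumes two scrolls are enough to bring a hypothetical "About phone" entry into view; the right count depends on the screen and the list length.

```yaml
- scroll
- scroll
- assertVisible: "About phone"   # hypothetical label; fails if it does not appear in time
```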
Back Button: back
Simulates pressing the Android back button.
- back
Can also be used to dismiss the keyboard.
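A sketch of that use: after typing into a field, back closes the on-screen keyboard so that elements underneath it can be tapped. It assumes the keyboard is open at that point and uses the placeholder labels "Username" and "Login". If the keyboard is not showing, the same command acts as a normal back press, so place it with care.

```yaml
- tapOn: "Username"      # placeholder label; focuses the field and opens the keyboard
- inputText: "admin"
- back                   # assumed to dismiss the keyboard here, not leave the screen
- tapOn: "Login"         # placeholder label
```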
Wait and Delay: sleep
Pauses execution for a specified duration.
Fixed Wait
- sleep: 2000 # Wait 2 seconds (unit: milliseconds)
Random Wait
- sleep: [3000, 8000] # Random wait between 3 and 8 seconds
Object Form
# Fixed wait
- sleep:
    duration: 2000
# Random range
- sleep:
    min: 3000
    max: 8000
Verify the Screen: assertVisible / assertNotVisible
Checks whether a certain element is present on screen. If the check fails, the flow stops with an error.
Assert Element Is Visible
- assertVisible: "Welcome back" # "Welcome back" must appear on screen
By default, waits up to 17 seconds. If it appears within 17 seconds, the check passes; otherwise, an error is thrown.
Assert Element Is Not Visible
- assertNotVisible: "Loading..." # "Loading..." must not appear on screen
Checks continuously for 7 seconds. If it never appears, the check passes; if it does appear, an error is thrown.
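The two assertions can be combined to check a screen transition. The sketch below first waits for a success message, then watches for an error message; both labels are hypothetical. Note that the second step occupies the full 7-second window when it passes.

```yaml
- tapOn: "Login"                      # placeholder label
- assertVisible: "Welcome back"       # passes as soon as the text appears, within 17 seconds
- assertNotVisible: "Login failed"    # hypothetical error text; must stay absent for 7 seconds
```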
Object Form
- assertVisible:
    text: "Login successful"
    id: "success_message" # Match both text and ID simultaneously
Launch and Stop Apps: launchApp / stopApp
Launch App
- launchApp # Launch the App specified by appId in the configuration
Launch a different App:
- launchApp: com.example.other # Launch the App with the specified package name
Clear all App storage data on launch:
- launchApp:
    appId: com.example.app
    clearState: true # Clear all App data
All launchApp Parameters
- launchApp:
    appId: com.example.app # Specify package name (uses config appId if omitted)
    clearState: true # Clear all App data
    stopApp: true # Stop the running instance first (default: true)
    permissions: # Grant permissions on launch
        android.permission.CAMERA: allow
    arguments: # Launch arguments
        key: "value"
Stop App
- stopApp # Stop the current App
- stopApp: com.example.app # Stop the specified App
Force Kill App
- killApp: com.example.app
Clear App Data
- clearState # Clear current App data
- clearState: com.example.app # Clear specified App data
Hide Keyboard: hideKeyboard
- hideKeyboard
Can also be written as hide keyboard (with a space):
- hide keyboard
Press Key: pressKey
Simulates pressing a physical or virtual key.
- pressKey: Home # Go to home screen
- pressKey: Back # Go back (equivalent to the back command)
- pressKey: Enter # Enter/confirm
- pressKey: VolumeUp # Volume up
- pressKey: VolumeDown # Volume down
You can also pass an Android keyCode number directly:
- pressKey: 66 # 66 = KEYCODE_ENTER
See the Cheat Sheet for the full list of key names.
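A common use of pressKey is submitting a field that has no visible confirm button. The sketch below assumes a hypothetical search field labeled "Search" that reacts to the Enter key; whether Enter submits is up to the app.

```yaml
- tapOn: "Search"        # placeholder label
- inputText: "wifi"
- pressKey: Enter        # assumed to submit the query in this hypothetical app
```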
Getting Started Practice: Login Flow
Combine the commands you've learned to perform a real login operation:
appId: com.example.app
name: Login Flow
env:
  USERNAME: test_user
  PASSWORD: secret123
---
# 1. Launch the App
- launchApp
# 2. Enter username
- tapOn: "Username"
- inputText: "${USERNAME}"
# 3. Enter password
- tapOn: "Password"
- inputText: "${PASSWORD}"
# 4. Tap the login button (wait for the screen to settle before continuing)
- tapOn:
    text: "Login"
    waitToSettleTimeoutMs: 3000
# 5. Verify login success
- assertVisible: "Welcome"
Ready to write more flexible flows? → Intermediate
