
openwrt-tests CI execution flow

How a CI run works end-to-end: from a GitHub Actions trigger to a firmware test on a physical DUT in a remote lab.

Companion to openwrt-tests onboarding (infrastructure setup) and lab architecture (VLAN and coordinator design).


1. Two independent planes

The flow combines two separate communication channels that should not be confused:

| Plane | Protocol | Purpose |
|---|---|---|
| Control | WebSocket (port 20408) | Coordinator manages places, reservations, locks. labgrid-client and the Labgrid pytest plugin use this. |
| Hardware access | SSH over WireGuard | Runner reaches the lab host to access physical resources (serial, power, DUT SSH). The coordinator is not involved here. |

2. Components per location

Datacenter VM (global-coordinator)

| Component | Role |
|---|---|
| labgrid-coordinator (port 20408) | WebSocket server. Registers places from places.yaml, tracks locks and reservations. |
| places.yaml | Generated by Ansible from labnet.yaml. Lists every place (DUT) across all labs. |
| GitHub Actions self-hosted runners | Processes that poll GitHub via HTTPS and execute workflow jobs. Labeled global-coordinator. |
| WireGuard peers | One per lab host. Gives each lab a private IP reachable from the VM. |

Each lab host (e.g. labgrid-fcefyn, labgrid-aparcar)

| Component | Role |
|---|---|
| labgrid-exporter | Registers local DUT resources (serial, power, network) with the coordinator over WebSocket. |
| exporter.yaml | Declares the physical resources of each place: USB serial path, PDUDaemon port, DUT IP + VLAN interface. |
| pdudaemon | Controls DUT power via relay or PDU. Exposes an HTTP API on localhost:16421. |
| ser2net | Exposes USB serial ports as TCP sockets. Used by the Labgrid SerialDriver. |
| dnsmasq | DHCP + TFTP server per VLAN. DUTs boot an initramfs via TFTP. |
| labgrid-bound-connect | SSH ProxyCommand (run via sudo). Bridges a TCP connection to a DUT IP bound to a specific VLAN interface using socat. |
| WireGuard peer | Tunnel to the VM. |

flowchart TD
    subgraph vm ["Datacenter VM (public IP)"]
        RUNNERS["GitHub runners"]
        COORD["labgrid-coordinator :20408"]
    end

    subgraph lab ["Lab host (e.g. labgrid-fcefyn)"]
        EXP["labgrid-exporter"]
        BC["labgrid-bound-connect"]
        PDU["pdudaemon :16421"]
        SER["ser2net"]
        DNS["dnsmasq / TFTP"]
        DUTs["DUTs (192.168.1.1%vlanXXX)"]
    end

    RUNNERS -- "1" --> COORD
    EXP -- "2" --> COORD
    RUNNERS -- "3" --> BC
    BC -- "4" --> DUTs
    PDU -- "5" --> DUTs
    SER -- "5" --> DUTs
    DNS -- "5" --> DUTs
| # | Connection | Detail |
|---|---|---|
| 1 | Runners → Coordinator | WebSocket localhost:20408 (reserve / lock / unlock) |
| 2 | Exporter → Coordinator | WebSocket via WireGuard (register resources) |
| 3 | Runners → bound-connect | SSH via WireGuard (LG_PROXY) |
| 4 | bound-connect → DUTs | socat TCP bound to the correct VLAN interface |
| 5 | Local services → DUTs | pdudaemon (power), ser2net (serial), dnsmasq (DHCP/TFTP) |

All connections between the VM and the lab host traverse a WireGuard tunnel. places.yaml on the VM is generated by Ansible from labnet.yaml.


3. Matrix strategy: one job per device

The generate-matrix job reads labnet.yaml and produces a JSON list of all devices across all labs. GitHub Actions expands it into parallel jobs, one per device.
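As a rough model of this step, the sketch below flattens a simplified labnet.yaml structure into one matrix entry per device. The key names (labs, devices, device, proxy) are assumptions for illustration; the real labnet.yaml schema and the exact matrix fields may differ.

```python
# Sketch of the generate-matrix job: flatten labs into one job entry
# per device. The dict below stands in for a parsed labnet.yaml.
import json

labnet = {
    "labs": {
        "labgrid-fcefyn": {"devices": ["openwrt_one", "bananapi_bpi-r4"]},
        "labgrid-hauke": {"devices": ["linksys_e8450"]},
    }
}

def build_matrix(labnet: dict) -> list[dict]:
    """One matrix entry per device, tagged with its lab (proxy)."""
    matrix = []
    for lab, cfg in labnet["labs"].items():
        for device in cfg["devices"]:
            matrix.append({"device": device, "proxy": lab})
    return matrix

# GitHub Actions consumes this JSON and expands it into parallel jobs.
matrix_json = json.dumps({"include": build_matrix(labnet)})
```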

flowchart LR
    LN["labnet.yaml\n(devices + labs)"]
    GM["generate-matrix job\n(ubuntu-latest)"]
    J1["Job: openwrt_one\n(labgrid-fcefyn)\nruns-on: global-coordinator"]
    J2["Job: bananapi_bpi-r4\n(labgrid-fcefyn)\nruns-on: global-coordinator"]
    J3["Job: linksys_e8450\n(labgrid-hauke)\nruns-on: global-coordinator"]

    LN --> GM
    GM -- "matrix JSON" --> J1
    GM -- "matrix JSON" --> J2
    GM -- "matrix JSON" --> J3

Each job receives its own matrix.device, matrix.proxy, matrix.target, matrix.firmware values.


4. Environment variables and $GITHUB_ENV

Variables are passed between steps via $GITHUB_ENV: a temporary file the runner creates per job. Each echo "VAR=val" >> $GITHUB_ENV writes a line; the runner reads the file after each step and injects the variables into the next step's process environment.

sequenceDiagram
    participant GH as GitHub Actions
    participant Step1 as Step: Set environment
    participant Step2 as Step: Wait for free device
    participant Step3 as Step: Run test
    participant Step4 as Step: Poweroff and unlock

    GH->>Step1: matrix.device, matrix.proxy, matrix.firmware, matrix.version_url
    Step1->>Step1: wget firmware from OpenWrt mirrors
    Step1-->>GH: LG_IMAGE=/path/to/firmware<br/>LG_PROXY=labgrid-fcefyn<br/>(via $GITHUB_ENV)

    GH->>Step2: (reads LG_PROXY from env)
    Step2->>Step2: labgrid-client reserve --wait --shell device=X
    Note right of Step2: eval sets LG_TOKEN in current shell
    Step2-->>GH: LG_TOKEN=xxx<br/>LG_PLACE=+<br/>LG_ENV=targets/device.yaml<br/>(via $GITHUB_ENV)
    Step2->>Step2: labgrid-client -p +$LG_TOKEN lock

    GH->>Step3: (reads all LG_* from env)
    Step3->>Step3: uv run pytest tests/

    GH->>Step4: (reads LG_TOKEN from env)
    Step4->>Step4: labgrid-client power off
    Step4->>Step4: labgrid-client -p +$LG_TOKEN unlock
| Variable | Set by | Used by | Value |
|---|---|---|---|
| LG_IMAGE | Step "Set environment" | Labgrid plugin (!template $LG_IMAGE in target YAML) | Local path to the firmware file |
| LG_PROXY | Step "Set environment" | labgrid-client, Labgrid plugin | Lab proxy name (e.g. labgrid-fcefyn) |
| LG_TOKEN | Step "Wait for free device" via eval | labgrid-client lock/unlock | Reservation token from the coordinator |
| LG_PLACE | Step "Wait for free device" | Labgrid plugin (!template "$LG_PLACE") | + (active reservation) |
| LG_ENV | Step "Wait for free device" | Labgrid plugin | targets/<device>.yaml |
| LG_COORDINATOR | Not set (uses default) | labgrid-client, Labgrid plugin | localhost:20408 (runner and coordinator share the VM) |

!template in target YAML

Target files (targets/<device>.yaml) use !template to expand environment variables at Labgrid load time:

resources:
  RemotePlace:
    name: !template "$LG_PLACE"   # expands to "+"

images:
  root: !template $LG_IMAGE       # expands to /path/to/firmware
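As a simplified model (not Labgrid's actual loader code), the `!template` expansion behaves like Python `string.Template` substitution against the process environment:

```python
# Simplified model of !template resolution: substitute environment
# variables into the YAML value at load time.
import os
from string import Template

os.environ["LG_PLACE"] = "+"                   # set by the workflow step
os.environ["LG_IMAGE"] = "/path/to/firmware"   # set by the workflow step

def expand(value: str) -> str:
    """Expand $VAR references the way a !template value resolves."""
    return Template(value).substitute(os.environ)

place = expand("$LG_PLACE")    # the "+" placeholder: active reservation
image = expand("$LG_IMAGE")    # local firmware path downloaded earlier
```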

5. Full CI sequence

sequenceDiagram
    participant GH as GitHub (HTTPS)
    participant R as Runner (VM)
    participant LC as labgrid-client
    participant COORD as labgrid-coordinator (VM :20408)
    participant PP as pytest + Labgrid plugin
    participant LAB as Lab host
    participant DUT as DUT

    GH-->>R: trigger job (HTTPS poll)
    R->>R: checkout repo, install uv

    Note over R: Step "Set environment"
    R->>R: wget firmware from mirrors
    R->>R: write LG_IMAGE, LG_PROXY to $GITHUB_ENV

    Note over R: Step "Wait for free device"
    R->>LC: labgrid-client reserve --wait --shell device=X
    LC->>COORD: WebSocket: reserve(device=X)
    COORD-->>LC: LG_TOKEN=xxx (place allocated)
    LC->>COORD: WebSocket: lock(+$LG_TOKEN)
    R->>R: write LG_TOKEN, LG_PLACE, LG_ENV to $GITHUB_ENV

    Note over R: Step "Run test"
    R->>PP: uv run pytest tests/
    PP->>COORD: WebSocket: getResources(place=+)
    COORD-->>PP: serial=USBSerial@lab, PDU=localhost:16421@lab, IP=192.168.1.1%vlanXXX
    PP->>LAB: SSH (LG_PROXY=labgrid-fcefyn via WireGuard)
    LAB->>LAB: labgrid-bound-connect (socat TCP → 192.168.1.1:22 bound to vlanXXX)
    LAB->>DUT: TCP connection to DUT
    PP-->>DUT: flash firmware via TFTP + U-Boot serial
    PP-->>DUT: SSH to DUT (via bound-connect)
    DUT-->>PP: test results

    Note over R: Step "Poweroff and unlock"
    R->>LC: labgrid-client power off
    LC->>COORD: WebSocket: call PDUDaemonDriver.off
    COORD->>LAB: forward to exporter → pdudaemon → relay → DUT power off
    R->>LC: labgrid-client unlock +$LG_TOKEN
    LC->>COORD: WebSocket: unlock(+$LG_TOKEN)

6. Role of labgrid-bound-connect

labgrid-bound-connect is a Python script installed on each lab host (not on the VM). It is invoked as an SSH ProxyCommand when the runner opens a connection to a DUT.

Problem it solves: all DUTs share the same IP (192.168.1.1), each on a different VLAN interface (vlan100, vlan101, ...). A normal TCP connect from the lab host would use the default route and miss the right VLAN. The script uses socat with so-bindtodevice=vlanXXX to force the connection out through the correct interface.

Runner (VM)
  └── SSH → lab host (via WireGuard)
              └── labgrid-bound-connect vlan101 192.168.1.1 22
                    └── socat STDIO TCP4:192.168.1.1:22,so-bindtodevice=vlan101
                          └── DUT on VLAN 101

The script runs under sudo (passwordless via /etc/sudoers). It is deployed by the Ansible playbook to /usr/local/sbin/labgrid-bound-connect on each lab host.
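The essential trick is just argument construction for socat. The sketch below builds the invocation shown in the diagram above; it models only the happy path, while the real script also handles errors and runs under sudo.

```python
# Hedged sketch of the socat command labgrid-bound-connect executes:
# bind the outgoing TCP connection to one specific VLAN interface so
# the shared DUT IP (192.168.1.1) resolves to the right device.
def socat_argv(interface: str, host: str, port: int) -> list[str]:
    """Build a socat call pinned to a VLAN interface via so-bindtodevice."""
    return [
        "socat",
        "STDIO",
        f"TCP4:{host}:{port},so-bindtodevice={interface}",
    ]

argv = socat_argv("vlan101", "192.168.1.1", 22)
# → ['socat', 'STDIO', 'TCP4:192.168.1.1:22,so-bindtodevice=vlan101']
```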


7. Summary: who calls what

| Action | Caller | Target | Protocol |
|---|---|---|---|
| Reserve place | labgrid-client (runner) | coordinator | WebSocket |
| Lock place | labgrid-client (runner) | coordinator | WebSocket |
| Get resources for place | Labgrid pytest plugin | coordinator | WebSocket |
| SSH to DUT | Labgrid pytest plugin | lab host → DUT | SSH over WireGuard + bound-connect |
| Serial access | Labgrid pytest plugin | lab host ser2net | TCP over WireGuard |
| Power control | labgrid-client / plugin | coordinator → exporter → pdudaemon | WebSocket → HTTP |
| Unlock place | labgrid-client (runner) | coordinator | WebSocket |
| Register resources | labgrid-exporter (lab) | coordinator | WebSocket (outbound from lab) |