tribes-supertest
Real integration scenarios for Tribes deployments, driven through a checked-out Legion CLI.
Overview
This repo runs real deployment scenarios against cloud infrastructure and verifies what the deployed nodes actually booted and exposed.
- ../legion_kk is the only required external project.
- Legion is treated as the deployment authority.
- supertest records what Legion asked Guix to install and what the nodes actually booted.
The runner currently invokes Legion's headless CLI directly:
- CLI entrypoint: ../legion_kk/src/engine/cli-main.ts
- State: isolated per run under .state/supertest/...
- Artifacts: command logs, sanitized Legion state, and remote node diagnostics
Requirements
- This repo checked out locally
- ../legion_kk checked out beside it
- Node/npm available in the current shell
- Legion dependencies available in ../legion_kk
- A usable Legion kexec installer default, normally the generated mirror pin in ../legion_kk
Cloud/provider credentials must be present in the environment:
- LEGION_UNLOCK_PASSWORD
- HCLOUD_TOKEN
- OVH_APP_KEY or OVH_APPLICATION_KEY
- OVH_APP_SECRET or OVH_APPLICATION_SECRET
- OVH_CONSUMER_KEY
- SCW_ACCESS_KEY
- SCW_SECRET_KEY
- SCW_DEFAULT_PROJECT_ID
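A pre-run check along the following lines can catch a missing credential before any cloud time is spent. This is only a sketch: check_supertest_credentials is a hypothetical helper, not something this repo ships, and it only tests that each variable is non-empty in the current shell (it cannot tell whether the value is exported or valid).

```shell
# Hypothetical pre-run check; not part of the supertest repo.
# Prints each missing credential to stderr and returns 1 if any are absent.
check_supertest_credentials() {
  missing=0

  # Variables that have exactly one accepted name.
  for name in LEGION_UNLOCK_PASSWORD HCLOUD_TOKEN OVH_CONSUMER_KEY \
              SCW_ACCESS_KEY SCW_SECRET_KEY SCW_DEFAULT_PROJECT_ID; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done

  # The OVH app key and secret each accept either of two names.
  if [ -z "${OVH_APP_KEY:-}${OVH_APPLICATION_KEY:-}" ]; then
    echo "missing: OVH_APP_KEY or OVH_APPLICATION_KEY" >&2
    missing=1
  fi
  if [ -z "${OVH_APP_SECRET:-}${OVH_APPLICATION_SECRET:-}" ]; then
    echo "missing: OVH_APP_SECRET or OVH_APPLICATION_SECRET" >&2
    missing=1
  fi

  return "$missing"
}
```

Calling check_supertest_credentials before a scenario and stopping on a non-zero status is usually enough.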
Optional but commonly useful:
- SCW_DEFAULT_ZONE
- OVH_ENDPOINT
- SUPERTEST_KEEP_NODES=1
- SUPERTEST_KEXEC_IMAGE=/abs/path/to/guix-kexec-installer.tar.gz or SUPERTEST_KEXEC_IMAGE=https://mirror.example/tribes-1/guix-kexec-installer-x86_64-linux-latest.tar.gz
- SUPERTEST_CERT_MODE=self-signed (test-only mode to skip ACME and keep self-signed edge certs)
- hcloud and scw CLI tooling in the shell for manual inspection or intervention
Install
Install this repo's dependencies:
npm install
If you want to check the project locally before a live run:
npm run typecheck
npm test
npm run build
Check the Guix substitute servers before spending cloud time:
npm run preflight:substitutes
npm run preflight:substitutes -- --plugin sender
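The local checks and the substitute preflight can be chained so that a live run only starts once all of them pass. The wrapper below is a sketch under that assumption; supertest_prerun is a hypothetical name, and it only calls the npm scripts already listed above.

```shell
# Hypothetical wrapper; not part of the supertest repo.
# Runs the local checks and the substitute preflight in order and stops at
# the first failure. An optional plugin name is forwarded to the preflight.
supertest_prerun() {
  plugin="${1:-}"

  npm run typecheck || return 1
  npm test || return 1
  npm run build || return 1

  if [ -n "$plugin" ]; then
    npm run preflight:substitutes -- --plugin "$plugin" || return 1
  else
    npm run preflight:substitutes || return 1
  fi
}
```

For example, supertest_prerun sender runs the three local checks and then the preflight with --plugin sender.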
Delete leftover cloud resources without using Legion state:
scripts/cleanup-cloud-resources
scripts/cleanup-cloud-resources --dry-run
Basic Usage
List scenarios:
npm run scenario:list
Run the single-node scenario:
npm run scenario:single-node-init
Run the manual single-node scenario against an existing host:
SUPERTEST_MANUAL_HOST_IP=203.0.113.10 \
SUPERTEST_MANUAL_USERNAME=ubuntu \
SUPERTEST_MANUAL_PASSWORD=secret \
npm run scenario:manual-node-init
Run the single-node plugin rollout/rollback scenario:
npm run scenario:single-node-plugin-rollout-rollback
Run the single-node Sender ingest/HLS scenario:
npm run scenario:single-node-sender
Run the clustered Sender fanout/reboot scenario:
npm run scenario:cluster-sender-fanout-reboot
Run the cluster lifecycle scenario:
npm run scenario:cluster-lifecycle
Run the clustered plugin sync split-brain scenario:
npm run scenario:cluster-plugin-rollout-sync-split-brain
Keep created nodes around for inspection:
SUPERTEST_KEEP_NODES=1 npm run scenario:cluster-lifecycle
Run one scenario directly with the generic entrypoint:
npm run scenario -- single-node-init
npm run scenario -- manual-node-init
npm run scenario -- single-node-plugin-rollout-rollback
npm run scenario -- single-node-sender
npm run scenario -- cluster-sender-fanout-reboot
npm run scenario -- cluster-plugin-rollout-sync-split-brain
npm run scenario -- cluster-lifecycle
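Because the generic entrypoint takes the scenario name as an argument, several scenarios can be run back to back from one loop. The following is a minimal sketch; run_scenarios is a hypothetical helper, and it assumes npm run scenario exits non-zero when a scenario fails.

```shell
# Hypothetical batch runner; not part of the supertest repo.
# Runs each named scenario through the generic entrypoint, continues past
# failures, and returns non-zero if any scenario failed.
run_scenarios() {
  failed=""
  for scenario in "$@"; do
    echo "== $scenario"
    if ! npm run scenario -- "$scenario"; then
      failed="$failed $scenario"
    fi
  done

  if [ -n "$failed" ]; then
    echo "failed:$failed" >&2
    return 1
  fi
}
```

For example, run_scenarios single-node-init cluster-lifecycle runs both scenarios in order. Each one provisions real cloud resources, so a long list also means a long and billable run.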
Dev Branch Helper
For rapid guix-tribes dev-channel iteration, use:
scripts/test-dev-branch --plugin supertest single-node-plugin-rollout-rollback
The helper updates and pushes the hard-coded guix-tribes supertest-dev
branch, exports the required SUPERTEST_GUIX_TRIBES_* environment, and runs
the scenario with SUPERTEST_CERT_MODE=self-signed by default. It verifies that
the pinned tribes and optional tribes-plugin-$NAME commits are already
reachable from their origin remotes; it does not push those source repos.
Useful subcommands:
scripts/test-dev-branch prepare --plugin supertest
scripts/test-dev-branch reset
scripts/test-dev-branch env
scripts/test-dev-branch ssh <node>
scripts/test-dev-branch rpc <node> -- 'Node.self()'
Use --tribes-repo or --plugin-repo to point at a clean worktree when the
main checkout contains unrelated local work.
Implemented Scenarios
- single-node-init: Provisions one Hetzner init node and captures deployed channels, service status, and NBDE state.
- manual-node-init: Imports one existing Ubuntu-compatible host from SUPERTEST_MANUAL_HOST_IP, SUPERTEST_MANUAL_USERNAME, and SUPERTEST_MANUAL_PASSWORD, then captures deployed channels, service status, and NBDE state.
- single-node-plugin-rollout-rollback: Provisions one Hetzner init node, applies a plugin rollout through the public admin API, and rolls back to the pre-rollout generation.
- single-node-sender: Provisions one Hetzner init node, installs the sender plugin when needed, starts RTMP ingest through Legion, publishes an audio test stream, verifies HLS playlist and segment output, and stops the stream.
- cluster-sender-fanout-reboot: Builds a three-node OVH/Hetzner/Scaleway cluster, installs the sender plugin when needed, starts one origin plus two HLS edges through Legion, publishes a 6 Mbit/s video test stream, runs HLS viewers against every node, verifies Sender viewer-count rollups through the admin API, reboots one edge, and verifies recovery. Set SUPERTEST_SENDER_VIEWERS_PER_NODE to override the default of 3 viewers per node.
- cluster-plugin-rollout-sync-split-brain: Builds a three-node Hetzner/Scaleway/OVH cluster, applies the supertest plugin rollout, validates synced table writes and cluster pubsub across a temporary sync-port partition/rejoin, and rolls back every node.
- cluster-lifecycle: Builds a mixed Hetzner/Scaleway cluster, removes a node, reconciles NBDE, adds a replacement node, and reconciles again.
Rollout cross-repo implementation notes + progress tracker:
docs/ROLLOUT_CROSS_REPO_PLAN_PROGRESS.md
Artifacts
Each run writes to:
.state/supertest/<run-id>-<scenario>/
Inside a scenario directory you will typically find:
- config-summary.json
- scenario.json
- legion-checkout.json
- commands/
- snapshots/
Snapshots include:
- nodes-list.json
- providers-list.json
- legion-state.raw.json
- legion-state.sanitized.json
- remote/<node-id>/...
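When many runs have accumulated, it helps to locate the newest run directory for a scenario before opening its snapshots. The sketch below assumes the .state/supertest/<run-id>-<scenario>/ layout described above; latest_run_dir is a hypothetical helper, and its second argument is only there so a different artifact root can be passed in.

```shell
# Hypothetical helper; not part of the supertest repo.
# Prints the most recently modified run directory for a scenario, assuming
# the <root>/<run-id>-<scenario>/ layout. Returns 1 if none exists.
latest_run_dir() {
  scenario="$1"
  root="${2:-.state/supertest}"
  latest=""

  for dir in "$root"/*-"$scenario"/; do
    # An unmatched glob stays literal, so skip anything that is not a directory.
    [ -d "$dir" ] || continue
    if [ -z "$latest" ] || [ "$dir" -nt "$latest" ]; then
      latest="$dir"
    fi
  done

  [ -n "$latest" ] || return 1
  printf '%s\n' "${latest%/}"
}
```

For example, latest_run_dir single-node-init prints the newest matching directory under .state/supertest, or returns 1 when no run for that scenario exists yet.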
Remote diagnostics currently include checks such as:
- guix system describe
- herd status tribes
- herd status postgres
- curl http://127.0.0.1:4000/healthz
- /root/legion/tribes-admin.sh bootstrap-ready
- node public key
- LUKS UUID
- local boot-key presence
- clevis bindings
- peer Tang reachability
Important Notes
- supertest uses isolated Legion state per run. It does not reuse your normal Legion desktop state.
- guix system describe is the relevant proof for the installed system channels. guix describe is not sufficient for that check and should not be used for scenario assertions here.
- With SUPERTEST_KEEP_NODES=1, cleanup is skipped on purpose.
Cleanup
If a kept run leaves nodes behind, destroy them using the same isolated Legion state directory that created them.
Example:
env \
LEGION_STATE_DIR=.state/supertest/<run>/single-node-init/legion-state \
LEGION_CACHE_DIR=.state/supertest/<run>/single-node-init/legion-cache \
LEGION_APP_ROOT=../legion_kk \
LEGION_UNLOCK_PASSWORD="$LEGION_UNLOCK_PASSWORD" \
node --import tsx ../legion_kk/src/engine/cli-main.ts \
nodes destroy --materialize <node-id>
To inspect remaining tracked nodes in that state:
env \
LEGION_STATE_DIR=.state/supertest/<run>/single-node-init/legion-state \
LEGION_CACHE_DIR=.state/supertest/<run>/single-node-init/legion-cache \
LEGION_APP_ROOT=../legion_kk \
LEGION_UNLOCK_PASSWORD="$LEGION_UNLOCK_PASSWORD" \
node --import tsx ../legion_kk/src/engine/cli-main.ts \
nodes list --json
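Both commands above share the same environment, so a small wrapper can avoid retyping it. This is a sketch only: legion_in_run is a hypothetical helper, and it assumes the directory passed in contains the legion-state and legion-cache subdirectories used in the examples.

```shell
# Hypothetical wrapper; not part of the supertest repo.
# Usage: legion_in_run <dir containing legion-state and legion-cache> <cli args...>
#   legion_in_run .state/supertest/<run>/single-node-init nodes list --json
#   legion_in_run .state/supertest/<run>/single-node-init nodes destroy --materialize <node-id>
legion_in_run() {
  run_dir="$1"
  shift
  (
    # The subshell keeps these exports from leaking into the calling shell.
    export LEGION_STATE_DIR="$run_dir/legion-state"
    export LEGION_CACHE_DIR="$run_dir/legion-cache"
    export LEGION_APP_ROOT=../legion_kk
    # Fail early with a clear message if the unlock password is not set.
    export LEGION_UNLOCK_PASSWORD="${LEGION_UNLOCK_PASSWORD:?LEGION_UNLOCK_PASSWORD must be set}"
    node --import tsx "$LEGION_APP_ROOT/src/engine/cli-main.ts" "$@"
  )
}
```

As in the commands above, the relative ../legion_kk path means the wrapper has to be called from the root of this repo.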
If a run has to be aborted hard, or Legion state no longer matches provider reality, wipe provider-side resources directly:
scripts/cleanup-cloud-resources
This helper does not read or delete Legion state or supertest artifacts. It
uses the provider CLIs from the current shell to remove billable runtime
resources such as instances, volumes, snapshots, IP allocations, and load
balancers. By default it preserves non-test SSH keys; use --all-keys if the
provider credentials' project should be fully cleared.
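Since the helper deletes billable resources directly, it can be worth previewing with --dry-run and confirming before the real pass. The wrapper below is a sketch of that flow; cleanup_with_preview is a hypothetical name, and the optional script-path argument exists only so the wrapper can be pointed at a different copy of the script.

```shell
# Hypothetical wrapper; not part of the supertest repo.
# Shows the dry-run output, then runs the real cleanup only after an
# explicit "yes" on standard input.
cleanup_with_preview() {
  script="${1:-scripts/cleanup-cloud-resources}"

  "$script" --dry-run || return 1

  printf 'Delete the resources listed above? Type "yes" to continue: '
  read -r answer
  if [ "$answer" != "yes" ]; then
    echo "aborted"
    return 1
  fi

  "$script"
}
```

Any answer other than yes leaves the provider account untouched.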
Manual Intervention
The normal control path is Legion's CLI, but it is useful to have provider tooling available for inspection or emergency cleanup.
- hcloud: Useful for checking server status, IPs, volumes, rescue state, and deleting instances directly if Legion state and provider reality diverge.
- scw: Useful for checking Scaleway instances, IPs, volumes, bootscripts, and deleting or inspecting resources outside the test runner.
These tools are optional for normal runs, but they are practical when:
- a run is kept with SUPERTEST_KEEP_NODES=1
- a deployment fails halfway through and you want provider-side visibility
- Legion state needs to be compared against provider-side reality
- cleanup has to be completed manually
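For a quick provider-side view to compare against Legion's node list, the server listings from both CLIs can be printed together. This is a sketch: provider_inventory is a hypothetical helper, it uses only the plain hcloud server list and scw instance server list commands, and it assumes both CLIs are already authenticated in the current shell.

```shell
# Hypothetical helper; not part of the supertest repo.
# Lists provider-side servers with whichever of the two CLIs is installed.
provider_inventory() {
  found=0

  if command -v hcloud >/dev/null 2>&1; then
    echo "== Hetzner (hcloud)"
    hcloud server list
    found=1
  fi

  if command -v scw >/dev/null 2>&1; then
    echo "== Scaleway (scw)"
    scw instance server list
    found=1
  fi

  if [ "$found" -ne 1 ]; then
    echo "neither hcloud nor scw found in PATH" >&2
    return 1
  fi
}
```

OVH is not covered here because this README does not list an OVH CLI.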
Useful Environment Variables
- SUPERTEST_RUN_ID: Override the generated run id.
- SUPERTEST_ARTIFACT_ROOT: Override the artifact directory root.
- SUPERTEST_TRIBE_NAME: Override the configured tribe name.
- SUPERTEST_TRIBE_DOMAIN: Override the configured tribe domain.
- SUPERTEST_ACME_EMAIL: Override the ACME email passed to Legion.
- SUPERTEST_CERT_MODE: acme (default) or self-signed. When set to self-signed, supertest injects LEGION_TEST_CERT_MODE=self-signed for Legion CLI commands.
- SUPERTEST_HETZNER_INSTANCE: Override the Hetzner instance/offer selection.
- SUPERTEST_OVH_INSTANCE: Override the OVH instance/offer selection.
- SUPERTEST_SCALEWAY_INSTANCE: Override the Scaleway instance/offer selection.
- SUPERTEST_HETZNER_BOOT_MODE: Override Hetzner boot mode.
- SUPERTEST_SCALEWAY_BOOT_MODE: Override Scaleway boot mode.
- SUPERTEST_BOOTSTRAP_PASSWORD_ENV: Override the env var name used for Legion config init.
- SUPERTEST_PLUGIN_NAME: Override the plugin name used by plugin rollout scenarios (defaults to supertest).
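Overrides are easiest to apply per run, so they do not linger in the shell between scenarios. The launcher below is a sketch of that pattern for two of the variables; run_with_overrides is a hypothetical helper, and rejecting values other than acme and self-signed is its own choice, made because those are the only two documented modes.

```shell
# Hypothetical launcher; not part of the supertest repo.
# Usage: run_with_overrides <scenario> [run-id] [cert-mode]
run_with_overrides() {
  scenario="$1"
  run_id="${2:-}"
  cert_mode="${3:-acme}"

  # Only the two documented certificate modes are accepted.
  case "$cert_mode" in
    acme|self-signed) ;;
    *)
      echo "unsupported SUPERTEST_CERT_MODE: $cert_mode" >&2
      return 2
      ;;
  esac

  (
    # Export inside a subshell so the overrides apply to this run only.
    export SUPERTEST_CERT_MODE="$cert_mode"
    if [ -n "$run_id" ]; then
      export SUPERTEST_RUN_ID="$run_id"
    fi
    npm run scenario -- "$scenario"
  )
}
```

For example, run_with_overrides single-node-init demo-1 self-signed runs the single-node scenario with a fixed run id and self-signed certificates.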