# Tribes local-control API

The local-control broker is a small Guile daemon listening on a Unix-domain socket. It fronts every operator action that a Tribes deployment can take on its own host:

- **resolve** a `SystemTarget` into a build plan.
- **prepare** a build (pull channels + `guix system build`) without activating it.
- **commit** a previously-prepared generation (`guix system switch-generation`).
- **rollback** to a retained store path or, failing that, rebuild from a plan and switch.
- **abort** an in-flight job.
- discover channel update candidates from Guix's existing Git checkouts.
- inspect **status** and **generations**.

This document specifies the wire schema. The BEAM client at `tribes/lib/tribes/local_control.ex` should be updated to match it.

## Transport

- HTTP/1.1 over a Unix-domain socket. The path is configurable via `TRIBES_LOCAL_CONTROL_SOCKET` (default `/var/run/tribes/local-control.sock`).
- Permissions: socket owned by `root:tribes`, mode `0660`.
- Request bodies are JSON (`Content-Type: application/json`).
- Responses are JSON.

## Concurrency model

The broker runs a single POSIX worker thread. The HTTP request thread is never blocked on a long-running Guix call: any operation that may exceed about a second (`prepare`, `commit`, `rollback`) is enqueued on the worker and returns `202 Accepted` immediately. The caller then polls `GET /v1/deployment/status` for completion.

There is at most one job in flight at any time. A new submission with the same `plan_hash` as the running job is **idempotent**: the broker returns the in-flight snapshot rather than queuing a duplicate. A submission with a different `plan_hash` while another job runs returns `409 busy`.

## Endpoints

### `GET /v1/deployment` and `GET /v1/deployment/status`

Returns a status snapshot. Polling interval recommendation: 1 s during an active job, with linear back-off to 5 s after the first minute of polling.

Snapshot fields:

- `schemaVersion`: string, currently `"2"`.
- `ok`: boolean.
- `status`: high-level state. One of: `idle | queued | running | pulling | building | switching | completed | failed | aborted`.
- `phase`: fine-grained phase identical to `status` for in-flight jobs; `ready` after a successful `prepare`, `active` after a successful `commit`/`rollback`.
- `job_id`: opaque identifier of the in-flight or last-completed job. `"job-N"` where N is monotonic for the broker process lifetime.
- `plan_hash`: the plan hash this job is operating on.
- `started_at`, `last_event_at`: RFC 3339 timestamps.
- `store_path`: the deployment target's `/gnu/store/...-system` path: the prepared store path after `prepare`, or the selected profile store path after `commit`/`rollback`.
- `selectedSystem`: canonical `/gnu/store/...-system` path currently selected by `/var/guix/profiles/system`.
- `runningSystem`: canonical `/gnu/store/...-system` path currently exposed by `/run/current-system`.
- `generation_number`: the system profile generation number.
- `gc_pinned`: boolean. `true` when the broker holds a GC root via `--root=` so the prepared system is not collected before a `commit`.
- `built_at`, `activated_at`: RFC 3339 timestamps when present.
- `code`: typed error code on failure (see *Error taxonomy*).
- `reason`: human-readable error message on failure.
- `plugins`: array of plugin names in the deployed plan.

### `GET /v1/deployment/generations`

Returns the current system channel provenance plus the list of recorded generations in newest-first order. The top-level `current_channels` field is parsed from `/run/current-system/channels.scm` when present and lets callers identify the initial installed channel pins before local-control has prepared its first generation.
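As a hedged sketch of how a caller might use this response to find the installed channel pins, the function below prefers the `channels` of the generation whose `status` is `"active"` (both fields belong to the generation entry schema in this section) and otherwise falls back to `current_channels`. The top-level key `generations` and the function name `installed_channel_pins` are assumptions, since this document does not name the list field, and `current_channels` is assumed to carry the same kind of channel objects as a generation entry.

```python
from typing import Any, Dict, List


def installed_channel_pins(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return channel pins for the installed system.

    Assumes the recorded generations live under a top-level `generations`
    key (hypothetical name), in newest-first order.
    """
    for generation in response.get("generations") or []:
        # Prefer the active generation when it recorded channel pins.
        if generation.get("status") == "active" and generation.get("channels"):
            return generation["channels"]
    # Fallback for the initial install that local-control did not prepare.
    return response.get("current_channels") or []
```

Returning an empty list when neither source is present leaves the "unknown pins" decision to the caller.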
Each generation entry:

```json
{
  "store_path": "/gnu/store/...-system",
  "generation_number": 42,
  "plan_hash": "plan-abcd...",
  "status": "active" | "ready" | "superseded",
  "gc_pinned": true,
  "built_at": "2026-04-25T13:01:02Z",
  "activated_at": "2026-04-25T13:01:42Z",
  "channels": [
    {
      "channel_id": "guix-tribes",
      "name": "tribes",
      "url": "https://git.example.test/tribes/guix-tribes.git",
      "branch": "master",
      "commit": "abc123...",
      "position": 10
    }
  ]
}
```

`channels` is present for generations prepared by local-control from a plan that included `resolved_channels`. After `guix pull` succeeds, local-control records the pulled profile's `guix describe --format=json` commit for each matching channel, so branch-based plans become exact generation pins. Active generation `channels` are the preferred source for the currently installed channel commit; callers can fall back to top-level `current_channels` for the initial non-local-control install.

### `POST /v1/channels/updates`

Synchronous. Discovers update candidates for configured channels by using the Guix channel Git checkouts under `$XDG_CACHE_HOME/guix/checkouts` or `$HOME/.cache/guix/checkouts`. The endpoint does not maintain its own checkout or update database; it locates the checkout whose `remote.origin.url` matches the requested channel URL, runs `git fetch --tags --prune origin`, and inspects Git refs directly.

Body:

```json
{
  "mode": "semver_tags",
  "limit": 20,
  "channels": [
    {
      "id": "...",
      "name": "tribes",
      "url": "https://git.example.test/tribes/guix-tribes.git",
      "branch": "master",
      "current_commit": "abc123..."
    }
  ]
}
```

Response:

```json
{
  "schemaVersion": "1",
  "ok": true,
  "mode": "semver_tags",
  "channels": [
    {
      "id": "...",
      "name": "tribes",
      "url": "https://git.example.test/tribes/guix-tribes.git",
      "branch": "master",
      "ok": true,
      "current_commit": "abc123...",
      "branch_head": "def456...",
      "candidates": [
        {
          "tag": "v1.2.3",
          "commit": "def456...",
          "short_commit": "def4567",
          "subject": "release 1.2.3",
          "message": "release 1.2.3\n",
          "committed_at": "2026-06-07T10:00:00+00:00"
        }
      ]
    }
  ]
}
```

Supported modes:

- `semver_tags`: default. Candidates are tags matching `vMAJOR.MINOR.PATCH` with optional prerelease/build suffixes, reachable from the configured branch head, and descendants of `current_commit` when one is provided.
- `commits`: advanced mode. Candidates are recent branch commits after `current_commit` when it is an ancestor of the branch head, otherwise recent commits from the branch head.

Guix channel authentication remains enforced later by `deployment/prepare`; this endpoint is discovery only.

Per-channel failures are returned inline with `ok: false` and an error code, e.g. `checkout_not_found`, `fetch_failed`, `branch_not_found`, or `unsupported_mode`.

### `POST /v1/deployment/resolve`

Synchronous. Body: a `SystemTarget` JSON object.

Response:

- `200` with `{ "schemaVersion": "2", "ok": true, "plan": { ... } }` on success. The `plan` object includes a `plan_hash` and is suitable for feeding into `prepare`.
- `409` with the resolver error envelope on capability/manifest/trust failures.

### `POST /v1/deployment/prepare`

Asynchronous. Body: a plan object containing `plan_hash` and `resolved_plugins`.

- `202` with `{ "schemaVersion": "2", "status": "queued", "job_id": "...", "plan_hash": "...", "started_at": "..." }` on accept (or on idempotent re-submit of the running job).
- `409` with `{ "ok": false, "status": "busy", "reason": "deployment already in progress", "job_id": "...", "plan_hash": "...", ... }` when another plan is already in flight.
- `400` on validation error.

The job pulls channels, runs `guix system build --root=...`, pre-realizes the target system closure and the store inputs needed for the post-switch Shepherd service-definition upgrade, registers the resulting GC root, and records a `ready` generation. Keeping this work in `prepare` means missing substitutes or unexpectedly large local builds fail before the system profile is switched. The final snapshot is visible at `GET /v1/deployment/status`.

### `POST /v1/deployment/commit`

Asynchronous. Body: `{ "plan_hash": "..." }`.

- `202` on accept. The job switches the system profile to the previously-prepared generation, then re-runs activation and Guix's normal Shepherd service-definition upgrade step inside the pulled/current Guix profile used for the prepare build. Activation runs with `GUIX_NEW_SYSTEM` set to the selected generation so `/run/current-system` follows the profile, and the NBDE boot-store activation hook copies GRUB-referenced `/gnu/store` items into `/boot` for nodes whose real store is on encrypted root.

  Like upstream `guix system reconfigure`, this does not imply that every already-running service process was restarted. Tribes may then schedule an asynchronous `tribes` service restart as part of higher-level rollout convergence, while `tribes-local-control` self-update remains a separate deferred concern.

  On later boots, `tribes-boot-start` starts the app only after Legion-managed secret files exist, keeping the first secrets-free boot quiet while allowing reboot recovery.

  On success the snapshot reaches `phase: "active"` with `status: "completed"`.
- `409` if no generation is prepared for that `plan_hash`. The snapshot's error code is `generation_not_prepared`.
- `409 busy` if another job is in flight.

### `POST /v1/deployment/rollback`

Asynchronous. Body:

```json
{
  "store_path": "/gnu/store/...-system",
  "plan": { ...optional fallback plan... }
}
```

The broker walks these cases in order:

1. The requested `store_path` is the selected system → just record the activation, no build, no switch.
2. We have a recorded local-control generation number for that `store_path` → switch to it directly.
3. The `store_path` appears in Guix's system profile links (`/var/guix/profiles/system-*-link`), even if local-control did not record it → switch to that profile generation directly. This covers the installed baseline generation used by emergency/public rollback.
4. The store path is gone but `plan` is supplied → re-prepare and commit.

If none apply the snapshot reports `code: "rollback_infeasible"`.

Current limitation: rollback does not run core/plugin down migrations. The public Tribes admin rollback flow currently omits the fallback `plan` on purpose so explicit rollback to a baseline generation cannot replay the rollout being rolled back.

### `POST /v1/deployment/abort`

Synchronous. Marks the in-flight job as aborted and writes a snapshot with `status: "aborted"`. (v1: does not yet SIGTERM a running helper subprocess; the operation completes when the helper next checks back in.)

## Error taxonomy

Every failed operation returns a `code` matching one of these tokens:

- `channel_untrusted`: channel references a signer not in the `TrustedSigner` table.
- `signature_invalid`: a channel's commit signature failed verification.
- `channel_commit_unreachable`: the configured commit cannot be fetched from the channel URL.
- `missing_capability`: a plugin requires a capability that no other plugin provides.
- `host_capability_missing`: the pinned host and built-in plugin manifests have an unsatisfied capability contract.
- `capability_cycle`: the plugin capability graph contains a cycle.
- `duplicate_plugin`: the system target lists the same plugin twice.
- `manifest_invalid`: a requested plugin name is unknown to the channel registry.
- `host_api_mismatch`: the resolved plan needs a host API version the node cannot honour.
- `migration_target_conflict`: two plugins disagree about a migration target version.
- `build_failed`: `guix system build` returned non-zero.
- `system_closure_preload_failed`: the prepared system's referenced store closure could not be realized before switching.
- `service_upgrade_preload_failed`: the post-switch Shepherd service-definition upgrade inputs could not be realized before switching.
- `switch_failed`: `guix system switch-generation` returned non-zero.
- `rollback_infeasible`: the broker cannot reach the requested store path by either retained generation or rebuild.
- `helper_crashed`: `tribes-guix-helper` exited without emitting a structured terminal frame.
- `busy`: another job is in flight; the request was rejected.
- `invalid_request`: payload missed a required field or violated a limit.

## Helper protocol (internal)

The broker spawns `tribes-guix-helper` for every long operation and parses its stdout as NDJSON. The helper emits one of:

```json
{"event":"phase","phase":"pulling","ts":"..."}
{"event":"phase","phase":"building","ts":"...","derivation":"/gnu/store/..."}
{"event":"done","store_path":"/gnu/store/...","generation_number":42,"ts":"..."}
{"event":"error","code":"channel_commit_unreachable","message":"...","details":{...},"ts":"..."}
```

The broker uses the last `event: "phase"` frame to update its snapshot in real time, and the final `done` or `error` frame to compute the operation result. If the helper exits without a terminal frame the broker synthesizes `{ "code": "helper_crashed", "details": { "exit_status": N, "signal": S } }`.

This protocol is not part of the public API; it exists so the broker can stay small while still surfacing typed errors instead of regex-parsing `guix` stderr.
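
The frame handling described in this section can be sketched as a small reducer. This is a minimal illustration in Python rather than the broker's Guile code: the function name `reduce_helper_output`, its return shape, and the choice to skip non-JSON lines are all assumptions, and only the frame fields shown above are relied on.

```python
import json
from typing import Iterable, Optional


def reduce_helper_output(lines: Iterable[str],
                         exit_status: Optional[int] = None,
                         signal: Optional[int] = None) -> dict:
    """Fold helper NDJSON frames into the last phase and a final result.

    Hypothetical helper for illustration; not the broker's actual code.
    """
    phase = None    # most recent `phase` frame, used for live snapshots
    result = None   # terminal `done` or `error` frame, if one was seen

    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            frame = json.loads(raw)
        except json.JSONDecodeError:
            continue  # assumption: non-JSON noise on stdout is ignored
        if not isinstance(frame, dict):
            continue
        event = frame.get("event")
        if event == "phase":
            phase = frame.get("phase")
        elif event in ("done", "error"):
            result = frame  # keep the last terminal frame

    if result is None:
        # No terminal frame was emitted: synthesize the crash error.
        result = {
            "event": "error",
            "code": "helper_crashed",
            "details": {"exit_status": exit_status, "signal": signal},
        }
    return {"phase": phase, "result": result}
```

A caller would pass the helper's stdout lines plus the process exit information once the helper has exited; a streaming broker would instead update the snapshot on each `phase` frame as it arrives.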