system: validate installed rollback workflow

2026-04-05 01:39:24 +02:00
parent b3b1ba2489
commit 9dae4e5c84
9 changed files with 1170 additions and 75 deletions

@@ -48,13 +48,12 @@ That path remains the active runtime boundary used by activation and service wir
Fruix avoids in-place mutation of an older deployed closure.
The validated rollback story today is:
The validated rollback story now has two layers:
- keep the earlier declaration
- rebuild or rematerialize it
- boot or redeploy that earlier closure again
- declaration-level rollback by rebuilding/redeploying an earlier declaration
- installed-system rollback between already-recorded generations on the target itself
That is Guix-like in spirit even though Fruix does not yet expose the same installed-system rollback command surface.
That is Guix-like in spirit, although Fruix still exposes a smaller installed-system workflow than Guix's more mature `reconfigure` model.
### Generation-style metadata and roots
@@ -78,7 +77,7 @@ Guix heavily reuses its profile-generation model and represents a lot of meaning
Fruix keeps the **semantics** but uses a more explicit metadata-oriented layout for installed systems.
Current Fruix layout:
Current Fruix layout starts as:
```text
/var/lib/fruix/system/
@@ -92,6 +91,20 @@ Current Fruix layout:
install.scm
```
After a validated installed-system switch, Fruix also records:
```text
/var/lib/fruix/system/
rollback -> generations/1
rollback-generation
generations/
2/
closure -> /frx/store/...-fruix-system-...
metadata.scm
provenance.scm
install.scm
```
Why Fruix does this:
- it makes deployment state easier to inspect directly
@@ -154,27 +167,35 @@ Validated examples:
So compared with Guix-on-Linux intuition, Fruix operators should be more explicit about target-device selection during installation and installer-media validation.
## 6. Fruix does not yet have Guix-equivalent installed-system generation commands
## 6. Fruix now has a minimal installed-system generation command surface, but it is still smaller than Guix's
This is the biggest current operational gap.
This remains the biggest operational gap, but it is no longer a complete gap.
Fruix does **not** yet provide a mature equivalent of the familiar Guix System operator flow around in-place generation switching and rollback commands.
Installed Fruix systems now provide a small in-guest helper:
Today, Fruix rollback is mostly:
- `fruix system status`
- `fruix system switch /frx/store/...-fruix-system-...`
- `fruix system rollback`
- declaration-driven
- rebuild/redeploy based
What this gives you today:
rather than:
- explicit current-generation tracking
- explicit rollback-generation tracking
- in-place switching between already-staged closures on the installed target
- rollback without reinstalling the whole system image again
- switch current system generation in place through a dedicated command
What it still does **not** give you yet compared with Guix:
So if you come from Guix, assume that Fruix currently has:
- a mature `reconfigure`-style workflow that builds and stages the new closure from inside the target system
- automatic closure transfer/fetch as part of `switch`
- the broader generation-management UX Guix operators expect
So if you come from Guix, assume that Fruix now has:
- strong closure/store semantics
- explicit install artifacts
- explicit generation metadata roots
- but a less mature installed-system generation UX
- a real but still modest installed-system switch/rollback UX
## Where Fruix is intentionally trying to improve on Guix's representation
@@ -202,10 +223,11 @@ If you are already comfortable with Guix, the safest Fruix mental model today is
- closure path
- source provenance metadata
- install metadata
5. think of rollback today as:
- “redeploy the earlier declaration again”
rather than:
- “switch to an already-managed previous generation in place”
5. think of rollback in two layers:
- if the target already has the desired closure staged locally:
- use `fruix system rollback`
- otherwise:
- redeploy the earlier declaration again
## Status summary
@@ -222,4 +244,4 @@ It differs most from Guix in:
- source-provenance emphasis
- installer-medium-oriented workflows
- generation-layout representation
- and the still-maturing installed-system generation command surface
- and an installed-system generation command surface that now exists, but is still much smaller than Guix's

@@ -30,6 +30,10 @@ Fruix currently has:
- `/var/lib/fruix/system`
- explicit installed-system retention roots under:
- `/frx/var/fruix/gcroots`
- a validated installed-system generation switch/rollback workflow via:
- `fruix system status`
- `fruix system switch`
- `fruix system rollback`
Validated boot modes are still:
@@ -42,30 +46,35 @@ The validated Phase 18 installation work currently uses:
## Latest completed achievement
### 2026-04-04 — Phase 19.2 completed
### 2026-04-04 — Phase 19.3 completed
Fruix now records an explicit installed-system generation layout and retention-root model instead of relying mainly on harness knowledge.
Fruix now has a validated installed-system operator workflow for switching to a staged candidate generation and rolling back to the recorded previous generation.
Highlights:
- added explicit installed-system generation layout under:
- `/var/lib/fruix/system`
- added explicit installed-system retention roots under:
- `/frx/var/fruix/gcroots`
- installed targets now record a first-generation deployment directory containing:
- `closure`
- `metadata.scm`
- `provenance.scm`
- `install.scm`
- `/run/current-system` remains the runtime boundary and still points directly at the active closure path
- added Guix-oriented operator notes in:
- `docs/GUIX_DIFFERENCES.md`
- updated deployment workflow documentation to reflect the new explicit generation model
- installed systems now ship an in-guest Fruix deployment helper at:
- `/usr/local/bin/fruix`
- validated in-guest command surface:
- `fruix system status`
- `fruix system switch /frx/store/...-fruix-system-...`
- `fruix system rollback`
- switching now records explicit rollback state under:
- `/var/lib/fruix/system/rollback`
- `/var/lib/fruix/system/rollback-generation`
- switching now records explicit rollback GC roots under:
- `/frx/var/fruix/gcroots/rollback-system`
- the validated installed-system workflow now supports:
- stage candidate closure in `/frx/store`
- switch to generation 2
- reboot into the candidate
- rollback to generation 1
- reboot into the restored current system
Validation:
- `PASS phase19-generation-layout-qemu`
- regression re-check:
- `PASS phase19-installed-system-rollback-qemu`
- regression re-checks:
- `PASS phase19-generation-layout-qemu`
- `PASS phase18-installer-iso`
Reports:
@@ -74,6 +83,7 @@ Reports:
- `docs/GUIX_DIFFERENCES.md`
- `docs/reports/phase19-deployment-workflow-freebsd.md`
- `docs/reports/phase19-generation-layout-freebsd.md`
- `docs/reports/phase19-installed-system-rollback-freebsd.md`
## Recent major milestones
@@ -99,6 +109,6 @@ Reports:
Per `docs/PLAN_4.md`, the next planned step is:
- **Phase 19.3** — validate installed-system rollback through the intended operator-facing workflow
- **Phase 20.1** — validate a Fruix-managed development environment for native FreeBSD base work
Phase 19.2 is now complete: Fruix has an explicit installed-system generation layout and retention-root model on FreeBSD.
Phase 19.3 is now complete: Fruix validates installed-system generation switching and rollback through the intended operator-facing workflow.

@@ -0,0 +1,226 @@
# Phase 19.3: installed-system rollback workflow on FreeBSD
Date: 2026-04-04
## Goal
Phase 19.3 is about validating installed-system rollback through the intended operator-facing workflow, not only through host-side build/image redeploy harnesses.
The key question was:
- can an already-installed Fruix system move between recorded generations coherently, using an operator-facing command surface on the target itself?
## Decision
The current Fruix solution is intentionally modest.
Fruix now provides a small installed-system helper on the target itself:
- `/usr/local/bin/fruix`
Validated in-guest commands:
- `fruix system status`
- `fruix system switch /frx/store/...-fruix-system-...`
- `fruix system rollback`
Important scope choice:
- `switch` assumes the candidate closure is already present on the target's `/frx/store`
- Fruix does **not** yet fetch or transfer that closure onto the target automatically
That keeps Phase 19.3 focused on generation-state correctness rather than introducing a larger store-transfer story prematurely.
## Implemented model
Installed systems now support the following validated operator pattern:
1. build a candidate closure with the host-side Fruix frontend
2. stage that closure into the installed system's `/frx/store`
3. run:
- `fruix system switch /frx/store/...candidate...`
4. reboot into the candidate generation
5. if needed, run:
- `fruix system rollback`
6. reboot into the recorded rollback generation
The installed system now records explicit rollback state under:
```text
/var/lib/fruix/system/
current -> generations/N
current-generation
rollback -> generations/M
rollback-generation
generations/
...
```
and explicit rollback reachability under:
```text
/frx/var/fruix/gcroots/
current-system -> /frx/store/...current...
rollback-system -> /frx/store/...rollback...
system-1 -> ...
system-2 -> ...
```
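The recorded state above can be inspected with standard tools. The following is a minimal sketch, not the actual `fruix system status` implementation. It assumes that `current-generation` and `rollback-generation` are plain text files holding a bare generation number (this report does not state their format) and that each generation directory carries a `closure` symlink. The function name `show_state` is hypothetical.

```sh
#!/bin/sh
# Hypothetical inspection helper; not the real `fruix system status`.
# Assumption: the *-generation files hold a bare generation number.
show_state() {
    state_dir=${1:-/var/lib/fruix/system}
    for role in current rollback; do
        gen_file="$state_dir/$role-generation"
        if [ -f "$gen_file" ]; then
            gen=$(cat "$gen_file")
            # Each generation directory is assumed to carry a `closure`
            # symlink pointing into the store.
            closure=$(readlink "$state_dir/generations/$gen/closure")
            echo "$role generation=$gen closure=$closure"
        else
            echo "$role generation=none"
        fi
    done
}
```

Passing a scratch directory as the first argument lets the sketch run against a copy of the state tree rather than the live `/var/lib/fruix/system`.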
## Code changes
### `modules/fruix/system/freebsd/render.scm`
Added a generated in-guest Fruix deployment helper script under:
- `usr/local/bin/fruix`
That helper now:
- reports installed-system state with `fruix system status`
- stages a new current generation with `fruix system switch`
- stages the recorded rollback generation with `fruix system rollback`
- updates:
- `/var/lib/fruix/system/current`
- `/var/lib/fruix/system/current-generation`
- `/var/lib/fruix/system/rollback`
- `/var/lib/fruix/system/rollback-generation`
- `/frx/var/fruix/gcroots/current-system`
- `/frx/var/fruix/gcroots/rollback-system`
- `/frx/var/fruix/gcroots/system-N`
- refreshes the ESP bootloader file from the selected closure's `boot/loader.efi`
A practical implementation detail mattered here:
- replacing `/run/current-system` with a remove-then-recreate strategy caused the live shell environment to break while the link was absent
- switching that update to an atomic symlink replacement for `/run/current-system` closed that gap and made the in-guest operator command reliable
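The difference between the two strategies can be sketched in shell. This is an illustration of the general technique, not the code generated by `render.scm`: the new link is created under a temporary name and then renamed over the old one, so the link name never disappears. The function name `replace_symlink` and the temporary-name scheme are assumptions made for this example.

```sh
#!/bin/sh
# Sketch of atomic symlink replacement (hypothetical helper name).
# rename(2) swaps the destination in one step, so there is no moment at
# which the link is absent, unlike `rm link && ln -s target link`.
replace_symlink() {
    target=$1
    link=$2
    staging="$link.tmp.$$"
    ln -s "$target" "$staging" || return 1
    # FreeBSD mv needs -h and GNU mv needs -T so that an existing link to a
    # directory is replaced instead of being followed.
    mv -h "$staging" "$link" 2>/dev/null ||
        mv -T "$staging" "$link" ||
        { rm -f "$staging"; return 1; }
}
```

The rename is only atomic when the temporary name and the final link live on the same filesystem, which is why the sketch creates the temporary link next to the destination.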
### `modules/fruix/system/freebsd/media.scm`
Updated installed rootfs staging so that installed targets expose:
- `/usr/local/bin/fruix -> /run/current-system/usr/local/bin/fruix`
Also bumped the explicit generation-layout version from:
- `1` to `2`
because the installed-system model now includes operator-driven switch/rollback state as part of the validated layout story.
### `modules/fruix/system/freebsd/model.scm`
Updated generated-file metadata so the system closure records:
- `usr/local/bin/fruix`
as part of the generated operating-system file set.
## New validation harness
Added:
- `tests/system/run-phase19-installed-system-rollback-qemu.sh`
This harness validates the actual installed-system operator flow on local `QEMU/UEFI/TCG`.
## Validation flow
The harness now performs all of the following:
1. installs a current system image directly to a target disk image
2. builds a distinct candidate closure
- in the validated harness, the candidate differs by host name, so the closure identity changes cleanly without a heavier base-version rebuild
3. stages the candidate closure and its referenced store items into the installed target's `/frx/store`
4. boots the installed current system
5. validates initial state:
- current generation = `1`
- current closure = current installed closure
- no rollback generation yet recorded
6. runs:
- `fruix system switch /frx/store/...candidate...`
7. validates staged switch state:
- current generation = `2`
- rollback generation = `1`
- current closure = candidate closure
- rollback closure = original current closure
- generation 2 metadata/install files were written
8. reboots and validates boot into the candidate closure
9. runs:
- `fruix system rollback`
10. validates staged rollback state:
- current generation = `1`
- rollback generation = `2`
- current closure = original closure
- rollback closure = candidate closure
11. reboots and validates boot back into the original current system
12. confirms post-rollback service state:
- `fruix-shepherd` running
- `sshd` running
- activation log still shows success
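The generation bookkeeping that steps 5 through 10 check can be modelled in a few lines. This is a toy model under stated assumptions, not the harness or the in-guest helper: it tracks only generation numbers in a scratch directory, and the names `model_switch` and `model_rollback` are hypothetical.

```sh
#!/bin/sh
# Toy model of the current/rollback pointer bookkeeping. The real helper
# also updates symlinks, GC roots, and the ESP loader; none of that is
# modelled here.
model_switch() {
    dir=$1
    new_gen=$2
    # The generation being left becomes the rollback target.
    cp "$dir/current-generation" "$dir/rollback-generation"
    echo "$new_gen" > "$dir/current-generation"
}

model_rollback() {
    dir=$1
    [ -f "$dir/rollback-generation" ] || {
        echo "no rollback generation recorded" >&2
        return 1
    }
    old_current=$(cat "$dir/current-generation")
    # Swap the two pointers: rollback becomes current and vice versa.
    cp "$dir/rollback-generation" "$dir/current-generation"
    echo "$old_current" > "$dir/rollback-generation"
}
```

Starting from generation `1` with no rollback recorded, a switch to `2` followed by a rollback leaves current at `1` and rollback at `2`, which matches the states validated in steps 7 and 10.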
## Passing validation
Passing result:
- `PASS phase19-installed-system-rollback-qemu`
Validated metadata summary:
```text
current_closure_path=/frx/store/4debd106d62f14594ba1612e1e7105f1658bf5f4075d6e5db5436efeaf929d90-fruix-system-fruix-freebsd-current
candidate_closure_path=/frx/store/54fb14e6071b8e5704a5dc75e2881c2f0533767771c26c4181f57afea88d1e8b-fruix-system-fruix-freebsd-canary
current_host_name=fruix-freebsd-current
candidate_host_name=fruix-freebsd-canary
final_current_generation=1
final_current_closure=/frx/store/4debd106d62f14594ba1612e1e7105f1658bf5f4075d6e5db5436efeaf929d90-fruix-system-fruix-freebsd-current
final_rollback_generation=2
final_rollback_closure=/frx/store/54fb14e6071b8e5704a5dc75e2881c2f0533767771c26c4181f57afea88d1e8b-fruix-system-fruix-freebsd-canary
installed_system_switch=ok
installed_system_rollback=ok
```
## Regression checks
After landing the installed-system switch/rollback workflow, the following regression checks still pass:
- `PASS phase19-generation-layout-qemu`
- `PASS phase18-installer-iso`
That means the new in-guest generation-management path did not regress:
- the previously validated explicit generation layout
- or the UEFI installer ISO boot/install path
## Relationship to Guix
This phase does **not** claim that Fruix now matches Guix's full installed-system UX.
What Fruix now has is:
- explicit generation state on disk
- explicit current/rollback pointers
- a minimal installed-system operator command surface
- validated switching and rollback between already-staged closures
What still remains compared with Guix:
- building/staging the candidate closure from inside the target system itself
- automatic closure transfer/fetch as part of `switch`
- a richer long-term generation lifecycle policy
## Conclusion
Phase 19.3 is complete.
Fruix now validates an actual installed-system rollback workflow on FreeBSD:
- the target system itself can report current/rollback state
- it can switch to a staged candidate generation
- it can reboot into that candidate generation
- it can roll back to the recorded prior generation
- and it can reboot into the restored current system
That closes the Phase 19 deployment story from:
- documented deployment workflow
- to explicit generation layout
- to validated installed-system operator rollback behavior

@@ -11,9 +11,10 @@ This document defines the current canonical Fruix workflow for:
- installing a declarative system onto an image or disk
- booting through installer media
- rolling forward to a candidate system
- rolling back to an earlier declared system
- switching an installed system to a staged candidate generation
- rolling an installed system back to an earlier recorded generation
This is the Phase 19.1 operator-facing view of the system model already implemented in earlier phases.
This is now the Phase 19 operator-facing view of the system model as validated through explicit installed-system generation switching and rollback.
## Core model
@@ -168,6 +169,34 @@ Use this when you want to:
- install from an ISO-attached Fruix environment
- test the same install model on more realistic VM paths
### Installed-system generation commands
Installed Fruix systems now also ship a small in-guest deployment helper at:
- `/usr/local/bin/fruix`
Current validated in-guest commands are:
```sh
fruix system status
fruix system switch /frx/store/...-fruix-system-...
fruix system rollback
```
Current intended usage:
1. build a candidate closure on the operator side with `./bin/fruix system build`
2. ensure that candidate closure is present on the installed target's `/frx/store`
3. run `fruix system switch /frx/store/...` on the installed system
4. reboot into the staged candidate generation
5. if needed, run `fruix system rollback`
6. reboot back into the recorded rollback generation
Important current limitation:
- `fruix system switch` does **not** yet fetch or copy the candidate closure onto the target for you
- it assumes the selected closure is already present in the installed system's `/frx/store`
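Because `switch` relies on the closure already being staged, an operator may want a pre-flight check before running it. The sketch below is one possible check, not part of the Fruix helper. It assumes the closure is a directory under the store, and it verifies only that the top-level path exists, not that every referenced store item is present. The function name `closure_is_staged` is hypothetical.

```sh
#!/bin/sh
# Hypothetical pre-flight check to run before `fruix system switch`.
# Assumption: a staged closure is a directory directly under the store.
closure_is_staged() {
    candidate=$1
    store=${2:-/frx/store}
    case $candidate in
        "$store"/?*) ;;
        *)
            echo "not a store path: $candidate" >&2
            return 1
            ;;
    esac
    [ -d "$candidate" ] || {
        echo "closure not staged: $candidate" >&2
        return 1
    }
}
```

On an installed system this could guard the switch, for example `closure_is_staged "$candidate" && fruix system switch "$candidate"`, where `$candidate` holds the full store path of the closure.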
## Deployment patterns
### 1. Build-first workflow
@@ -261,7 +290,7 @@ Installed Fruix systems now record an explicit first-generation deployment layou
- `/var/lib/fruix/system`
Current validated shape:
Initial installed shape:
```text
/var/lib/fruix/system/
@@ -275,6 +304,24 @@ Current validated shape:
install.scm # present on installed targets
```
After a validated in-place switch, the layout extends to:
```text
/var/lib/fruix/system/
current -> generations/2
current-generation
rollback -> generations/1
rollback-generation
generations/
1/
...
2/
closure -> /frx/store/...-fruix-system-...
metadata.scm
provenance.scm
install.scm # deployment metadata for the switch operation
```
Installed systems also now create explicit GC-root-style deployment links under:
- `/frx/var/fruix/gcroots`
@@ -284,7 +331,9 @@ Current validated shape:
```text
/frx/var/fruix/gcroots/
current-system -> /frx/store/...-fruix-system-...
rollback-system -> /frx/store/...-fruix-system-...
system-1 -> /frx/store/...-fruix-system-...
system-2 -> /frx/store/...-fruix-system-...
```
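Since these links are plain symlinks into the store, a dangling root is easy to detect. The following sketch is an illustrative audit, not an existing Fruix command. It assumes every entry under the roots directory is a symlink to a store path, and the function name `dangling_gcroots` is hypothetical.

```sh
#!/bin/sh
# Hypothetical audit: list GC-root links whose store target is missing.
# Prints one line per dangling link and returns non-zero if any were found.
dangling_gcroots() {
    roots=${1:-/frx/var/fruix/gcroots}
    missing=0
    for root in "$roots"/*; do
        # -L: the entry is a symlink; ! -e: its target does not exist.
        if [ -L "$root" ] && [ ! -e "$root" ]; then
            echo "dangling: $root -> $(readlink "$root")"
            missing=1
        fi
    done
    return "$missing"
}
```

A check of this kind could run after a switch or rollback to confirm that `current-system` and `rollback-system` both still resolve.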
Important detail:
@@ -294,7 +343,9 @@ Important detail:
## Roll-forward workflow
The current Fruix roll-forward model is declaration-driven.
The current Fruix roll-forward model now has two validated layers.
### Declaration/deployment roll-forward
Canonical process:
@@ -314,47 +365,61 @@ Canonical process:
5. boot or install the candidate
6. validate the candidate closure in the booted system
The important property is that the candidate closure appears beside the earlier one in `/frx/store` rather than mutating it in place.
### Installed-system generation roll-forward
When the candidate closure is already present on an installed target:
1. run `fruix system switch /frx/store/...candidate...`
2. confirm the staged state with `fruix system status`
3. reboot into the candidate generation
4. validate the new active closure after reboot
The important property is still that the candidate closure appears beside the earlier one in `/frx/store` rather than mutating it in place.
## Rollback workflow
The current canonical rollback workflow is also declaration-driven.
The current canonical rollback workflow also now has two validated layers.
Today, rollback means:
### Declaration/deployment rollback
You can still roll back by redeploying the earlier declaration:
1. retain the earlier declaration that produced the known-good closure
2. rebuild or rematerialize that earlier declaration
3. redeploy or reboot that earlier artifact again
Concretely, the usual rollback choices are:
Concretely, the usual declaration-level rollback choices are:
- rebuild the earlier declaration with `fruix system build` and confirm the old closure path reappears
- boot the earlier declaration again through `fruix system image`
- reinstall the earlier declaration through `fruix system install`, `installer`, or `installer-iso` if the deployment medium itself must change
This rollback story has already been validated at the closure/image/deployment level:
### Installed-system generation rollback
- side-by-side base-version coexistence in `/frx/store`
- roll-forward to a candidate closure
- rollback by rebuilding and booting the earlier declaration again
- validation on both local QEMU and the approved XCP-ng VM path
When an installed target already has both the current and rollback generations recorded:
1. run `fruix system rollback`
2. confirm the staged state with `fruix system status`
3. reboot into the rollback generation
4. validate the restored active closure after reboot
This installed-system rollback path is now validated on local `QEMU/UEFI/TCG`.
### Important scope note
This is not yet the same thing as a first-class installed-system generation switch command.
This is still not the same thing as Guix's full `reconfigure`/generation UX.
Current rollback is:
Current installed-system rollback is intentionally modest:
- **redeploy the earlier declaration again**
What still remains for later Phase 19 work is making rollback itself operator-driven at the installed-system layer, rather than only declaration/redeploy driven.
- it switches between already-recorded generations on the target
- it does not yet fetch candidate closures onto the machine for you
- it does not yet expose a richer history-management or generation-pruning policy
Still pending:
- previous-generation tracking beyond the initial explicit generation-1 layout
- an explicit rollback target link distinct from `current`
- an operator-facing installed-system rollback workflow
- generation switching without full redeploy
- operator-facing closure transfer or fetch onto installed systems
- multi-generation lifecycle policy beyond the validated `current` and `rollback` pointers
- a fuller `reconfigure`-style installed-system UX
## Provenance and deployment identity
@@ -375,16 +440,16 @@ Operators should retain metadata from successful candidate and current deploymen
## Current limitations
The deployment workflow is now coherent, but it is not yet the final generation-management story.
The deployment workflow is now coherent, and Fruix now has a validated installed-system switch/rollback path, but it is still not the final generation-management story.
Not yet first-class:
- a dedicated `switch` or `reconfigure` command
- an installed-system rollback command that moves between generations in place
- multi-generation retention and previous-generation tracking beyond generation 1
- generation switching policy independent of full redeploy
- host-side closure transfer/fetch onto installed systems as part of `fruix system switch`
- a fuller `reconfigure` workflow that builds and stages the new closure from inside the target environment
- multi-generation lifecycle policy beyond the validated `current` and `rollback` pointers
- generation pruning and retention policy independent of full redeploy
Those are the next logical steps after the current explicit-generation layout.
Those are the next logical steps after the current explicit-generation switch/rollback model.
## Summary
@@ -395,6 +460,8 @@ The current canonical Fruix deployment model is:
- **materialize** the artifact appropriate to the deployment target
- **boot or install** that artifact
- **identify deployments by closure path and provenance metadata**
- **roll back by rebuilding/redeploying the earlier declaration**, not by mutating the current closure in place
- on installed systems, **switch** to a staged candidate with `fruix system switch`
- on installed systems, **roll back** to the recorded rollback generation with `fruix system rollback`
- still use declaration/redeploy rollback when the target does not already have the desired closure staged locally
That is the operator-facing workflow Fruix should document and use while installed-system generation switching remains more limited than Guix's mature in-place system-generation workflow.
That is the operator-facing workflow Fruix should document and use while its installed-system generation UX remains simpler than Guix's mature in-place system-generation workflow.