Post-Phase-10: Shepherd-as-PID-1 boot validated on the real XCP-ng FreeBSD VM
Date: 2026-04-02
Goal
Take the locally validated Shepherd-as-PID-1 Fruix boot prototype and test it on the real operator-approved XCP-ng VM.
Target objects remained the same constrained deployment path used for Phase 9:
- VM: 90490f2e-e8fc-4b7a-388e-5c26f0157289
- VDI: 0f1f90d3-48ca-4fa2-91d8-fc6339b95743
The concrete goal for this subphase was to confirm that the new shepherd-pid1 init mode was not merely a local QEMU curiosity, but could also:
- boot on the real Xen guest,
- reach DHCP and SSH,
- keep Shepherd running as PID 1,
- and still reach the Fruix ready marker.
Result
The real XCP-ng boot succeeded.
A new deployment/validation harness was added:
tests/system/run-phase11-shepherd-pid1-xcpng.sh
This harness reuses the existing real-VM deployment method:
- build a full-size image matching the existing VDI
- convert it to dynamic VHD
- overwrite the existing VDI
- boot the real VM
- rediscover the guest by MAC/IP
- validate the booted guest over SSH
The new Shepherd-PID-1 image passes that full path.
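The source does not show how the harness rediscovers the guest, so the following is only a minimal sketch of one plausible approach: matching the VM's MAC address against `arp -an`-style output. The function name `ip_for_mac` and the MAC address in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch only: the real harness may rediscover the guest differently.

# ip_for_mac MAC
# Reads `arp -an`-style lines on stdin, for example
#   ? (192.168.213.62) at 00:16:3e:12:34:56 on em0 expires in 1200 seconds [ethernet]
# and prints the IP address of the first entry whose MAC matches.
ip_for_mac() {
    awk -v mac="$1" '
        tolower($4) == tolower(mac) {
            ip = $2
            gsub(/[()]/, "", ip)   # strip the parentheses around the address
            print ip
            exit
        }'
}
```

On the deployment host this could be used as `arp -an | ip_for_mac 00:16:3e:12:34:56`, retried until the guest has completed DHCP and an address is printed.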
Validation
Passing real-VM run:
PASS phase11-shepherd-pid1-xcpng
- workdir: /tmp/pid1-xcpng-1775129768
Validated metadata from the real guest:
ready_marker=ready
run_current_system_target=/frx/store/2940c952e9d35e47f98fe62f296be2b6ab4fceb3eee8248d6a7823decd42a305-fruix-system-fruix-freebsd
pid1_command=[guile]
shepherd_pid=1
shepherd_socket=present
shepherd_status=running
sshd_status=running
guest_ip=192.168.213.62
boot_backend=xcp-ng-xo-cli
init_mode=shepherd-pid1
The key architectural confirmation is:
shepherd_pid=1
That shows the running Shepherd instance in the real guest is PID 1.
As in the local QEMU prototype, the process image is Guile because Shepherd is launched as a Guile script; however, the service manager itself is the PID 1 process according to Shepherd's own pidfile and control socket state.
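The metadata above is a flat key=value listing, so the PID 1 checks can be expressed as a few comparisons. This is a minimal sketch, assuming the metadata has been saved to a file on the validating host; the function names are hypothetical and are not taken from the actual harness.

```shell
#!/bin/sh
# Sketch only: helper names and the metadata file layout are assumptions.

# metadata_value FILE KEY
# Prints the value of the first "KEY=value" line in FILE.
metadata_value() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# expect_metadata FILE KEY EXPECTED
# Succeeds only if KEY is present in FILE with exactly the EXPECTED value.
expect_metadata() {
    actual=$(metadata_value "$1" "$2")
    if [ "$actual" = "$3" ]; then
        echo "ok   $2=$actual"
    else
        echo "FAIL $2: expected '$3', got '$actual'" >&2
        return 1
    fi
}

# validate_pid1_metadata FILE
# Runs the checks that matter for the shepherd-pid1 init mode and
# stops at the first mismatch.
validate_pid1_metadata() {
    expect_metadata "$1" ready_marker ready &&
    expect_metadata "$1" init_mode shepherd-pid1 &&
    expect_metadata "$1" shepherd_pid 1 &&
    expect_metadata "$1" shepherd_socket present &&
    expect_metadata "$1" shepherd_status running &&
    expect_metadata "$1" sshd_status running
}
```

Note that `pid1_command` is deliberately not compared against a Shepherd-specific name, since the process image is reported as Guile.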
What changed to make the real VM pass
The most important refinement after the first local PID 1 work was making the generated activation path more tolerant of immutable store-backed configuration files during very early boot.
Specifically, the generated activation script now treats these as best-effort:
cap_mkdb /etc/login.conf
pwd_mkdb -p /etc/master.passwd
That matters because on the PID 1 path these commands run earlier in boot, and a failure should not abort startup when the current /etc representation is not suitable for in-place database regeneration.
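The generated activation script itself is not shown in this report. As a minimal sketch of what "best-effort" could look like in shell, a small wrapper can run a command, log a warning on failure, and always return success. The `best_effort` name and the warning text are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: not the actual generated activation code.

# best_effort CMD [ARGS...]
# Runs the command; if it fails, prints a warning to stderr and still
# returns 0 so that a surrounding `set -e` script keeps going.
best_effort() {
    if ! "$@"; then
        echo "warning: '$*' failed; continuing boot" >&2
    fi
    return 0
}

# Hypothetical placement in an activation script:
#   best_effort cap_mkdb /etc/login.conf
#   best_effort pwd_mkdb -p /etc/master.passwd
```

The trade-off is that a genuinely broken login or password database is only reported, not treated as fatal, so the warning should end up somewhere the operator will see it.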
The Shepherd-PID-1 operating-system template was also expanded to keep the NIC configuration broad enough for both local virtio and the real Xen path:
ifconfig_xn0=SYNCDHCP
ifconfig_em0=SYNCDHCP
ifconfig_vtnet0=SYNCDHCP
Assessment
This is a stronger result than the earlier local-only prototype.
Fruix now has a real deployment-validated FreeBSD boot mode where:
- FreeBSD init(8) hands off immediately via init_exec
- the generated Fruix launcher performs the minimal bootstrap
- Shepherd becomes PID 1
- networking and SSH still work on the real XCP-ng VM
- and the system still reaches the Fruix ready marker
That means the project has now validated both of these boot architectures on the real VM:
- freebsd-init+rc.d-shepherd
- shepherd-pid1
Remaining limitations
This does not yet eliminate the current locally built Guile/Shepherd compatibility-prefix shims.
Those shims are still needed because the locally staged runtime artifacts continue to embed historical build prefixes. The current result proves that the broader init/boot-manager dependency can be removed, but it does not yet fully solve the store-native runtime-prefix problem.
Conclusion
The Shepherd-as-PID-1 Fruix boot mode now works not only under local QEMU/UEFI, but also on the real operator-approved XCP-ng VM.
This substantially strengthens the case that Fruix can move beyond the transitional rc.d bridge design and toward a more Guix-like PID-1-centered system architecture on FreeBSD.