Validate Shepherd PID 1 boot on XCP-ng

This commit is contained in:
2026-04-02 13:44:45 +02:00
parent f5ffd111ee
commit 377a6e49ff
4 changed files with 388 additions and 0 deletions

@@ -2486,3 +2486,58 @@ Next recommended step:
1. try the `shepherd-pid1` image on the real XCP-ng VM
2. if it boots there too, decide whether to keep `shepherd-pid1` as an experimental selectable boot mode or advance it further toward the main Fruix boot path
3. continue reducing the remaining Guile / Shepherd compatibility-prefix shims now that the broader `rc.d` boot-manager dependency has been locally bypassed
## 2026-04-02 — Post-Phase-10: Shepherd-as-PID-1 boot also passed on the real XCP-ng VM
Completed work:
- took the locally validated `shepherd-pid1` boot mode and tested it on the real XCP-ng deployment path
- wrote the follow-up report:
- `docs/reports/postphase10-shepherd-pid1-xcpng-freebsd.md`
- expanded the Shepherd-PID-1 operating-system template so the generated guest remains compatible with both local virtio and the real Xen NIC path:
- `tests/system/phase11-shepherd-pid1-operating-system.scm.in`
- now includes:
- `ifconfig_xn0=SYNCDHCP`
- `ifconfig_em0=SYNCDHCP`
- `ifconfig_vtnet0=SYNCDHCP`
- added a dedicated real-VM Shepherd-PID-1 deployment/validation harness:
- `tests/system/run-phase11-shepherd-pid1-xcpng.sh`
Validation:
- `tests/system/run-phase11-shepherd-pid1-xcpng.sh` now passes on the operator-approved VM and existing VDI:
- VM `90490f2e-e8fc-4b7a-388e-5c26f0157289`
- VDI `0f1f90d3-48ca-4fa2-91d8-fc6339b95743`
- passing run workdir:
- `/tmp/pid1-xcpng-1775129768`
- passing real-guest metadata confirmed:
- `ready_marker=ready`
- `run_current_system_target=/frx/store/2940c952e9d35e47f98fe62f296be2b6ab4fceb3eee8248d6a7823decd42a305-fruix-system-fruix-freebsd`
- `pid1_command=[guile]`
- `shepherd_pid=1`
- `shepherd_socket=present`
- `shepherd_status=running`
- `sshd_status=running`
- `init_mode=shepherd-pid1`
Important findings:
- the local QEMU PID 1 prototype was not a simulator-only artifact; the same general boot design also works on the real XCP-ng/Xen guest
- as expected for a Guile-script entry point, the PID 1 process image shows up as Guile, but the meaningful architectural check is that:
- `/var/run/shepherd.pid` contains `1`
- this means Fruix has now validated two distinct real-VM boot architectures on FreeBSD:
- `freebsd-init+rc.d-shepherd`
- `shepherd-pid1`
- however, this still does not remove the current Guile / Shepherd compatibility-prefix shims; those remain a separate runtime-artifact issue rather than an init-manager issue
Current assessment:
- Shepherd-as-PID-1 is no longer merely a local prototype; it is validated on the real XCP-ng VM as well
- this significantly strengthens the path toward a more Guix-like Fruix system architecture on FreeBSD
- the main remaining native-runtime gap is now the baked-prefix / compatibility-shim problem, not whether Fruix can boot with Shepherd as PID 1
Next recommended step:
1. focus directly on eliminating the remaining Guile / Shepherd compatibility-prefix shims from the guest runtime
2. preserve `shepherd-pid1` as an experimental selectable boot mode while that cleanup proceeds
3. once the runtime-prefix issue is reduced, reassess whether `shepherd-pid1` should replace the older `freebsd-init+rc.d-shepherd` path as the preferred Fruix boot architecture

@@ -0,0 +1,114 @@
# Post-Phase-10: Shepherd-as-PID-1 boot validated on the real XCP-ng FreeBSD VM
Date: 2026-04-02
## Goal
Take the locally validated Shepherd-as-PID-1 Fruix boot prototype and test it on the real operator-approved XCP-ng VM.
Target objects remained the same constrained deployment path used for Phase 9:
- VM: `90490f2e-e8fc-4b7a-388e-5c26f0157289`
- VDI: `0f1f90d3-48ca-4fa2-91d8-fc6339b95743`
The concrete goal for this subphase was to confirm that the new `shepherd-pid1` init mode was not merely a local QEMU curiosity, but could also:
- boot on the real Xen guest,
- reach DHCP and SSH,
- keep Shepherd running as PID 1,
- and still reach the Fruix ready marker.
## Result
The real XCP-ng boot succeeded.
A new deployment/validation harness was added:
- `tests/system/run-phase11-shepherd-pid1-xcpng.sh`
This harness reuses the existing real-VM deployment method:
- build a full-size image matching the existing VDI
- convert it to dynamic VHD
- overwrite the existing VDI
- boot the real VM
- rediscover the guest by MAC/IP
- validate the booted guest over SSH
The new Shepherd-PID-1 image passes that full path.
## Validation
Passing real-VM run:
- `PASS phase11-shepherd-pid1-xcpng`
- workdir: `/tmp/pid1-xcpng-1775129768`
Validated metadata from the real guest:
```text
ready_marker=ready
run_current_system_target=/frx/store/2940c952e9d35e47f98fe62f296be2b6ab4fceb3eee8248d6a7823decd42a305-fruix-system-fruix-freebsd
pid1_command=[guile]
shepherd_pid=1
shepherd_socket=present
shepherd_status=running
sshd_status=running
guest_ip=192.168.213.62
boot_backend=xcp-ng-xo-cli
init_mode=shepherd-pid1
```
The key architectural confirmation is:
- `shepherd_pid=1`
That shows the running Shepherd instance in the real guest is PID 1.
As in the local QEMU prototype, the process image is Guile because Shepherd is launched as a Guile script; however, the service manager itself is the PID 1 process according to Shepherd's own pidfile and control socket state.
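That confirmation can be expressed as a small shell predicate. The following is an illustrative sketch, not the harness's actual check; the function name `is_shepherd_pid1` and its two inputs (the pidfile contents and the PID 1 command name) are made up for this example:

```shell
# Illustrative predicate: given the contents of /var/run/shepherd.pid and the
# PID 1 command name (as reported by `ps -p 1 -o comm=`), decide whether
# Shepherd is PID 1. On this boot path the command name is "guile" because
# Shepherd is launched as a Guile script.
is_shepherd_pid1() {
  pidfile_contents=$1
  pid1_comm=$2
  [ "$pidfile_contents" = 1 ] && [ "$pid1_comm" = guile ]
}

if is_shepherd_pid1 1 guile; then
  echo init_mode=shepherd-pid1
else
  echo init_mode=other
fi
```

Run against the validated guest's values (`1` and `guile`), this prints `init_mode=shepherd-pid1`.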
## What changed to make the real VM pass
The most important refinement after the first local PID 1 work was making the generated activation path more tolerant of immutable store-backed configuration files during very early boot.
Specifically, the generated activation script now treats these as best-effort:
- `cap_mkdb /etc/login.conf`
- `pwd_mkdb -p /etc/master.passwd`
That matters because on the PID 1 path these commands run earlier in boot, and they must not abort the system if the current `/etc` representation is not suitable for in-place database regeneration.
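A minimal sketch of that best-effort pattern (this is not the actual generated activation script; the function name and warning messages are illustrative):

```shell
# Best-effort /etc database regeneration: warn instead of failing the boot
# when /etc is an immutable store-backed representation that cannot be
# regenerated in place.
rebuild_etc_databases() {
  cap_mkdb /etc/login.conf 2>/dev/null \
    || echo "warning: cap_mkdb skipped (unsuitable /etc)" >&2
  pwd_mkdb -p /etc/master.passwd 2>/dev/null \
    || echo "warning: pwd_mkdb skipped (unsuitable /etc)" >&2
  # Always return success so early boot continues even under `set -e`.
  return 0
}
```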
The Shepherd-PID-1 operating-system template was also expanded to keep the NIC configuration broad enough for both local virtio and the real Xen path:
- `ifconfig_xn0=SYNCDHCP`
- `ifconfig_em0=SYNCDHCP`
- `ifconfig_vtnet0=SYNCDHCP`
## Assessment
This is a stronger result than the earlier local-only prototype.
Fruix now has a real deployment-validated FreeBSD boot mode where:
- FreeBSD `init(8)` hands off immediately via `init_exec`
- the generated Fruix launcher performs the minimal bootstrap
- Shepherd becomes PID 1
- networking and SSH still work on the real XCP-ng VM
- and the system still reaches the Fruix ready marker
That means the project has now validated both of these boot architectures on the real VM:
1. `freebsd-init+rc.d-shepherd`
2. `shepherd-pid1`
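For reference, the handoff at the start of the `shepherd-pid1` path relies on FreeBSD's `init_exec` kernel environment variable, which `init(8)` reads at startup and, when set, execs as a replacement for itself as PID 1. A sketch of what the guest's loader configuration conveys (the launcher path shown here is illustrative, not the real generated store path):

```text
# /boot/loader.conf (sketch; Fruix generates the real launcher path)
# init(8) execs the named program, replacing init as PID 1, so the normal
# rc(8) sequence never runs.
init_exec="/frx/store/<system-closure>/libexec/fruix-pid1-launcher"
```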
## Remaining limitations
This does not yet eliminate the current locally built Guile/Shepherd compatibility-prefix shims.
Those shims are still needed because the locally staged runtime artifacts continue to embed historical build prefixes. The current result proves that the broader init/boot-manager dependency can be removed, but it does not yet fully solve the store-native runtime-prefix problem.
## Conclusion
The Shepherd-as-PID-1 Fruix boot mode now works not only under local QEMU/UEFI, but also on the real operator-approved XCP-ng VM.
This substantially strengthens the case that Fruix can move beyond the transitional `rc.d` bridge design and toward a more Guix-like PID-1-centered system architecture on FreeBSD.

@@ -70,6 +70,8 @@
("hostid_enable" . "NO")
("sendmail_enable" . "NONE")
("sshd_enable" . "YES")
("ifconfig_xn0" . "SYNCDHCP")
("ifconfig_em0" . "SYNCDHCP")
("ifconfig_vtnet0" . "SYNCDHCP"))
#:init-mode 'shepherd-pid1
#:ready-marker "/var/lib/fruix/ready"

@@ -0,0 +1,217 @@
#!/bin/sh
# Deploy the shepherd-pid1 Fruix image onto the operator-approved XCP-ng VM
# (overwriting its existing VDI) and validate the booted guest over SSH.
set -eu
repo_root=$(CDPATH= cd -- "$(dirname "$0")/../.." && pwd)
vm_id=90490f2e-e8fc-4b7a-388e-5c26f0157289
os_template=${OS_TEMPLATE:-$repo_root/tests/system/phase11-shepherd-pid1-operating-system.scm.in}
system_name=${SYSTEM_NAME:-phase11-operating-system}
metadata_target=${METADATA_OUT:-}
root_authorized_key_file=${ROOT_AUTHORIZED_KEY_FILE:-$HOME/.ssh/id_ed25519.pub}
root_ssh_private_key_file=${ROOT_SSH_PRIVATE_KEY_FILE:-$HOME/.ssh/id_ed25519}
requested_disk_capacity=${DISK_CAPACITY:-}
cleanup=0
if [ -n "${WORKDIR:-}" ]; then
workdir=$WORKDIR
mkdir -p "$workdir"
else
workdir=$(mktemp -d /tmp/fruix-phase11-xcpng.XXXXXX)
cleanup=1
fi
if [ "${KEEP_WORKDIR:-0}" -eq 1 ]; then
cleanup=0
fi
phase11_os_file=$workdir/phase11-shepherd-pid1-operating-system.scm
phase8_log=$workdir/phase8-system-image.log
phase8_metadata=$workdir/phase8-system-image-metadata.txt
arp_scan_log=$workdir/arp-scan.log
ssh_stdout=$workdir/ssh.out
ssh_stderr=$workdir/ssh.err
metadata_file=$workdir/phase11-shepherd-pid1-xcpng-metadata.txt
vdi_info_json=$workdir/vdi-info.json
vm_info_json=$workdir/vm-info.json
upload_image=$workdir/disk.vhd
cleanup_workdir() {
if [ "$cleanup" -eq 1 ]; then
rm -rf "$workdir"
fi
}
trap cleanup_workdir EXIT INT TERM
[ -f "$root_authorized_key_file" ] || {
echo "missing root authorized key file: $root_authorized_key_file" >&2
exit 1
}
[ -f "$root_ssh_private_key_file" ] || {
echo "missing root SSH private key file: $root_ssh_private_key_file" >&2
exit 1
}
root_authorized_key=$(tr -d '\n' < "$root_authorized_key_file")
xo-cli list-objects id=$vm_id >"$vm_info_json"
# The target VDI is the non-CD disk attached at VBD position 0 (the boot disk).
vdi_id=$(xo-cli list-objects type=VBD | jq -r '.[] | select(.VM=="'$vm_id'" and .is_cd_drive==false and .position=="0") | .VDI' | head -n 1)
[ -n "$vdi_id" ] || { echo "failed to discover target VDI for VM $vm_id" >&2; exit 1; }
xo-cli list-objects type=VDI | jq '[.[] | select(.id=="'$vdi_id'")]' >"$vdi_info_json"
vdi_size=$(jq -r '.[0].size' "$vdi_info_json")
[ -n "$vdi_size" ] || { echo "failed to discover VDI size for $vdi_id" >&2; exit 1; }
if [ -n "$requested_disk_capacity" ] && [ "$requested_disk_capacity" != "$vdi_size" ]; then
echo "existing XCP-ng import path requires an image that matches the target VDI size; use DISK_CAPACITY=$vdi_size or leave it unset" >&2
exit 1
fi
disk_capacity=$vdi_size
requested_disk_bytes=$vdi_size
sed "s|__ROOT_AUTHORIZED_KEY__|$root_authorized_key|g" "$os_template" > "$phase11_os_file"
KEEP_WORKDIR=1 WORKDIR=$workdir/phase8-build OS_FILE=$phase11_os_file SYSTEM_NAME=$system_name DISK_CAPACITY=$disk_capacity \
METADATA_OUT=$phase8_metadata "$repo_root/tests/system/run-phase8-system-image.sh" \
>"$phase8_log" 2>&1
disk_image=$(sed -n 's/^disk_image=//p' "$phase8_metadata")
closure_path=$(sed -n 's/^closure_path=//p' "$phase8_metadata")
closure_base=$(basename "$closure_path")
raw_sha256=$(sed -n 's/^raw_sha256=//p' "$phase8_metadata")
image_store_path=$(sed -n 's/^image_store_path=//p' "$phase8_metadata")
command -v qemu-img >/dev/null 2>&1 || {
echo "qemu-img is required to convert the raw Fruix image to XCP-ng-compatible VHD" >&2
exit 1
}
# XCP-ng's disk-import path expects VHD; force_size keeps the virtual size
# equal to the raw image size instead of rounding to CHS geometry.
qemu-img convert -f raw -O vpc -o subformat=dynamic,force_size=on "$disk_image" "$upload_image"
upload_sha256=$(sha256 -q "$upload_image")
upload_size_bytes=$(stat -f '%z' "$upload_image")
xo-cli vm.stop id=$vm_id force=true >/dev/null 2>&1 || true
xo-cli disk.importContent id=$vdi_id @=$upload_image >"$workdir/disk-import.out"
xo-cli vm.setBootOrder vm=$vm_id order=dcn >"$workdir/set-boot-order.out"
xo-cli vm.start id=$vm_id >"$workdir/vm-start.out"
vm_mac=$(jq -r '.[0].VIFs[0]' "$vm_info_json")
if [ -n "$vm_mac" ] && [ "$vm_mac" != null ]; then
vm_mac=$(xo-cli list-objects type=VIF | jq -r '.[] | select(.id=="'$vm_mac'") | .MAC' | tr 'A-Z' 'a-z')
else
vm_mac=
fi
host_interface=$(route -n get default | awk '/interface:/{print $2; exit}')
host_ip=$(ifconfig "$host_interface" | awk '/inet /{print $2; exit}')
subnet_prefix=${host_ip%.*}
ssh_guest() {
ssh -i "$root_ssh_private_key_file" \
-o BatchMode=yes \
-o StrictHostKeyChecking=no \
-o UserKnownHostsFile=/dev/null \
-o ConnectTimeout=5 \
root@"$guest_ip" "$@"
}
guest_ip=
# Rediscover the guest: ping-sweep the host's /24 to populate the ARP cache,
# then look up the VM's MAC in the ARP table (up to 90 attempts, 5 s apart).
for attempt in $(jot 90 1 90); do
: >"$arp_scan_log"
for host in $(jot 254 1 254); do
ip=$subnet_prefix.$host
(
ping -c 1 -W 1000 "$ip" >/dev/null 2>&1 && echo "$ip" >>"$arp_scan_log"
) &
done
wait
if [ -n "$vm_mac" ]; then
guest_ip=$(arp -an | awk -v mac="$vm_mac" 'tolower($4)==mac {gsub(/[()]/,"",$2); print $2; exit}')
fi
if [ -n "$guest_ip" ]; then
if ssh -i "$root_ssh_private_key_file" \
-o BatchMode=yes \
-o StrictHostKeyChecking=no \
-o UserKnownHostsFile=/dev/null \
-o ConnectTimeout=3 \
root@"$guest_ip" 'test -f /var/lib/fruix/ready' >"$ssh_stdout" 2>"$ssh_stderr"; then
break
fi
fi
sleep 5
done
[ -n "$guest_ip" ] || {
echo "guest IP was not discovered; manual console inspection is likely required" >&2
exit 1
}
ready_marker=$(ssh_guest 'cat /var/lib/fruix/ready')
run_current_system_target=$(ssh_guest 'readlink /run/current-system')
pid1_command=$(ssh_guest 'ps -p 1 -o command= | sed "s/^ *//"')
shepherd_pid=$(ssh_guest 'cat /var/run/shepherd.pid')
shepherd_socket=$(ssh_guest 'test -S /var/run/shepherd.sock && echo present || echo missing')
shepherd_status=$(ssh_guest 'test -f /var/run/shepherd.pid && kill -0 "$(cat /var/run/shepherd.pid)" >/dev/null 2>&1 && echo running || echo stopped')
logger_log=$(ssh_guest 'cat /var/log/fruix-shepherd.log' | tr '\n' ' ')
sshd_status=$(ssh_guest 'service sshd onestatus >/dev/null 2>&1 && echo running || echo stopped')
uname_output=$(ssh_guest 'uname -sr')
operator_home_listing=$(ssh_guest 'ls -d /home/operator')
activate_preview=$(ssh_guest 'head -n 5 /run/current-system/activate' | tr '\n' ' ')
[ "$ready_marker" = ready ] || { echo "unexpected ready marker contents: $ready_marker" >&2; exit 1; }
[ "$shepherd_pid" = 1 ] || { echo "shepherd is not PID 1: pid=$shepherd_pid command=$pid1_command" >&2; exit 1; }
[ "$shepherd_socket" = present ] || { echo "shepherd socket is missing" >&2; exit 1; }
[ "$shepherd_status" = running ] || { echo "shepherd is not running" >&2; exit 1; }
[ "$sshd_status" = running ] || { echo "sshd is not running" >&2; exit 1; }
[ "$run_current_system_target" = "/frx/store/$closure_base" ] || {
echo "unexpected /run/current-system target in guest: $run_current_system_target" >&2
exit 1
}
[ "$operator_home_listing" = /home/operator ] || { echo "operator home missing" >&2; exit 1; }
cat >"$metadata_file" <<EOF
workdir=$workdir
vm_id=$vm_id
vdi_id=$vdi_id
vdi_size=$vdi_size
disk_capacity=$disk_capacity
requested_disk_capacity=${requested_disk_capacity:-<auto>}
requested_disk_bytes=$requested_disk_bytes
phase11_os_file=$phase11_os_file
phase8_log=$phase8_log
phase8_metadata=$phase8_metadata
image_store_path=$image_store_path
disk_image=$disk_image
upload_image=$upload_image
upload_format=vhd-dynamic
upload_sha256=$upload_sha256
upload_size_bytes=$upload_size_bytes
closure_path=$closure_path
closure_base=$closure_base
raw_sha256=$raw_sha256
guest_ip=$guest_ip
vm_mac=$vm_mac
ready_marker=$ready_marker
run_current_system_target=$run_current_system_target
pid1_command=$pid1_command
shepherd_pid=$shepherd_pid
shepherd_socket=$shepherd_socket
shepherd_status=$shepherd_status
sshd_status=$sshd_status
logger_log=$logger_log
uname_output=$uname_output
operator_home_listing=$operator_home_listing
activate_preview=$activate_preview
boot_backend=xcp-ng-xo-cli
init_mode=shepherd-pid1
operator_access=ssh-root-key
root_authorized_key_file=$root_authorized_key_file
root_ssh_private_key_file=$root_ssh_private_key_file
EOF
if [ -n "$metadata_target" ]; then
mkdir -p "$(dirname "$metadata_target")"
cp "$metadata_file" "$metadata_target"
fi
printf 'PASS phase11-shepherd-pid1-xcpng\n'
printf 'Work directory: %s\n' "$workdir"
printf 'Metadata file: %s\n' "$metadata_file"
if [ -n "$metadata_target" ]; then
printf 'Copied metadata to: %s\n' "$metadata_target"
fi
printf '%s\n' '--- metadata ---'
cat "$metadata_file"