Phase 9 checkpoint: XCP-ng boot reached DHCP and SSH on FreeBSD
Date: 2026-04-02
Goal
Advance Phase 9 from a static image-generation milestone to a real booted Fruix guest on the active FreeBSD/XCP-ng track, using the operator-approved VM:
- VM: 90490f2e-e8fc-4b7a-388e-5c26f0157289
- existing target VDI: 0f1f90d3-48ca-4fa2-91d8-fc6339b95743
The immediate objective for this checkpoint was narrower than full Phase 9 completion:
- boot the generated image under XCP-ng,
- obtain DHCP,
- and reach SSH access with the injected root key.
Summary
This checkpoint succeeded.
The current Fruix FreeBSD image now:
- boots on the target XCP-ng VM,
- mounts the generated root filesystem,
- completes enough of FreeBSD rc startup to configure networking,
- obtains a DHCP lease on the Xen NIC,
- starts sshd,
- and accepts root public-key authentication over the network.
Validated guest details from the successful XCP-ng boot:
- guest IP: 192.168.213.62
- hostname: fruix-freebsd
- kernel string: FreeBSD 15.0-STABLE stable/15-n282801-29dce45d8c50 GENERIC amd64
Representative successful SSH validation output:
FreeBSD fruix-freebsd 15.0-STABLE FreeBSD 15.0-STABLE stable/15-n282801-29dce45d8c50 GENERIC amd64
fruix-freebsd
192.168.213.62
Successful XCP-ng work directory:
/tmp/phase9-xcpng-ssh-1775097470
Important boot/debugging findings
The first decisive breakthrough came from running the generated image locally under QEMU/TCG with serial capture. That made the previously opaque early-boot failure visible.
1. The original early boot abort was no longer an XCP-ng image-format problem
After the earlier switch from raw uploads to dynamic VHD uploads, the remaining boot failure was inside the guest boot process, not in the XO import path.
2. FreeBSD fstab handling for pseudo-filesystems was wrong
The serial log showed that boot aborted during filesystem checks because the generated fstab gave non-zero fsck fields to non-UFS mounts such as devfs.
Representative failure:
Starting file system checks:
/dev/gpt/fruix-root: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/gpt/fruix-root: clean, ...
fsck: exec fsck_devfs for devfs in /sbin:/usr/sbin: No such file or directory
Unknown error 1; help!
ERROR: ABORTING BOOT (sending SIGTERM to parent)!
The fix was to generate fsck pass fields only for UFS entries and emit 0 0 for pseudo-filesystems.
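The rule can be sketched as follows. This is a minimal Python illustration, not the project's Scheme code: `FstabEntry`, `fsck_fields`, and `render_fstab` are hypothetical names, and the specific non-zero values used for UFS (`1 1` for the root filesystem, `2 2` otherwise) are an assumption based on common FreeBSD fstab conventions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FstabEntry:
    device: str
    mount_point: str
    fs_type: str
    options: str = "rw"


def fsck_fields(entry: FstabEntry) -> Tuple[int, int]:
    """Return the (dump, pass) fields for one fstab entry.

    Only UFS entries get a non-zero pass number. A non-zero pass on a
    pseudo-filesystem makes fsck look for a helper such as fsck_devfs,
    which is the failure seen in the serial log.
    """
    if entry.fs_type != "ufs":
        return (0, 0)
    # Assumed convention: root is checked first, other UFS filesystems later.
    return (1, 1) if entry.mount_point == "/" else (2, 2)


def render_fstab(entries: List[FstabEntry]) -> str:
    """Render entries as tab-separated fstab lines."""
    lines = []
    for e in entries:
        dump, passno = fsck_fields(e)
        lines.append(
            f"{e.device}\t{e.mount_point}\t{e.fs_type}\t{e.options}\t{dump}\t{passno}"
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_fstab([
        FstabEntry("/dev/gpt/fruix-root", "/", "ufs"),
        FstabEntry("devfs", "/dev", "devfs"),
    ]), end="")
```

Keeping the decision in one small function means any new pseudo-filesystem entry defaults to `0 0` unless it is explicitly UFS.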
3. The minimal image was still missing many base files and commands expected by rc
Once rc ran further, QEMU serial logs exposed a long tail of missing runtime pieces that had not been visible from the earlier static validations alone.
Examples included:
- missing base commands: dd, expr, rmdir, sort, mktemp, egrep, fsync, kldload, kldstat, devfs, devctl, newsyslog, ip6addrctl
- missing base config files: /etc/network.subr, /etc/devd.conf, /etc/newsyslog.conf, /etc/syslog.conf
- missing runtime directories: /var/db, /var/cron
- missing libraries needed by later boot helpers: libgeom.so.5, libdevctl.so.5, libcap_net.so.1, and the C++ runtime pieces used by devd
These were staged into the current FreeBSD package layer and linked into the generated rootfs.
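A static check of this kind could catch such gaps before a boot attempt. The sketch below is illustrative only: `missing_paths` and `REQUIRED_PATHS` are hypothetical names, the list contains only the config files and directories named above, and it is not the project's actual validation code.

```python
import os
import tempfile
from typing import List

# Paths taken from the findings above; illustrative, not exhaustive.
REQUIRED_PATHS = [
    "etc/network.subr",
    "etc/devd.conf",
    "etc/newsyslog.conf",
    "etc/syslog.conf",
    "var/db",
    "var/cron",
]


def missing_paths(rootfs: str, required: List[str]) -> List[str]:
    """Return the required paths (relative to rootfs) that are absent.

    os.path.lexists is used so that a symlink counts as present even if
    its target only resolves inside the booted guest.
    """
    return [p for p in required if not os.path.lexists(os.path.join(rootfs, p))]


if __name__ == "__main__":
    # Demonstrate against a throwaway tree that has only part of the set.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "etc"))
        os.makedirs(os.path.join(root, "var", "db"))
        open(os.path.join(root, "etc", "devd.conf"), "w").close()
        for path in missing_paths(root, REQUIRED_PATHS):
            print("missing:", path)
```

A check like this only confirms that files exist; it cannot confirm that rc can actually use them, which is why the serial-console boot remained the deciding test.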
4. SSH auth initially failed because the image relied on PAM without a complete PAM runtime/configuration
sshd would start, but root public-key authentication still failed. A direct in-guest debug run showed:
PAM: initialisation failed
For the minimal Phase 9 guest, the practical fix was to make the generated sshd_config use:
UsePAM no
while still keeping key-only login enabled.
That was sufficient to unlock real SSH access on both the local QEMU debug guest and the XCP-ng guest.
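As a rough illustration of what a key-only configuration without PAM might contain, the Python sketch below renders and re-parses a small sshd_config. Only `UsePAM no` is confirmed by this checkpoint; the other directives are standard OpenSSH keywords chosen here as assumptions, and `render_sshd_config` and `parse_sshd_config` are hypothetical helper names.

```python
from typing import Dict

# Only "UsePAM no" comes from the checkpoint notes. The remaining
# directives are assumed, standard OpenSSH settings for key-only login.
KEY_ONLY_SETTINGS = {
    "UsePAM": "no",
    "PubkeyAuthentication": "yes",
    "PasswordAuthentication": "no",
    "KbdInteractiveAuthentication": "no",
    "PermitRootLogin": "prohibit-password",
}


def render_sshd_config(settings: Dict[str, str]) -> str:
    """Render settings as 'Keyword value' lines."""
    return "".join(f"{key} {value}\n" for key, value in settings.items())


def parse_sshd_config(text: str) -> Dict[str, str]:
    """Parse 'Keyword value' lines, skipping blanks and comments.

    Keywords are case-insensitive, and sshd uses the first value it
    obtains for a keyword, so later duplicates are ignored here too.
    """
    parsed: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or parts[0].startswith("#"):
            continue
        parsed.setdefault(parts[0].lower(), parts[1].strip())
    return parsed


if __name__ == "__main__":
    config = render_sshd_config(KEY_ONLY_SETTINGS)
    print(config, end="")
    assert parse_sshd_config(config)["usepam"] == "no"
```

Re-parsing the rendered text is a cheap way to assert in a test that PAM stays disabled while password logins remain off.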
Current code-level outcomes
The current checkpoint work materially expanded the minimal FreeBSD runtime staged into Fruix images.
Highlights:
- modules/fruix/packages/freebsd.scm
  - added dedicated runtime packages for:
    - freebsd-networking
    - freebsd-openssh
  - expanded staged base runtime coverage substantially for rc, networking, and SSH
  - added required config files and shared libraries used during real boot
- modules/fruix/system/freebsd.scm
  - added root authorized-key support to the operating-system model
  - generated static account databases and supporting files:
    - /etc/passwd
    - /etc/master.passwd
    - /etc/group
    - /etc/login.conf
    - /etc/ttys
  - activation now runs:
    - cap_mkdb
    - pwd_mkdb
  - activation creates required directories and SSH host keys
  - generated sshd_config now disables PAM for the current minimal key-only Phase 9 path
  - fstab generation now avoids fsck pass numbers for pseudo-filesystems
  - rootfs generation now links the additional /etc files needed by real boot
- tests/system/phase9-minimal-operating-system.scm.in
  - enables DHCP on the relevant NIC names for the current tracks:
    - xn0
    - em0
    - vtnet0
  - injects the root authorized key
  - includes the SSH/network runtime packages and required system users/groups
- tests/system/run-phase8-system-image.sh
  - now accepts OS_FILE
  - now accepts/passes DISK_CAPACITY
  - serial-console validation was relaxed from an exact loader string to a comconsole presence check
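For the DHCP item above, stock FreeBSD expresses per-interface DHCP in rc.conf as `ifconfig_<nic>="DHCP"`. Whether Fruix emits exactly these lines is not stated in this checkpoint, so the Python sketch below is an assumption about the generated output; `dhcp_rc_conf_lines` is a hypothetical helper name.

```python
import re
from typing import Iterable, List

# NIC names listed for the current tracks in this checkpoint.
PHASE9_NICS = ["xn0", "em0", "vtnet0"]

# Accept only simple driver-plus-unit names such as "xn0" or "vtnet0".
_NIC_NAME = re.compile(r"[a-z][a-z0-9_]*[0-9]+")


def dhcp_rc_conf_lines(nics: Iterable[str]) -> List[str]:
    """Return one rc.conf line per interface that requests DHCP.

    Interfaces absent from the running guest are expected to be
    harmless extras, so all candidate names can be listed together.
    """
    lines = []
    for nic in nics:
        if not _NIC_NAME.fullmatch(nic):
            raise ValueError(f"unexpected interface name: {nic!r}")
        lines.append(f'ifconfig_{nic}="DHCP"')
    return lines


if __name__ == "__main__":
    print("\n".join(dhcp_rc_conf_lines(PHASE9_NICS)))
```

Listing all three names lets the same image be exercised under Xen (xn0) and under local emulation (em0, vtnet0) without per-target configuration.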
Verified current state
The current validated Phase 9 state is:
- XCP-ng VHD upload path works against the existing VDI
- the guest boots far enough for normal rc networking and sshd
- DHCP works on the Xen NIC
- SSH key injection works
- root login over SSH works
This means the project has crossed an important Phase 9 boundary:
- the first boot validation no longer depends on local bhyve serial automation,
- and the real XCP-ng target can now be exercised over the network.
Remaining blocker
Phase 9 is not complete yet because the Fruix-specific readiness path still fails.
Current remaining blocker:
- Guile still crashes in the guest
- therefore fruix-shepherd does not start
- therefore /var/lib/fruix/ready is still absent
Representative guest evidence:
pid 262 (guile), jid 0, uid 0: exited on signal 11 (core dumped)
Over SSH on the real XCP-ng guest:
- sshd is running
- DHCP is active
- fruix-shepherd is stopped
- /var/lib/fruix/ready is missing
A retrieved core dump and local lldb analysis show the Guile crash occurs extremely early during initialization, in the locale/string conversion path while building Guile load/build info. This remains the next debugging target.
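While this is being debugged, kernel messages of the form shown in the guest evidence can be picked out of a console or dmesg capture automatically. The Python sketch below is a hypothetical helper, not part of the Fruix test suite; the pattern is derived only from the single message quoted above and may not cover every variant the kernel prints.

```python
import re
from typing import List, NamedTuple, Optional


class Crash(NamedTuple):
    pid: int
    command: str
    signal: int
    core_dumped: bool


# Derived from: "pid 262 (guile), jid 0, uid 0: exited on signal 11 (core dumped)"
CRASH_RE = re.compile(
    r"pid (\d+) \(([^)]+)\), jid \d+, uid \d+: "
    r"exited on signal (\d+)( \(core dumped\))?"
)


def find_crashes(log_text: str, command: Optional[str] = None) -> List[Crash]:
    """Return every signal exit in log_text, optionally for one command."""
    crashes = []
    for m in CRASH_RE.finditer(log_text):
        crash = Crash(int(m.group(1)), m.group(2), int(m.group(3)),
                      m.group(4) is not None)
        if command is None or crash.command == command:
            crashes.append(crash)
    return crashes


if __name__ == "__main__":
    sample = "pid 262 (guile), jid 0, uid 0: exited on signal 11 (core dumped)\n"
    for crash in find_crashes(sample, command="guile"):
        print(crash)
```

A boot test could fail early on any `guile` match instead of waiting for /var/lib/fruix/ready to time out.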
Assessment
This checkpoint satisfies a meaningful Phase 9 intermediate milestone on the active FreeBSD/XCP-ng track:
- the generated Fruix image now boots as a network-reachable FreeBSD guest,
- and minimal operator access via SSH is working.
However, the full Fruix boot milestone is still blocked by in-guest Guile/Shepherd failure, so the overall Phase 9 milestone remains open.