Prototype FreeBSD jail build isolation

2026-04-01 11:40:43 +02:00
parent 7a6ce6b983
commit 7621798ef5
3 changed files with 429 additions and 0 deletions
--- a/docs/PROGRESS.md
+++ b/docs/PROGRESS.md
@@ -833,3 +833,66 @@ Next recommended step:
 2. carry forward the current concrete runtime blocker from Phase 1.2:
  - investigate the `leave-on-EPIPE` failure in `./pre-inst-env guix --version`
 3. continue keeping `/frx/store` as the intended experimental store root and keep `~/repos/bdwgc` in reserve if later FreeBSD-specific GC/thread issues appear
+
+## 2026-04-01 — Phase 2.1 completed: jail-first build isolation design validated on FreeBSD
+
+Completed work:
+
+- added a runnable jail-based build isolation prototype:
+  - `tests/daemon/run-freebsd-jail-build-prototype.sh`
+- wrote the Phase 2.1 design/prototype report:
+  - `docs/reports/phase2-freebsd-jail-build-isolation.md`
+- translated the earlier Phase 1 syscall mapping into a concrete FreeBSD Guix-daemon isolation design centered on:
+  - thin jails
+  - one jail per build
+  - explicit `nullfs` mount plans
+  - networking disabled by default
+  - separate build-user credentials inside the jail envelope
+- ran the jail prototype successfully and captured metadata under:
+  - `/tmp/jail-build-metadata.txt`
+
+Important findings:
+
+- a thin-jail approach is the right match for Guix's declared-input model; thick jails would overexpose ambient host state and add unnecessary duplication
+- a per-build jail root assembled from explicit read-only `nullfs` mounts is a practical replacement for the Linux bind-mount + mount-namespace model in current Guix daemon code
+- a basic build operation can already be executed successfully inside a FreeBSD jail with a restricted filesystem view consisting only of:
+  - selected read-only host toolchain paths
+  - a read-only declared input directory
+  - a writable declared output directory
+  - a writable `/tmp`
+- a host sentinel file left outside the jail root is not visible inside the build environment, which is consistent with the build seeing only the assembled jail root rather than the host filesystem
+- the prototype jail ran with:
+  - `ip4=disable`
+  - `ip6=disable`
+  which matches the intended default for hermetic builds
+
+Current assessment:
+
+- Phase 2.1 is now satisfied on the current FreeBSD prototype track
+- the main design decision is now concrete rather than speculative:
+  - the Guix FreeBSD daemon path should be jail-first, not Linux-namespace emulation
+- the next step is to add a build-user privilege-dropping prototype inside or alongside this jail model so the design covers both containment and user-level isolation
+
+Recent commits:
+
+- `e380e88` — `Add FreeBSD Guile verification harness`
+- `cd721b1` — `Update progress after Guile verification`
+- `27916cb` — `Diagnose Guile subprocess crash on FreeBSD`
+- `02f7a7f` — `Validate local Guile fix on FreeBSD`
+- `4aebea4` — `Add native GNU Hello FreeBSD build harness`
+- `c944cdb` — `Validate Guix builder phases on FreeBSD`
+- `0a2e48e` — `Validate GNU which builder phases on FreeBSD`
+- `245a47d` — `Document gaps to real Guix FreeBSD builds`
+- `d62e9b0` — `Investigate Guix derivation generation on FreeBSD`
+- `c0a85ed` — `Build local Guile-GnuTLS on FreeBSD`
+- `15b9037` — `Build local Guile-Git on FreeBSD`
+- `47d31e8` — `Build local Guile-JSON on FreeBSD`
+- `d82195b` — `Advance Guix checkout on FreeBSD`
+- `9bf3d30` — `Document FreeBSD syscall mapping`
+
+Next recommended step:
+
+1. implement the Phase 2.2 privilege-dropping/build-user prototype for FreeBSD, ideally combined with the new jail execution model
+2. then establish a `/frx/store`-based store-management prototype covering permissions, package readability, and garbage-collection behavior
+3. continue carrying the separate Guix checkout runtime blocker:
+  - investigate the `leave-on-EPIPE` failure in `./pre-inst-env guix --version`
--- a/docs/reports/phase2-freebsd-jail-build-isolation.md
+++ b/docs/reports/phase2-freebsd-jail-build-isolation.md
@@ -0,0 +1,191 @@
+# Phase 2.1: FreeBSD jail-based build isolation design and prototype
+
+Date: 2026-04-01
+
+## Summary
+
+This step turns the Phase 1 syscall/interface mapping into a concrete FreeBSD build-isolation design for the Guix daemon and validates the core idea with a runnable prototype.
+
+Added file:
+
+- `tests/daemon/run-freebsd-jail-build-prototype.sh`
+
+The prototype demonstrates a single build operation inside a FreeBSD jail with restricted filesystem visibility.  The jail only sees:
+
+- a read-only host toolchain slice mounted with `nullfs`
+- a read-only declared input directory
+- a writable declared output directory
+- a writable `/tmp`
+
+A host-side sentinel file intentionally left outside the jail root is confirmed to be invisible inside the build environment.
+
+## Design decisions
+
+### 1. Use thin jails, not thick jails, for per-build isolation
+
+Chosen model: **thin jail per build**.
+
+Reasoning:
+
+- Guix already conceptualizes builds as seeing a minimal set of declared inputs rather than an entire copied system image.
+- Thick jails would duplicate too much base-system state and blur the distinction between declared and ambient inputs.
+- Thin jails fit better with the Guix model if combined with explicit `nullfs` mounts for:
+  - declared store inputs
+  - declared build tools
+  - writable build/output directories
+
+Implication:
+
+- the jail root is a synthetic filesystem view assembled per build
+- the host base system remains outside the jail and is selectively re-exposed read-only where needed
+
+### 2. Replace Linux bind mounts and mount namespaces with `nullfs` mount plans
+
+Current Guix daemon code relies heavily on Linux mount namespace behavior and bind mounts.
+
+On FreeBSD, the closest practical replacement is:
+
+- create a per-build jail root directory
+- expose only required host paths using `mount_nullfs`
+- mount declared inputs read-only
+- mount writable build scratch and output paths explicitly
+- avoid reliance on `pivot_root`, `unshare`, or `setns`, which are Linux-specific and not available on FreeBSD
+
+This is a different implementation strategy, but it preserves the key Guix property that build environments should only see explicitly declared filesystem inputs.
+
+### 3. Use one jail per build
+
+Chosen model: **one jail per build job**.
+
+Reasoning:
+
+- it provides the clearest conceptual mapping from “one derivation build” to “one isolated execution environment”
+- it avoids complex state bleed between builds
+- it aligns well with the later need to associate a specific build user, writable scratch directory, and mount plan with one job
+- it makes cleanup straightforward: remove the jail, unmount paths, collect temporary roots
+
+### 4. Disable networking by default
+
+Chosen model: **network disabled unless a build explicitly requires it**.
+
+Prototype settings:
+
+- `ip4=disable`
+- `ip6=disable`
+
+Reasoning:
+
+- this matches Guix expectations for hermetic builds more closely than inheriting host networking
+- if future fixed-output or fetch-like builds require networking, that should be an explicit opt-in policy decision
+- if stronger network virtualization is later needed, VNET jails are the natural FreeBSD-side extension point
+
+### 5. Keep build users separate from jail identity
+
+A jail is the isolation envelope; the build user remains the privilege identity inside that envelope.
+
+This means the eventual FreeBSD daemon design should combine:
+
+- **per-build jail** for filesystem/process/network scoping
+- **per-build user credentials** for ownership and write restrictions
+- **read-only store mounts** plus explicit writable scratch/output mounts
+
+This avoids conflating “container boundary” with “user identity”.
+
+## Prototype implementation
+
+Run command:
+
+```sh
+METADATA_OUT=/tmp/jail-build-metadata.txt \
+./tests/daemon/run-freebsd-jail-build-prototype.sh
+```
+
+What the prototype does:
+
+1. creates a temporary jail root
+2. creates a minimal read-only host toolchain view with `nullfs` mounts for:
+   - `/bin`
+   - `/lib`
+   - `/libexec`
+   - `/usr/bin`
+   - `/usr/include`
+   - `/usr/lib`
+   - `/usr/libdata`
+   - `/usr/libexec`
+3. mounts a declared input directory read-only at `/inputs`
+4. mounts a declared output directory read-write at `/output`
+5. starts a persistent jail with:
+   - `ip4=disable`
+   - `ip6=disable`
+6. verifies that a host sentinel file outside the jail root is not visible inside the jail
+7. compiles a small C program inside the jail
+8. runs the produced binary inside the jail
+9. runs the produced binary again on the host, directly from the host-side output directory that backs the `/output` mount
+
+## Observed results
+
+Observed output:
+
+```text
+hello-from-freebsd-jail-build
+```
+
+Observed jail parameters included:
+
+- `enforce_statfs=2`
+- `ip4=disable`
+- `ip6=disable`
+- `persist`
+- several `allow.no*` restrictions by default
+
+Observed `nullfs` mount layout included:
+
+- read-only base/toolchain mounts under the jail root
+- read-only declared input mount
+- writable declared output mount
+
+Metadata was captured in:
+
+- `/tmp/jail-build-metadata.txt`
+
+## Mapping current Guix isolation features to FreeBSD
+
+| Current Guix/Linux-oriented concept | FreeBSD design choice |
+|---|---|
+| mount namespace per build | per-build jail root + explicit `nullfs` mount plan |
+| bind-mount declared inputs | `mount_nullfs` declared inputs into jail root |
+| `pivot_root` style root switch | not used; jail `path=` and explicit root layout instead |
+| network namespace isolation | `ip4=disable` / `ip6=disable` by default; VNET only if later required |
+| build scratch directory | writable jail-local `/tmp` and/or explicit writable work mounts |
+| concurrent isolated builds | one jail per build |
+| process isolation boundary | jail boundary plus later build-user credential drop |
+
+## Security implications compared to Linux Guix
+
+### Positive points
+
+- the jail boundary gives a strong coarse-grained isolation primitive
+- the filesystem view is explicit and easy to audit through mount tables
+- network disablement is straightforward for default hermetic builds
+- a per-build jail model composes naturally with separate build users
+
+### Important differences
+
+- Linux namespace-based code paths cannot be ported mechanically
+- the FreeBSD design is more configuration-oriented and less syscall-granular
+- fine-grained Linux capability and seccomp concepts do not directly carry over
+- jail setup is likely to remain a privileged daemon-side responsibility
+
+## Conclusion
+
+Phase 2.1 is satisfied on the current prototype track:
+
+- a concrete FreeBSD jail-first design exists for Guix build isolation
+- the design explicitly chooses:
+  - thin jails
+  - one jail per build
+  - `nullfs`-based declared-input exposure
+  - networking disabled by default
+- a runnable prototype successfully executed a basic build command inside a jail with restricted filesystem visibility
+
+This establishes the main Phase 2 architectural direction: FreeBSD support should be implemented as a jail-based daemon design, not as an attempted Linux-namespace emulation layer.
--- a/tests/daemon/run-freebsd-jail-build-prototype.sh
+++ b/tests/daemon/run-freebsd-jail-build-prototype.sh
@@ -0,0 +1,175 @@
+#!/bin/sh
+set -eu
+
+cc_bin=${CC_BIN:-/usr/bin/cc}
+cleanup_workdir=0
+jail_id=
+jail_name=
+mount_points=
+
+if [ -n "${WORKDIR:-}" ]; then
+  workdir=$WORKDIR
+  mkdir -p "$workdir"
+else
+  workdir=$(mktemp -d /tmp/fruix-jail-build-prototype.XXXXXX)
+  cleanup_workdir=1
+fi
+
+if [ "${KEEP_WORKDIR:-0}" -eq 1 ]; then
+  cleanup_workdir=0
+fi
+
+record_mount() {
+  mount_points="$1
+$mount_points"
+}
+
+cleanup() {
+  set +e
+  if [ -n "$jail_id" ]; then
+    sudo jail -r "$jail_id" >/dev/null 2>&1 || true
+  fi
+
+  old_ifs=$IFS
+  IFS='
+'
+  for mount_point in $mount_points; do
+    [ -n "$mount_point" ] || continue
+    sudo umount "$mount_point" >/dev/null 2>&1 || true
+  done
+  IFS=$old_ifs
+
+  if [ "$cleanup_workdir" -eq 1 ]; then
+    rm -rf "$workdir"
+  fi
+}
+trap cleanup EXIT; trap 'exit 1' INT TERM  # signals exit via the EXIT trap instead of resuming the script
+
+root=$workdir/jail-root
+inputs=$workdir/inputs
+output=$workdir/output
+metadata_file=$workdir/jail-build-metadata.txt
+mounts_file=$workdir/jail-mounts.txt
+runtime_out=$workdir/runtime.out
+runtime_err=$workdir/runtime.err
+host_runtime_out=$workdir/host-runtime.out
+compile_log=$workdir/compile.log
+compile_err=$workdir/compile.err
+jls_out=$workdir/jls.txt
+outside_sentinel=$workdir/outside-sentinel
+
+mkdir -p "$root/usr" "$root/tmp" "$inputs" "$output"
+chmod 1777 "$root/tmp"
+printf 'outside-visible-on-host-only\n' > "$outside_sentinel"
+
+cat > "$inputs/hello.c" <<'EOF'
+#include <stdio.h>
+int main(void)
+{
+  puts("hello-from-freebsd-jail-build");
+  return 0;
+}
+EOF
+
+readonly_mounts="/bin /lib /libexec /usr/bin /usr/include /usr/lib /usr/libdata /usr/libexec"
+for host_path in $readonly_mounts; do
+  target=$root$host_path
+  sudo mkdir -p "$target"
+  sudo mount_nullfs -o ro "$host_path" "$target"
+  record_mount "$target"
+done
+
+for target in /inputs /output; do
+  sudo mkdir -p "$root$target"
+done
+
+sudo mount_nullfs -o ro "$inputs" "$root/inputs"
+record_mount "$root/inputs"
+sudo mount_nullfs "$output" "$root/output"
+record_mount "$root/output"
+
+jail_name=fruix-jail-build-$$
+jail_id=$(sudo jail -i -c \
+  name="$jail_name" \
+  path="$root" \
+  host.hostname="$jail_name" \
+  persist \
+  ip4=disable \
+  ip6=disable)
+
+sudo jls -n -j "$jail_id" > "$jls_out"
+mount | grep -F "$root" > "$mounts_file"
+
+set +e
+sudo jexec "$jail_id" /bin/sh -eu -c '
+  test ! -e "$1"  # the host path of the sentinel must not resolve inside the jail
+  "$2" -Wall -Wextra /inputs/hello.c -o /output/hello
+' sh "$outside_sentinel" "$cc_bin" > "$compile_log" 2> "$compile_err"
+compile_rc=$?
+set -e
+
+if [ "$compile_rc" -ne 0 ]; then
+  echo "jail build prototype compile phase failed" >&2
+  cat "$compile_log" >&2 || true
+  cat "$compile_err" >&2 || true
+  exit 1
+fi
+
+set +e
+sudo jexec "$jail_id" /bin/sh -eu -c '/output/hello' > "$runtime_out" 2> "$runtime_err"
+runtime_rc=$?
+set -e
+
+if [ "$runtime_rc" -ne 0 ]; then
+  echo "jail build prototype runtime phase failed" >&2
+  cat "$runtime_out" >&2 || true
+  cat "$runtime_err" >&2 || true
+  exit 1
+fi
+
+"$output/hello" > "$host_runtime_out"
+
+cat > "$metadata_file" <<EOF
+workdir=$workdir
+jail_name=$jail_name
+jail_id=$jail_id
+jail_root=$root
+inputs_dir=$inputs
+output_dir=$output
+outside_sentinel=$outside_sentinel
+compile_log=$compile_log
+runtime_out=$runtime_out
+runtime_err=$runtime_err
+host_runtime_out=$host_runtime_out
+compile_err=$compile_err
+jls_out=$jls_out
+mounts_file=$mounts_file
+read_only_mounts=$readonly_mounts
+jail_compile_rc=$compile_rc
+jail_runtime_rc=$runtime_rc
+jail_runtime_output=$(tr '\n' '|' < "$runtime_out")
+host_runtime_output=$(tr '\n' '|' < "$host_runtime_out")
+outside_visibility_check=hidden-inside-jail
+EOF
+
+if [ -n "${METADATA_OUT:-}" ]; then
+  mkdir -p "$(dirname "$METADATA_OUT")"
+  cp "$metadata_file" "$METADATA_OUT"
+fi
+
+printf 'PASS freebsd-jail-build-prototype\n'
+printf 'Working directory: %s\n' "$workdir"
+printf 'Metadata file: %s\n' "$metadata_file"
+if [ -n "${METADATA_OUT:-}" ]; then
+  printf 'Copied metadata to: %s\n' "$METADATA_OUT"
+fi
+printf '%s\n' '--- jail runtime output ---'
+cat "$runtime_out"
+printf '%s\n' '--- host runtime output ---'
+cat "$host_runtime_out"
+printf '%s\n' '--- jail parameters ---'
+cat "$jls_out"
+printf '%s\n' '--- nullfs mounts ---'
+cat "$mounts_file"
+printf '%s\n' '--- metadata ---'
+cat "$metadata_file"