fruix/docs/reports/phase2-freebsd-jail-build-isolation.md

# Phase 2.1: FreeBSD jail-based build isolation design and prototype
Date: 2026-04-01
## Summary
This step turns the Phase 1 syscall/interface mapping into a concrete FreeBSD build-isolation design for the Guix daemon and validates the core idea with a runnable prototype.
Added file:
- `tests/daemon/run-freebsd-jail-build-prototype.sh`
The prototype demonstrates a single build operation inside a FreeBSD jail with restricted filesystem visibility. The jail only sees:
- a read-only host toolchain slice mounted with `nullfs`
- a read-only declared input directory
- a writable declared output directory
- a writable `/tmp`
A host-side sentinel file intentionally left outside the jail root is confirmed to be invisible inside the build environment.
## Design decisions
### 1. Use thin jails, not thick jails, for per-build isolation
Chosen model: **thin jail per build**.
Reasoning:
- Guix already conceptualizes builds as seeing a minimal set of declared inputs rather than an entire copied system image.
- Thick jails would duplicate too much base-system state and blur the distinction between declared and ambient inputs.
- Thin jails fit better with the Guix model if combined with explicit `nullfs` mounts for:
  - declared store inputs
  - declared build tools
  - writable build/output directories
Implication:
- the jail root is a synthetic filesystem view assembled per build
- the host base system remains outside the jail and is selectively re-exposed read-only where needed
### 2. Replace Linux bind mounts and mount namespaces with `nullfs` mount plans
Current Guix daemon code relies heavily on Linux mount namespace behavior and bind mounts.
On FreeBSD, the closest practical replacement is:
- create a per-build jail root directory
- expose only required host paths using `mount_nullfs`
- mount declared inputs read-only
- mount writable build scratch and output paths explicitly
- avoid reliance on `pivot_root`, `unshare`, or `setns`, which are Linux-specific and not available on FreeBSD
This is a different implementation strategy, but it preserves the key Guix property that build environments should only see explicitly declared filesystem inputs.
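
The mount plan described above could be sketched roughly as follows. This is a minimal illustration and not the prototype script: the `plan_mount` and `run` helpers, the `JAIL_ROOT`, `INPUT_DIR`, and `OUTPUT_DIR` variables, and their default paths are assumptions made for the example. With `DRY_RUN=1` (the default here) the script only prints the commands it would run; performing the mounts requires root on a FreeBSD host.

```sh
#!/bin/sh
# Hypothetical sketch: emit a per-build nullfs mount plan.
# Helper names, variables, and default paths are illustrative assumptions.
set -eu

DRY_RUN="${DRY_RUN:-1}"                           # 1 = print commands only
JAIL_ROOT="${JAIL_ROOT:-/tmp/build-jail.example}" # assumed jail root

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# plan_mount MODE HOST_PATH JAIL_PATH
# MODE is "ro" or "rw"; the mount point is created under the jail root first.
plan_mount() {
    mode=$1
    src=$2
    dst=$JAIL_ROOT$3
    run mkdir -p "$dst"
    run mount_nullfs -o "$mode" "$src" "$dst"
}

# Read-only toolchain slice taken from the host base system.
for dir in /bin /lib /libexec /usr/bin /usr/include /usr/lib \
           /usr/libdata /usr/libexec; do
    plan_mount ro "$dir" "$dir"
done

# Declared inputs are read-only; the declared output is writable.
plan_mount ro "${INPUT_DIR:-/tmp/example-inputs}" /inputs
plan_mount rw "${OUTPUT_DIR:-/tmp/example-output}" /output
```

Printing the plan before executing it keeps the per-build filesystem view easy to review, since every path the build can see corresponds to one `mount_nullfs` line.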
### 3. Use one jail per build
Chosen model: **one jail per build job**.
Reasoning:
- it provides the clearest conceptual mapping from “one derivation build” to “one isolated execution environment”
- it avoids complex state bleed between builds
- it aligns well with the later need to associate a specific build user, writable scratch directory, and mount plan with one job
- it makes cleanup straightforward: remove the jail, unmount paths, collect temporary roots
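
The cleanup order mentioned in the last point could look something like the sketch below: remove the jail first, then undo the mounts in reverse order, then delete the temporary root. The `record_mount` and `cleanup` helpers, the variable names, and the example paths are assumptions for illustration; the script defaults to `DRY_RUN=1` and only prints the commands.

```sh
#!/bin/sh
# Hypothetical sketch: per-build teardown in a safe order.
# Helper names, variables, and paths are illustrative assumptions.
set -eu

DRY_RUN="${DRY_RUN:-1}"                           # 1 = print commands only
JAIL_NAME="${JAIL_NAME:-guix-build-example}"      # assumed jail name
JAIL_ROOT="${JAIL_ROOT:-/tmp/build-jail.example}" # assumed jail root
MOUNTED=""   # newline-separated mount points, most recent first

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Remember a mount point so cleanup can unmount in reverse order.
record_mount() {
    MOUNTED="$1
$MOUNTED"
}

cleanup() {
    # 1. Stop the jail so nothing inside keeps the mounts busy.
    run jail -r "$JAIL_NAME"
    # 2. Unmount newest-first; stop at the first failure.
    while IFS= read -r mnt; do
        if [ -n "$mnt" ]; then
            run umount "$mnt" || return 1
        fi
    done <<EOF
$MOUNTED
EOF
    # 3. Only reached when every unmount succeeded, so the recursive
    #    delete cannot descend into a still-mounted host directory.
    run rm -rf "$JAIL_ROOT"
}

# Example: two mounts were made, then the build finished.
record_mount "$JAIL_ROOT/bin"
record_mount "$JAIL_ROOT/inputs"
cleanup
```

Refusing to delete the root after a failed unmount matters most for writable mounts, where a recursive delete could otherwise reach the host-side output directory.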
### 4. Disable networking by default
Chosen model: **network disabled unless a build explicitly requires it**.
Prototype settings:
- `ip4=disable`
- `ip6=disable`
Reasoning:
- this matches Guix expectations for hermetic builds more closely than inheriting host networking
- if future fixed-output or fetch-like builds require networking, that should be an explicit opt-in policy decision
- if stronger network virtualization is later needed, VNET jails are the natural FreeBSD-side extension point
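
As a rough illustration of the default-off policy, a jail creation helper might take the network mode as an explicit argument. The `jail_create` helper, its `disable`/`inherit` argument, and the jail name and root path below are assumptions for this sketch; only the `ip4`/`ip6` settings and `persist` come from the prototype. The script defaults to `DRY_RUN=1` and prints the `jail` command instead of running it.

```sh
#!/bin/sh
# Hypothetical sketch: create a persistent jail with networking off by default.
# Helper names, variables, and paths are illustrative assumptions.
set -eu

DRY_RUN="${DRY_RUN:-1}"                           # 1 = print commands only
JAIL_NAME="${JAIL_NAME:-guix-build-example}"      # assumed jail name
JAIL_ROOT="${JAIL_ROOT:-/tmp/build-jail.example}" # assumed jail root

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# jail_create [disable|inherit]
# "disable" (the default) gives the jail no IPv4/IPv6 access;
# "inherit" is an explicit opt-in that shares the host's addresses.
jail_create() {
    net=${1:-disable}
    case $net in
        disable|inherit) ;;
        *) echo "unknown network policy: $net" >&2; return 1 ;;
    esac
    run jail -c name="$JAIL_NAME" path="$JAIL_ROOT" \
        ip4="$net" ip6="$net" persist
}

jail_create    # default policy: networking disabled
```

Making the network mode a required decision at the call site means a build that needs network access has to ask for it, rather than getting it by omission.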
### 5. Keep build users separate from jail identity
A jail is the isolation envelope; the build user remains the privilege identity inside that envelope.
This means the eventual FreeBSD daemon design should combine:
- **per-build jail** for filesystem/process/network scoping
- **per-build user credentials** for ownership and write restrictions
- **read-only store mounts** plus explicit writable scratch/output mounts
This avoids conflating “container boundary” with “user identity”.
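
One way to picture the split between jail and user is a helper that enters the jail and switches to the build user in a single step. The sketch below assumes a host account named `guixbuilder01` and example file names under `/inputs` and `/output`; these, and the `run_in_jail_as` helper, are hypothetical. It uses `jexec -u`, which resolves the user name on the host; `jexec -U` would instead look the name up inside the jail and so would need a password database in the jail root. The script defaults to `DRY_RUN=1` and only prints the command.

```sh
#!/bin/sh
# Hypothetical sketch: run a build command inside the jail as a build user.
# The user name, jail name, and file names are illustrative assumptions.
set -eu

DRY_RUN="${DRY_RUN:-1}"                       # 1 = print commands only
JAIL_NAME="${JAIL_NAME:-guix-build-example}"  # assumed jail name
BUILD_USER="${BUILD_USER:-guixbuilder01}"     # assumed host build account

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# run_in_jail_as USER COMMAND [ARG...]
# The jail scopes what the process can see; USER limits what it may write.
run_in_jail_as() {
    user=$1
    shift
    run jexec -u "$user" "$JAIL_NAME" "$@"
}

run_in_jail_as "$BUILD_USER" cc -o /output/hello /inputs/hello.c
```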
## Prototype implementation
Run command:
```sh
METADATA_OUT=/tmp/jail-build-metadata.txt \
./tests/daemon/run-freebsd-jail-build-prototype.sh
```
What the prototype does:
1. creates a temporary jail root
2. creates a minimal read-only host toolchain view with `nullfs` mounts for:
   - `/bin`
   - `/lib`
   - `/libexec`
   - `/usr/bin`
   - `/usr/include`
   - `/usr/lib`
   - `/usr/libdata`
   - `/usr/libexec`
3. mounts a declared input directory read-only at `/inputs`
4. mounts a declared output directory read-write at `/output`
5. starts a persistent jail with:
   - `ip4=disable`
   - `ip6=disable`
6. verifies that a host sentinel file outside the jail root is not visible inside the jail
7. compiles a small C program inside the jail
8. runs the produced binary inside the jail
9. runs the produced binary again on the host from the mounted output directory
## Observed results
Observed output:
```text
hello-from-freebsd-jail-build
```
Observed jail parameters included:
- `enforce_statfs=2`
- `ip4=disable`
- `ip6=disable`
- `persist`
- several `allow.no*` restrictions by default
Observed `nullfs` mount layout included:
- read-only base/toolchain mounts under the jail root
- read-only declared input mount
- writable declared output mount
Metadata was captured in:
- `/tmp/jail-build-metadata.txt`
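
Metadata like this can be gathered with `jls -n`, which prints jail parameters as `name=value` tokens, and `mount -t nullfs`, which lists the active `nullfs` mounts. The sketch below is an assumption about how such a capture and a follow-up check might be written, not the prototype's actual code: the `capture_metadata`, `has_param`, and `check_metadata` helpers and the jail name are hypothetical. Nothing runs at load time, because `capture_metadata` needs a live jail on a FreeBSD host.

```sh
#!/bin/sh
# Hypothetical sketch: capture jail metadata and check expected parameters.
# Helper names and the jail name are illustrative assumptions.
set -eu

JAIL_NAME="${JAIL_NAME:-guix-build-example}"   # assumed jail name
METADATA_OUT="${METADATA_OUT:-/tmp/jail-build-metadata.txt}"

# Write the jail's parameters and the nullfs mount table to METADATA_OUT.
capture_metadata() {
    {
        jls -j "$JAIL_NAME" -n
        mount -t nullfs
    } > "$METADATA_OUT"
}

# has_param FILE PARAM
# Succeed if PARAM (for example "ip4=disable" or "persist") appears in FILE
# as a whole whitespace-separated token.
has_param() {
    tr -s ' \t' '\n\n' < "$1" | grep -qxF -- "$2"
}

# check_metadata FILE
# Report every expected parameter that is missing; fail if any is.
check_metadata() {
    status=0
    for param in enforce_statfs=2 ip4=disable ip6=disable persist; do
        if ! has_param "$1" "$param"; then
            echo "missing jail parameter: $param" >&2
            status=1
        fi
    done
    return "$status"
}
```

On a FreeBSD host with the jail running, `capture_metadata && check_metadata "$METADATA_OUT"` would turn the observations listed above into a repeatable check. Matching whole tokens avoids treating `nopersist` as a match for `persist`.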
## Mapping current Guix isolation features to FreeBSD
| Current Guix/Linux-oriented concept | FreeBSD design choice |
|---|---|
| mount namespace per build | per-build jail root + explicit `nullfs` mount plan |
| bind-mount declared inputs | `mount_nullfs` declared inputs into jail root |
| `pivot_root`-style root switch | not used; jail `path=` and explicit root layout instead |
| network namespace isolation | `ip4=disable` / `ip6=disable` by default; VNET only if later required |
| build scratch directory | writable jail-local `/tmp` and/or explicit writable work mounts |
| concurrent isolated builds | one jail per build |
| process isolation boundary | jail boundary plus later build-user credential drop |
## Security implications compared to Linux Guix
### Positive points
- the jail boundary gives a strong coarse-grained isolation primitive
- the filesystem view is explicit and easy to audit through mount tables
- network disablement is straightforward for default hermetic builds
- a per-build jail model composes naturally with separate build users
### Important differences
- Linux namespace-based code paths cannot be ported mechanically
- the FreeBSD design is more configuration-oriented and less syscall-granular
- fine-grained Linux capability and seccomp concepts do not directly carry over
- jail setup is likely to remain a privileged daemon-side responsibility
## Conclusion
The Phase 2.1 goals are met on the current prototype track:
- a concrete FreeBSD jail-first design exists for Guix build isolation
- the design explicitly chooses:
  - thin jails
  - one jail per build
  - `nullfs`-based declared-input exposure
  - networking disabled by default
- a runnable prototype successfully executed a basic build command inside a jail with restricted filesystem visibility
This establishes the main Phase 2 architectural direction: FreeBSD support should be implemented as a jail-based daemon design, not as an attempted Linux-namespace emulation layer.