# Phase 1.1 follow-up: Guile subprocess crash on FreeBSD

Date: 2026-04-01

## Summary

`guile3` on this `FreeBSD 15.0-STABLE` amd64 host crashes when Guile tries to create subprocesses through:

- `system*`
- `spawn`
- `open-pipe*`

The crash is reproducible and is **not** caused by FreeBSD's native `posix_spawn(3)` implementation by itself. The evidence points to an **upstream Guile/gnulib integration bug on FreeBSD**:

- gnulib decides to replace `posix_spawn`/`posix_spawnp` on this platform
- Guile still calls the native FreeBSD extension `posix_spawn_file_actions_addclosefrom_np`
- that function receives a gnulib replacement `posix_spawn_file_actions_t` object with an incompatible ABI
- the process crashes inside libc when `addclosefrom_np` interprets gnulib's struct header as a native pointer

## Repro artifacts added

- `tests/guile/posix-spawn-freebsd-diagnostics.c`
- `tests/guile/run-subprocess-diagnostics.sh`

Run with:

```sh
./tests/guile/run-subprocess-diagnostics.sh
```

Expected output on the current host includes:

```text
native-spawn-closefrom=ok
adddup2-invalid-fd-accepted=yes
addopen-invalid-fd-accepted=yes
posix_spawn-secure-exec-result=0
posix_spawnp-secure-exec-result=3
issue-profile-match=yes
system-star exit=139
spawn exit=139
open-pipe-star exit=139
```

## Minimal Guile reproducers

```sh
guile3 -c '(system* "/usr/bin/true")'
guile3 -c '(spawn "/usr/bin/true" (list "/usr/bin/true"))'
guile3 -c '(use-modules (ice-9 popen)) (open-pipe* OPEN_READ "/usr/bin/true")'
```

All three terminate with `SIGSEGV` (exit status 139, that is, 128 + signal 11) on this machine.
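
The `139` figure follows the common shell convention of reporting death by signal as 128 plus the signal number. A minimal sketch of decoding such a status is shown below; the helper name `describe_status` is hypothetical and is not part of the repro scripts.

```sh
# Hypothetical helper, not part of tests/guile/: decode a shell exit status.
# Assumes the usual shell convention that a status above 128 means
# "terminated by signal (status - 128)".
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # `kill -l N` prints the name of signal number N.
        echo "signal $sig ($(kill -l "$sig"))"
    else
        echo "exit $status"
    fi
}

describe_status 139
```

For a status of 139 this reports signal 11, which is `SIGSEGV` on FreeBSD.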

## Native FreeBSD `posix_spawn` is not the direct problem

A standalone C test using FreeBSD's native APIs works correctly:

- `posix_spawn_file_actions_init`
- `posix_spawn_file_actions_adddup2`
- `posix_spawn_file_actions_addclosefrom_np`
- `posix_spawn`

The diagnostic program in `tests/guile/posix-spawn-freebsd-diagnostics.c` confirms this with:

```text
native-spawn-closefrom=ok
```

So although the fault is raised inside libc, its cause lies above libc, in how Guile/gnulib prepares the file-actions object.

## Why gnulib replaces `posix_spawn` on this host

Upstream Guile 3.0.10 vendors gnulib logic in `m4/posix_spawn.m4`.

Two FreeBSD-relevant observations from the local diagnostics match gnulib's replacement logic:

1. `posix_spawnp` is considered insecure by gnulib's test because it accepts a script without a shebang and ends up running it successfully instead of rejecting it with `ENOEXEC`.
2. FreeBSD's `posix_spawn_file_actions_adddup2` and `posix_spawn_file_actions_addopen` accept obviously invalid file descriptors in the gnulib probe cases, so gnulib also wants wrapper/replacement behavior there.

Observed locally:

```text
adddup2-invalid-fd-accepted=yes
addopen-invalid-fd-accepted=yes
posix_spawnp-secure-exec-result=3
```

That strongly indicates `REPLACE_POSIX_SPAWN=1` in the Guile build on this system.
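
The three indicators can also be checked mechanically. The sketch below is a hypothetical helper (not part of `run-subprocess-diagnostics.sh`) that reads diagnostics output on standard input and succeeds only when all three lines appear with the values seen on this host. Treating exactly these values as the replacement trigger is an assumption drawn from this host's output, not from gnulib's configure logic itself.

```sh
# Hypothetical helper: succeed only if the diagnostics output on stdin
# contains the three indicator lines with the values observed on this host.
matches_replacement_profile() {
    output=$(cat)
    for line in \
        "adddup2-invalid-fd-accepted=yes" \
        "addopen-invalid-fd-accepted=yes" \
        "posix_spawnp-secure-exec-result=3"
    do
        # -x: match the whole line, -F: fixed string, -q: no output.
        printf '%s\n' "$output" | grep -qxF -e "$line" || return 1
    done
}

# Demonstration with the lines quoted in this section.
matches_replacement_profile <<'EOF' && echo "profile matches"
adddup2-invalid-fd-accepted=yes
addopen-invalid-fd-accepted=yes
posix_spawnp-secure-exec-result=3
EOF
```

If the diagnostics script writes these lines to standard output (an assumption), its output could be piped into the helper in the same way.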

## Root cause hypothesis

### 1. Guile uses `addclosefrom_np` when the symbol exists

In upstream Guile 3.0.10, `libguile/posix.c` contains:

- `#ifdef HAVE_POSIX_SPAWN_FILE_ACTIONS_ADDCLOSEFROM_NP`
- `#define HAVE_ADDCLOSEFROM 1`
- later in `do_spawn(...)`:

```c
#ifdef HAVE_ADDCLOSEFROM
  posix_spawn_file_actions_addclosefrom_np (&actions, 3);
#else
  close_inherited_fds (&actions, max_fd);
#endif
```

### 2. But gnulib can replace the `posix_spawn` ABI

In upstream gnulib's `lib/spawn.in.h`, when `REPLACE_POSIX_SPAWN=1`, `posix_spawn_file_actions_t` becomes a gnulib-defined struct instead of the native FreeBSD opaque-pointer type.

FreeBSD's native `/usr/include/spawn.h` defines:

```c
typedef struct __posix_spawn_file_actions *posix_spawn_file_actions_t;
```

So native FreeBSD expects `posix_spawn_file_actions_t` to be pointer-like, while gnulib replacement mode uses an in-memory struct.

### 3. The crash signature matches that ABI mismatch exactly

The lldb backtrace from the core file shows the crash in:

```text
libc.so.7`posix_spawn_file_actions_addclosefrom_np
```

with:

```text
*fa = 0x0000000600000008
```

That value matches the first two 32-bit fields of gnulib's replacement file-actions struct interpreted as a pointer:

- `_allocated = 8`
- `_used = 6`

Those values are consistent with Guile having scheduled six `dup2` actions in `do_spawn(...)`: six slots used out of eight allocated.

In other words, libc is reading gnulib's struct header as though it were a native pointer to `struct __posix_spawn_file_actions`, which explains the segmentation fault.
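
The arithmetic behind that reading can be checked directly. On little-endian amd64, two adjacent 32-bit fields read as one 64-bit word place the first field in the low half and the second in the high half. The following is a minimal sketch; the function name `fields_as_pointer` is hypothetical, and it assumes a shell with 64-bit arithmetic.

```sh
# Hypothetical helper: combine two adjacent 32-bit fields the way a
# little-endian 64-bit load would see them (first field = low 32 bits,
# second field = high 32 bits).
fields_as_pointer() {
    allocated=$1
    used=$2
    printf '0x%016x\n' "$(( (used << 32) | allocated ))"
}

fields_as_pointer 8 6
```

With `_allocated = 8` and `_used = 6` this prints `0x0000000600000008`, the value seen in the backtrace.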

## Assessment

This looks like an **upstream Guile bug** on FreeBSD-family systems where:

- gnulib decides `REPLACE_POSIX_SPAWN=1`, **and**
- the platform exposes native `posix_spawn_file_actions_addclosefrom_np`

It does **not** look like a Guix-specific bug, nor primarily a local packaging mistake.

## Recommended fix direction

The safest fix is in Guile's `libguile/posix.c`:

- only use `posix_spawn_file_actions_addclosefrom_np` when Guile is using the **native** `posix_spawn` / `posix_spawn_file_actions_t` ABI
- if gnulib replacement `posix_spawn` is active, fall back to `close_inherited_fds(&actions, max_fd)` instead

In practice that likely means guarding the `HAVE_ADDCLOSEFROM` path with an additional condition equivalent to:

```c
#if defined(HAVE_POSIX_SPAWN_FILE_ACTIONS_ADDCLOSEFROM_NP) && !defined(REPLACE_POSIX_SPAWN)
```

or another build-time condition that guarantees ABI compatibility.

## Impact on the Guix-on-FreeBSD port

This is an important blocker because Guix and Guile code frequently depend on subprocess creation helpers.

However, the investigation also confirms:

- lower-level process primitives still work (`primitive-fork`, `waitpid`)
- sockets, file I/O, and FFI still work
- the problem is narrow enough to patch or work around

So the Guix port remains viable, but robust subprocess handling on FreeBSD will likely require either:

1. a local Guile patch, or
2. an upstream fix to Guile/gnulib integration, or
3. temporary Guix-side avoidance of the crashing subprocess helpers while bootstrapping the port