Phase 1.1 follow-up: Guile subprocess crash on FreeBSD
Date: 2026-04-01
Summary
guile3 on this FreeBSD 15.0-STABLE amd64 host crashes when Guile tries to create subprocesses through:
- system*
- spawn
- open-pipe*
The crash is reproducible and is not caused by FreeBSD's native posix_spawn(3) implementation by itself. The evidence points to an upstream Guile/gnulib integration bug on FreeBSD:
- gnulib decides to replace posix_spawn / posix_spawnp on this platform
- Guile still calls the native FreeBSD extension posix_spawn_file_actions_addclosefrom_np
- that function receives a gnulib replacement posix_spawn_file_actions_t object with an incompatible ABI
- the process crashes inside libc when addclosefrom_np interprets gnulib's struct header as a native pointer
Repro artifacts added
- tests/guile/posix-spawn-freebsd-diagnostics.c
- tests/guile/run-subprocess-diagnostics.sh
Run with:
./tests/guile/run-subprocess-diagnostics.sh
Expected output on the current host includes:
native-spawn-closefrom=ok
adddup2-invalid-fd-accepted=yes
addopen-invalid-fd-accepted=yes
posix_spawn-secure-exec-result=0
posix_spawnp-secure-exec-result=3
issue-profile-match=yes
system-star exit=139
spawn exit=139
open-pipe-star exit=139
Minimal Guile reproducers
guile3 -c '(system* "/usr/bin/true")'
guile3 -c '(spawn "/usr/bin/true" (list "/usr/bin/true"))'
guile3 -c '(use-modules (ice-9 popen)) (open-pipe* OPEN_READ "/usr/bin/true")'
All three terminate with SIGSEGV (exit 139) on this machine.
Native FreeBSD posix_spawn is not the direct problem
A standalone C test using FreeBSD's native APIs works correctly:
- posix_spawn_file_actions_init
- posix_spawn_file_actions_adddup2
- posix_spawn_file_actions_addclosefrom_np
- posix_spawn
The diagnostic program in tests/guile/posix-spawn-freebsd-diagnostics.c confirms this with:
native-spawn-closefrom=ok
So the crash is above libc, in how Guile/gnulib prepares the file-actions object.
Why gnulib replaces posix_spawn on this host
Upstream Guile 3.0.10 vendors gnulib logic in m4/posix_spawn.m4.
Two FreeBSD-relevant observations from the local diagnostics match gnulib's replacement logic:
- posix_spawnp is considered insecure by gnulib's test because it accepts a script without a shebang and runs it successfully instead of rejecting it with ENOEXEC.
- FreeBSD's posix_spawn_file_actions_adddup2 and posix_spawn_file_actions_addopen accept obviously invalid file descriptors in the gnulib probe cases, so gnulib also wants wrapper/replacement behavior there.
Observed locally:
adddup2-invalid-fd-accepted=yes
addopen-invalid-fd-accepted=yes
posix_spawnp-secure-exec-result=3
That strongly indicates REPLACE_POSIX_SPAWN=1 in the Guile build on this system.
Root cause hypothesis
1. Guile uses addclosefrom_np when the symbol exists
In upstream Guile 3.0.10, libguile/posix.c contains:
- #ifdef HAVE_POSIX_SPAWN_FILE_ACTIONS_ADDCLOSEFROM_NP
- #define HAVE_ADDCLOSEFROM 1
- later in do_spawn(...):
#ifdef HAVE_ADDCLOSEFROM
posix_spawn_file_actions_addclosefrom_np (&actions, 3);
#else
close_inherited_fds (&actions, max_fd);
#endif
2. But gnulib can replace the posix_spawn ABI
In upstream gnulib's lib/spawn.in.h, when REPLACE_POSIX_SPAWN=1, posix_spawn_file_actions_t becomes a gnulib-defined struct instead of the native FreeBSD opaque-pointer type.
FreeBSD's native /usr/include/spawn.h defines:
typedef struct __posix_spawn_file_actions *posix_spawn_file_actions_t;
So native FreeBSD expects posix_spawn_file_actions_t to be pointer-like, while gnulib replacement mode uses an in-memory struct.
3. The crash signature matches that ABI mismatch exactly
The lldb backtrace from the core file shows the crash in:
libc.so.7`posix_spawn_file_actions_addclosefrom_np
with:
*fa = 0x0000000600000008
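On little-endian amd64, that value is what the first two 32-bit fields of gnulib's replacement file-actions struct look like when read as one 64-bit pointer. The low half (0x00000008) is the first field and the high half (0x00000006) is the second:
- _allocated = 8
- _used = 6
Those values are plausible after Guile schedules six dup2 actions in do_spawn(...).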
In other words, libc is reading gnulib's struct header as though it were a native pointer to struct __posix_spawn_file_actions, which explains the segmentation fault.
Assessment
This looks like an upstream Guile bug on FreeBSD-family systems where:
- gnulib decides REPLACE_POSIX_SPAWN=1, and
- the platform exposes native posix_spawn_file_actions_addclosefrom_np
It does not look like a Guix-specific bug, nor primarily a local packaging mistake.
Recommended fix direction
The safest fix is in Guile's libguile/posix.c:
- only use posix_spawn_file_actions_addclosefrom_np when Guile is using the native posix_spawn / posix_spawn_file_actions_t ABI
- if gnulib replacement posix_spawn is active, fall back to close_inherited_fds(&actions, max_fd) instead
In practice that likely means guarding the HAVE_ADDCLOSEFROM path with an additional condition equivalent to:
#if defined(HAVE_POSIX_SPAWN_FILE_ACTIONS_ADDCLOSEFROM_NP) && !defined(REPLACE_POSIX_SPAWN)
or another build-time condition that guarantees ABI compatibility.
Impact on the Guix-on-FreeBSD port
This is an important blocker because Guix and Guile code frequently depend on subprocess creation helpers.
However, the investigation also confirms:
- lower-level process primitives still work (primitive-fork, waitpid)
- sockets, file I/O, and FFI still work
- the problem is narrow enough to patch or work around
So the Guix port remains viable, but robust subprocess handling on FreeBSD will likely require either:
- a local Guile patch, or
- an upstream fix to Guile/gnulib integration, or
- temporary Guix-side avoidance of the crashing subprocess helpers while bootstrapping the port