docs: Study Khatru

2026-03-16 16:53:55 +01:00
parent 186d0f98ee
commit 14fb0f7ffb
3 changed files with 182 additions and 1 deletions

docs/KHATRU.md Normal file

@@ -0,0 +1,140 @@
# Khatru-Inspired Runtime Improvements
This document collects refactoring and extension ideas learned from studying Khatru-style relay design.
It is intentionally **not** about the new public API surface or the sync ACL model. Those live in `docs/slop/LOCAL_API.md` and `docs/SYNC.md`.
The focus here is runtime shape, protocol behavior, and operator-visible relay features.
---
## 1. Why This Matters
Khatru appears mature mainly because it exposes clearer relay pipeline stages.
That gives three practical benefits:
- less policy drift between storage, websocket, and management code,
- easier feature addition without hard-coding more branches into one connection module,
- better composability for relay profiles with different trust and traffic models.
Parrhesia should borrow that clarity without copying Khatru's code-first hook model wholesale.
---
## 2. Proposed Runtime Refactors
### 2.1 Staged policy pipeline
Parrhesia should stop treating policy as one coarse `EventPolicy` module plus scattered special cases.
Recommended internal stages:
1. connection admission
2. authentication challenge and validation
3. publish/write authorization
4. query/count authorization
5. stream subscription authorization
6. negentropy authorization
7. response shaping
8. broadcast/fanout suppression
This is an internal runtime refactor. It does not imply a new public API.
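A minimal sketch of how such stages could be chained, assuming each stage is a function that takes a context map and returns `{:ok, ctx}` or `{:error, reason}`. The module name `PipelineSketch`, the two example rules, and the context keys are hypothetical and are not taken from Parrhesia's code.

```elixir
defmodule PipelineSketch do
  @moduledoc "Hypothetical illustration only; not Parrhesia's actual runtime."

  # Run the stages in order and stop at the first rejection,
  # tagging the error with the name of the stage that produced it.
  def run(ctx, stages) do
    Enum.reduce_while(stages, {:ok, ctx}, fn {name, stage}, {:ok, acc} ->
      case stage.(acc) do
        {:ok, next} -> {:cont, {:ok, next}}
        {:error, reason} -> {:halt, {:error, {name, reason}}}
      end
    end)
  end

  # Hypothetical rule: a caller must have authenticated at least one pubkey.
  def authentication(%{authenticated_pubkeys: keys} = ctx) do
    if MapSet.size(keys) > 0, do: {:ok, ctx}, else: {:error, :auth_required}
  end

  # Hypothetical rule: the event author must be one of the authenticated pubkeys.
  def write_authorization(%{authenticated_pubkeys: keys, event: %{pubkey: author}} = ctx) do
    if MapSet.member?(keys, author), do: {:ok, ctx}, else: {:error, :author_mismatch}
  end
end

# Stages are an ordered keyword list, so the order is visible in one place.
stages = [
  authentication: &PipelineSketch.authentication/1,
  write_authorization: &PipelineSketch.write_authorization/1
]

ctx = %{authenticated_pubkeys: MapSet.new(["alice"]), event: %{pubkey: "alice"}}
IO.inspect(PipelineSketch.run(ctx, stages))
```

Because a rejection carries the stage name, tests and telemetry can tell which stage refused a request without inspecting connection code.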
### 2.2 Richer internal request context
The runtime should carry a structured request context through all stages.
Useful fields:
- authenticated pubkeys
- caller kind
- remote IP
- subscription id
- peer id
- negentropy session flag
- internal-call flag
This reduces ad-hoc branching and makes audit/telemetry more coherent.
### 2.3 Separate policy from storage presence tables
Moderation state should remain data.
Runtime enforcement should be a first-class layer that consumes that data, not a side effect of whether a table exists.
This is especially important for:
- blocked IP enforcement,
- pubkey allowlists,
- future kind- or tag-scoped restrictions.
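One way to picture the separation, as a sketch under assumptions: enforcement is switched on by an explicit option, and the moderation data is reached only through an injected lookup function. `AllowlistPolicySketch`, the `:enforce_allowlist` option, and the `:allowed?` lookup are hypothetical names, not existing Parrhesia APIs.

```elixir
defmodule AllowlistPolicySketch do
  @moduledoc "Hypothetical illustration only; names and options are assumptions."

  # Enforcement depends on an explicit setting, not on whether a table
  # happens to exist or contain rows. The lookup stands in for a storage query.
  def authorize(pubkey, opts) do
    enforce? = Keyword.get(opts, :enforce_allowlist, false)
    allowed? = Keyword.fetch!(opts, :allowed?)

    cond do
      not enforce? -> :ok
      allowed?.(pubkey) -> :ok
      true -> {:error, :pubkey_not_allowed}
    end
  end
end

# Usage: the moderation data stays plain data; the policy only reads it.
allowlist = MapSet.new(["alice"])
lookup = fn pubkey -> MapSet.member?(allowlist, pubkey) end
IO.inspect(AllowlistPolicySketch.authorize("bob", enforce_allowlist: true, allowed?: lookup))
```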
---
## 3. Protocol and Relay Features
### 3.1 Real COUNT sketches
Parrhesia currently returns a synthetic `hll` payload for NIP-45-style count responses.
If approximate count exchange matters, implement a real reusable HLL sketch path instead of hashing `filters + count`.
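For orientation, a generic HyperLogLog sketch with 256 one-byte registers might look like the following. This is a textbook-style illustration, not the NIP-45 wire format: the module name `HLLSketch`, the use of SHA-256 for hashing, and the register layout are assumptions, and a real implementation would need to follow whatever register derivation the NIP specifies.

```elixir
defmodule HLLSketch do
  @moduledoc "Generic HyperLogLog illustration; not the NIP-45 encoding."

  # Number of registers; the first 8 hash bits select one of them.
  @m 256

  # A sketch is a 256-byte binary, one byte per register.
  def new, do: :binary.copy(<<0>>, @m)

  # Hash the item, pick a register from the first byte, and store the
  # largest rank (leading zeros + 1) seen in the next 56 bits.
  def add(registers, item) when is_binary(item) do
    <<index::8, rest::56, _::binary>> = :crypto.hash(:sha256, item)
    rank = leading_zeros(rest, 56) + 1

    if rank > :binary.at(registers, index) do
      <<pre::binary-size(index), _old::8, post::binary>> = registers
      <<pre::binary, rank::8, post::binary>>
    else
      registers
    end
  end

  # Two sketches merge by taking the per-register maximum.
  def merge(a, b) do
    Enum.zip(:binary.bin_to_list(a), :binary.bin_to_list(b))
    |> Enum.map(fn {x, y} -> max(x, y) end)
    |> :binary.list_to_bin()
  end

  # Standard estimator with the small-range (linear counting) correction.
  def estimate(registers) do
    regs = :binary.bin_to_list(registers)
    sum = Enum.reduce(regs, 0.0, fn r, acc -> acc + :math.pow(2, -r) end)
    alpha = 0.7213 / (1 + 1.079 / @m)
    raw = alpha * @m * @m / sum
    zeros = Enum.count(regs, &(&1 == 0))

    if raw <= 2.5 * @m and zeros > 0 do
      @m * :math.log(@m / zeros)
    else
      raw
    end
  end

  defp leading_zeros(0, width), do: width
  defp leading_zeros(value, width), do: width - length(Integer.digits(value, 2))
end

sketch = Enum.reduce(1..1000, HLLSketch.new(), fn i, acc ->
  HLLSketch.add(acc, Integer.to_string(i))
end)

IO.inspect(round(HLLSketch.estimate(sketch)))
```

The property that matters for count exchange is that sketches merge losslessly, which a hash of `filters + count` cannot offer.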
### 3.2 Relay identity in NIP-11
Once Parrhesia owns a stable server identity, NIP-11 should expose the relay pubkey instead of returning `nil`.
This is useful beyond sync:
- operator visibility,
- relay fingerprinting,
- future trust tooling.
### 3.3 Connection-level IP enforcement
Blocked IP support should be enforced on actual connection admission, not only stored in management tables.
This should happen early, before expensive protocol handling.
### 3.4 Better response shaping
Introduce a narrow internal response shaping layer for cases where returned events or counts need controlled rewriting or suppression.
Examples:
- hide fields for specific relay profiles,
- suppress rebroadcast of locally-ingested remote sync traffic,
- shape relay notices consistently.
This should stay narrow and deterministic. It should not become arbitrary app semantics.
---
## 4. Suggested Extension Points
These should be internal runtime seams, not necessarily public interfaces:
- `ConnectionPolicy`
- `AuthPolicy`
- `ReadPolicy`
- `WritePolicy`
- `NegentropyPolicy`
- `ResponsePolicy`
- `BroadcastPolicy`
They may initially be plain modules with well-defined callbacks or functions.
The point is not pluggability for its own sake. The point is to make policy stages explicit and testable.
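As a sketch of what "plain modules with well-defined callbacks" could mean for one of these seams, the following defines a behaviour and one implementation. The callback name `admit/1`, the context keys, and both module names are assumptions made for illustration.

```elixir
defmodule ConnectionPolicySketch do
  @moduledoc "Hypothetical behaviour; callback name and shapes are assumptions."

  @callback admit(context :: map()) :: :ok | {:error, term()}
end

defmodule BlockedIpPolicySketch do
  @moduledoc "Hypothetical implementation of the admission seam."
  @behaviour ConnectionPolicySketch

  # Reject the connection when the remote IP is in the blocked set.
  @impl true
  def admit(%{remote_ip: ip, blocked_ips: blocked}) do
    if MapSet.member?(blocked, ip), do: {:error, :blocked_ip}, else: :ok
  end
end

# Usage: the stage can be exercised without a socket or a database.
blocked = MapSet.new([{203, 0, 113, 7}])
IO.inspect(BlockedIpPolicySketch.admit(%{remote_ip: {203, 0, 113, 7}, blocked_ips: blocked}))
```

A behaviour keeps the stage testable in isolation while leaving open whether it is ever exposed beyond the runtime.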
---
## 5. Near-Term Priority
Recommended order:
1. enforce blocked IPs and any future connection-gating on the real connection path
2. split the current websocket flow into explicit read/write/negentropy policy stages
3. enrich runtime request context and telemetry metadata
4. expose relay pubkey in NIP-11 once identity lands
5. replace fake HLL payloads with a real approximate-count implementation if NIP-45 support matters operationally
This keeps the runtime improvements incremental and independent from the ongoing API and ACL implementation.


@@ -84,6 +84,12 @@ Private key export should not be supported.
Sync traffic should use a real ACL layer, not moderation allowlists.
Current implementation note:
- Parrhesia already has storage-backed moderation state such as `allowed_pubkeys` and `blocked_ips`,
- that is not the sync ACL model,
- sync protection must be enforced in the active websocket/query/count/negentropy/write path, not inferred from management tables alone.
Initial ACL model:
- principal: authenticated pubkey,
@@ -110,6 +116,12 @@ Multiple pins should be allowed to support certificate rotation.
Each configured sync server represents one outbound worker managed by Parrhesia.
Implementation note:
- Khatru-style relay designs benefit from explicit runtime stages,
- Parrhesia sync should therefore plug into clear internal phases for connection admission, auth, query/count, subscription, negentropy, publish, and fanout,
- this should stay a runtime refactor, not become extra sync semantics.
Minimum behavior:
1. connect to the remote relay,
@@ -332,11 +344,17 @@ The sync worker may attach request-context metadata such as:
```elixir
%Parrhesia.API.RequestContext{
  caller: :sync,
  peer_id: "tribes-primary",
  metadata: %{sync_server_id: "tribes-primary"}
}
```
Recommended additional context when available:
- `remote_ip`
- `subscription_id`
This context is for telemetry, policy, and audit only. It must not become app sync semantics.
---


@@ -64,6 +64,12 @@ Runtime internals
Rule: transport framing stays at the edge. Business decisions happen in `Parrhesia.API.*`.
Implementation note:
- the runtime beneath `Parrhesia.API.*` should expose clearer internal policy stages than it does today,
- at minimum: connection/auth, publish, query/count, stream subscription, negentropy, response shaping, and broadcast/fanout,
- these are internal runtime seams, not additional public APIs.
---
## 4. Core Context
@@ -73,12 +79,22 @@ defmodule Parrhesia.API.RequestContext do
```elixir
  defstruct authenticated_pubkeys: MapSet.new(),
            actor: nil,
            caller: :local,
            remote_ip: nil,
            subscription_id: nil,
            peer_id: nil,
            metadata: %{}
end
```
`caller` is for telemetry and policy parity, for example `:websocket`, `:http`, `:local`, or `:sync`.
Recommended usage:
- `remote_ip` for connection-level policy and audit,
- `subscription_id` for query/stream/negentropy context,
- `peer_id` for trusted sync peer identity when applicable,
- `metadata` for transport-specific details that should not become API fields.
---
## 5. Public Modules
@@ -245,6 +261,12 @@ Purpose:
This is a real authorization layer, not a reuse of moderation allowlists.
Current implementation note:
- Parrhesia already has storage-backed moderation presence tables such as `allowed_pubkeys` and `blocked_ips`,
- those are not sufficient for sync ACLs,
- the new ACL layer must be enforced directly in the active read/write/query/negentropy path, not only through management tables.
```elixir
@spec grant(map(), keyword()) :: :ok | {:error, term()}
@spec revoke(map(), keyword()) :: :ok | {:error, term()}
```
@@ -343,6 +365,7 @@ Important constraints:
- Parrhesia must expose worker health and basic counters,
- remote relay TLS pinning is required,
- sync peer auth is bound to a server-auth pubkey, not inferred from event author pubkeys.
- sync enforcement should reuse the same runtime policy stages as ordinary websocket traffic rather than inventing a parallel trust path.
Server identity model: