docs: Study Khatru
140
docs/KHATRU.md
Normal file
@@ -0,0 +1,140 @@
# Khatru-Inspired Runtime Improvements

This document collects refactoring and extension ideas learned from studying Khatru-style relay design.

It is intentionally **not** about the new public API surface or the sync ACL model. Those live in `docs/slop/LOCAL_API.md` and `docs/SYNC.md`.

The focus here is runtime shape, protocol behavior, and operator-visible relay features.

---

## 1. Why This Matters

Khatru appears mature mainly because it exposes clearer relay pipeline stages.

That gives three practical benefits:

- less policy drift between storage, websocket, and management code,
- easier feature addition without hard-coding more branches into one connection module,
- better composability for relay profiles with different trust and traffic models.

Parrhesia should borrow that clarity without copying Khatru's code-first hook model wholesale.

---

## 2. Proposed Runtime Refactors

### 2.1 Staged policy pipeline

Parrhesia should stop treating policy as one coarse `EventPolicy` module plus scattered special cases.

Recommended internal stages:

1. connection admission
2. authentication challenge and validation
3. publish/write authorization
4. query/count authorization
5. stream subscription authorization
6. negentropy authorization
7. response shaping
8. broadcast/fanout suppression

This is an internal runtime refactor. It does not imply a new public API.
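
One way to picture the staged pipeline is an ordered list of stage functions, each of which accepts or rejects a request context. The sketch below is illustrative only: `PipelineSketch`, the two stage names, and the context keys are assumptions made for this example, not existing Parrhesia modules.

```elixir
defmodule PipelineSketch do
  @moduledoc "Hypothetical runner for ordered policy stages (not a Parrhesia module)."

  @type context :: map()
  @type stage :: {atom(), (context() -> {:ok, context()} | {:error, term()})}

  @doc "Runs the stages in order and halts at the first rejection."
  @spec run([stage()], context()) :: {:ok, context()} | {:error, {atom(), term()}}
  def run(stages, context) do
    Enum.reduce_while(stages, {:ok, context}, fn {name, fun}, {:ok, ctx} ->
      case fun.(ctx) do
        {:ok, next_ctx} -> {:cont, {:ok, next_ctx}}
        # Tag the error with the stage name so callers know where it stopped.
        {:error, reason} -> {:halt, {:error, {name, reason}}}
      end
    end)
  end
end

# Two of the eight stages, written as plain functions (assumed shapes).
stages = [
  connection_admission: fn ctx ->
    if ctx.remote_ip in ctx.blocked_ips do
      {:error, :blocked_ip}
    else
      {:ok, ctx}
    end
  end,
  write_authorization: fn ctx ->
    if MapSet.size(ctx.authenticated_pubkeys) > 0 do
      {:ok, Map.put(ctx, :write_allowed, true)}
    else
      {:error, :auth_required}
    end
  end
]
```

Because the rejecting stage is named in the error tuple, telemetry and tests can assert on exactly where a request was stopped.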

### 2.2 Richer internal request context

The runtime should carry a structured request context through all stages.

Useful fields:

- authenticated pubkeys
- caller kind
- remote IP
- subscription id
- peer id
- negentropy session flag
- internal-call flag

This reduces ad-hoc branching and makes audit/telemetry more coherent.
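
As a rough illustration of why one structured context helps telemetry, the sketch below derives a flat metadata map from a single struct. `ContextSketch` and its field names (including `negentropy_session` and `internal_call`) are assumptions for this example; they are not the real `Parrhesia.API.RequestContext` definition.

```elixir
defmodule ContextSketch do
  @moduledoc "Hypothetical request context carrying the fields listed above."

  defstruct authenticated_pubkeys: MapSet.new(),
            caller: :websocket,
            remote_ip: nil,
            subscription_id: nil,
            peer_id: nil,
            negentropy_session: false,
            internal_call: false

  @doc "Flattens the context into a map suitable for telemetry or audit logs."
  def telemetry_metadata(%__MODULE__{} = ctx) do
    %{
      caller: ctx.caller,
      remote_ip: format_ip(ctx.remote_ip),
      subscription_id: ctx.subscription_id,
      peer_id: ctx.peer_id,
      negentropy_session: ctx.negentropy_session,
      internal_call: ctx.internal_call,
      authenticated: MapSet.size(ctx.authenticated_pubkeys) > 0
    }
  end

  # :inet.ntoa/1 returns a charlist, so convert it to a binary for logging.
  defp format_ip(nil), do: nil
  defp format_ip(ip) when is_tuple(ip), do: ip |> :inet.ntoa() |> to_string()
end
```

Every stage can then emit the same metadata shape instead of assembling its own.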

### 2.3 Separate policy from storage presence tables

Moderation state should remain data.

Runtime enforcement should be a first-class layer that consumes that data, not a side effect of whether a table exists.

This is especially important for:

- blocked IP enforcement,
- pubkey allowlists,
- future kind- or tag-scoped restrictions.
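
A minimal sketch of that separation, assuming moderation state is loaded elsewhere and handed to the policy as plain data. The module name, the `allowlist_enabled` flag, and the `restricted_kinds` key are hypothetical; the point is only that enforcement is an explicit function of the data, so an enabled but empty allowlist denies writes instead of silently allowing them.

```elixir
defmodule WritePolicySketch do
  @moduledoc "Hypothetical write policy that consumes moderation state as data."

  @type moderation :: %{
          allowlist_enabled: boolean(),
          allowed_pubkeys: MapSet.t(String.t()),
          restricted_kinds: MapSet.t(non_neg_integer())
        }

  @spec authorize(moderation(), String.t(), non_neg_integer()) :: :ok | {:error, atom()}
  def authorize(moderation, pubkey, kind) do
    cond do
      # Enforcement depends on an explicit flag, not on whether rows happen to exist.
      moderation.allowlist_enabled and
          not MapSet.member?(moderation.allowed_pubkeys, pubkey) ->
        {:error, :pubkey_not_allowed}

      MapSet.member?(moderation.restricted_kinds, kind) ->
        {:error, :kind_restricted}

      true ->
        :ok
    end
  end
end
```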

---

## 3. Protocol and Relay Features

### 3.1 Real COUNT sketches

Parrhesia currently returns a synthetic `hll` payload for NIP-45-style count responses.

If approximate count exchange matters, implement a real reusable HLL sketch path instead of hashing `filters + count`.
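
For orientation, here is a small generic HyperLogLog in Elixir. It is a sketch under stated assumptions: it uses 256 registers and SHA-256 for hashing, and it does not attempt to reproduce the NIP-45 wire encoding or its register-selection rules, which would need to be taken from the specification. `HLLSketch` is a hypothetical module name.

```elixir
defmodule HLLSketch do
  @moduledoc "Hypothetical generic HyperLogLog; not the NIP-45 encoding."

  @registers 256
  # Standard bias-correction constant for m >= 128 registers.
  @alpha 0.7213 / (1 + 1.079 / @registers)

  @doc "An empty sketch. Only non-zero registers are stored."
  def new, do: %{}

  @doc "Records one item. Adding the same item twice leaves the sketch unchanged."
  def add(sketch, item) when is_binary(item) do
    # The first hash byte picks the register; the next 7 bytes supply the rank bits.
    <<index::8, tail::binary-size(7), _rest::binary>> = :crypto.hash(:sha256, item)
    rank = leading_zeros(tail) + 1
    Map.update(sketch, index, rank, &max(&1, rank))
  end

  @doc "Merges two sketches by taking the per-register maximum."
  def merge(a, b), do: Map.merge(a, b, fn _index, x, y -> max(x, y) end)

  @doc "Estimates the number of distinct items added."
  def estimate(sketch) do
    zeros = @registers - map_size(sketch)

    sum =
      Enum.reduce(sketch, zeros * 1.0, fn {_index, rank}, acc ->
        acc + :math.pow(2, -rank)
      end)

    raw = @alpha * @registers * @registers / sum

    # Small-range correction: use linear counting while some registers are empty.
    if raw <= 2.5 * @registers and zeros > 0 do
      @registers * :math.log(@registers / zeros)
    else
      raw
    end
  end

  defp leading_zeros(<<0::1, rest::bitstring>>), do: 1 + leading_zeros(rest)
  defp leading_zeros(_bits), do: 0
end
```

A sketch like this is reusable across relays because `merge/2` is idempotent and order-independent, which a hash of `filters + count` is not.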

### 3.2 Relay identity in NIP-11

Once Parrhesia owns a stable server identity, NIP-11 should expose the relay pubkey instead of returning `nil`.

This is useful beyond sync:

- operator visibility,
- relay fingerprinting,
- future trust tooling.

### 3.3 Connection-level IP enforcement

Blocked IP support should be enforced on actual connection admission, not only stored in management tables.

This should happen early, before expensive protocol handling.
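
A hedged sketch of what an early admission check could look like, assuming blocked addresses are stored as strings and loaded into an in-memory set before connections are accepted. `AdmissionSketch` and both function names are hypothetical; only `:inet.parse_address/1` and `MapSet` are real library calls.

```elixir
defmodule AdmissionSketch do
  @moduledoc "Hypothetical connection admission check against blocked IPs."

  @doc "Builds the in-memory set from stored string rows, skipping unparsable ones."
  @spec load_blocked([String.t()]) :: MapSet.t(:inet.ip_address())
  def load_blocked(rows) do
    rows
    |> Enum.flat_map(fn row ->
      case :inet.parse_address(String.to_charlist(row)) do
        {:ok, ip} -> [ip]
        {:error, _reason} -> []
      end
    end)
    |> MapSet.new()
  end

  @doc "Decides admission from the peer address alone, before any protocol work."
  @spec admit(:inet.ip_address(), MapSet.t(:inet.ip_address())) :: :ok | {:error, :blocked_ip}
  def admit(remote_ip, blocked_ips) do
    if MapSet.member?(blocked_ips, remote_ip), do: {:error, :blocked_ip}, else: :ok
  end
end
```

Comparing parsed address tuples rather than strings avoids mismatches between equivalent textual forms of the same address.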

### 3.4 Better response shaping

Introduce a narrow internal response shaping layer for cases where returned events or counts need controlled rewriting or suppression.

Examples:

- hide fields for specific relay profiles,
- suppress rebroadcast of locally-ingested remote sync traffic,
- shape relay notices consistently.

This should stay narrow and deterministic. It should not become arbitrary app semantics.
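
To show how narrow such a layer can be, the sketch below shapes one outgoing event from two pieces of data: a relay profile and the origin of the event. The module name, the profile keys `hidden_fields` and `suppress_origins`, and the example field `"internal_note"` are all assumptions for illustration.

```elixir
defmodule ResponseShapeSketch do
  @moduledoc "Hypothetical, deterministic response shaping step."

  @type profile :: %{hidden_fields: [String.t()], suppress_origins: [atom()]}

  @doc "Returns the event to send, or :suppress when it must not be rebroadcast."
  @spec shape(map(), profile(), atom()) :: {:send, map()} | :suppress
  def shape(event, profile, origin) do
    if origin in profile.suppress_origins do
      # For example, traffic ingested from a remote sync peer is not fanned out again.
      :suppress
    else
      {:send, Map.drop(event, profile.hidden_fields)}
    end
  end
end

# Example profile (assumed values): hide one field, never rebroadcast sync traffic.
profile = %{hidden_fields: ["internal_note"], suppress_origins: [:sync]}
event = %{"id" => "e1", "content" => "hello", "internal_note" => "reviewed"}
```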

---

## 4. Suggested Extension Points

These should be internal runtime seams, not necessarily public interfaces:

- `ConnectionPolicy`
- `AuthPolicy`
- `ReadPolicy`
- `WritePolicy`
- `NegentropyPolicy`
- `ResponsePolicy`
- `BroadcastPolicy`

They may initially be plain modules with well-defined callbacks or functions.

The point is not pluggability for its own sake. The point is to make policy stages explicit and testable.
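
One possible shape for such a seam is an Elixir behaviour with a single callback, plus a small implementation that can be tested in isolation. The callback name, its arguments, and both module names below are assumptions, not an agreed interface.

```elixir
defmodule ReadPolicySketch do
  @moduledoc "Hypothetical behaviour for the read/query policy stage."

  @callback authorize_read(context :: map(), filters :: [map()]) ::
              :ok | {:error, term()}
end

defmodule AuthRequiredReadPolicy do
  @moduledoc "Example implementation: reads require at least one authenticated pubkey."
  @behaviour ReadPolicySketch

  @impl true
  def authorize_read(%{authenticated_pubkeys: pubkeys}, _filters) do
    if MapSet.size(pubkeys) > 0, do: :ok, else: {:error, :auth_required}
  end
end
```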

---

## 5. Near-Term Priority

Recommended order:

1. enforce blocked IPs and any future connection-gating on the real connection path
2. split the current websocket flow into explicit read/write/negentropy policy stages
3. enrich runtime request context and telemetry metadata
4. expose relay pubkey in NIP-11 once identity lands
5. replace fake HLL payloads with a real approximate-count implementation if NIP-45 support matters operationally

This keeps the runtime improvements incremental and independent from the ongoing API and ACL implementation.
20
docs/SYNC.md
@@ -84,6 +84,12 @@ Private key export should not be supported.

 Sync traffic should use a real ACL layer, not moderation allowlists.

+Current implementation note:
+
+- Parrhesia already has storage-backed moderation state such as `allowed_pubkeys` and `blocked_ips`,
+- that is not the sync ACL model,
+- sync protection must be enforced in the active websocket/query/count/negentropy/write path, not inferred from management tables alone.
+
 Initial ACL model:

 - principal: authenticated pubkey,
@@ -110,6 +116,12 @@ Multiple pins should be allowed to support certificate rotation.

 Each configured sync server represents one outbound worker managed by Parrhesia.

+Implementation note:
+
+- Khatru-style relay designs benefit from explicit runtime stages,
+- Parrhesia sync should therefore plug into clear internal phases for connection admission, auth, query/count, subscription, negentropy, publish, and fanout,
+- this should stay a runtime refactor, not become extra sync semantics.
+
 Minimum behavior:

 1. connect to the remote relay,
@@ -332,11 +344,17 @@ The sync worker may attach request-context metadata such as:
 ```elixir
 %Parrhesia.API.RequestContext{
   caller: :sync,
+  peer_id: "tribes-primary",
   metadata: %{sync_server_id: "tribes-primary"}
 }
 ```

-That metadata is for telemetry and audit only. It must not become app sync semantics.
+Recommended additional context when available:
+
+- `remote_ip`
+- `subscription_id`
+
+This context is for telemetry, policy, and audit only. It must not become app sync semantics.

 ---

@@ -64,6 +64,12 @@ Runtime internals

 Rule: transport framing stays at the edge. Business decisions happen in `Parrhesia.API.*`.

+Implementation note:
+
+- the runtime beneath `Parrhesia.API.*` should expose clearer internal policy stages than it does today,
+- at minimum: connection/auth, publish, query/count, stream subscription, negentropy, response shaping, and broadcast/fanout,
+- these are internal runtime seams, not additional public APIs.
+
 ---

 ## 4. Core Context
@@ -73,12 +79,22 @@ defmodule Parrhesia.API.RequestContext do
   defstruct authenticated_pubkeys: MapSet.new(),
             actor: nil,
             caller: :local,
+            remote_ip: nil,
+            subscription_id: nil,
+            peer_id: nil,
             metadata: %{}
 end
 ```

 `caller` is for telemetry and policy parity, for example `:websocket`, `:http`, `:local`, or `:sync`.

+Recommended usage:
+
+- `remote_ip` for connection-level policy and audit,
+- `subscription_id` for query/stream/negentropy context,
+- `peer_id` for trusted sync peer identity when applicable,
+- `metadata` for transport-specific details that should not become API fields.
+
 ---

 ## 5. Public Modules
@@ -245,6 +261,12 @@ Purpose:

 This is a real authorization layer, not a reuse of moderation allowlists.

+Current implementation note:
+
+- Parrhesia already has storage-backed moderation presence tables such as `allowed_pubkeys` and `blocked_ips`,
+- those are not sufficient for sync ACLs,
+- the new ACL layer must be enforced directly in the active read/write/query/negentropy path, not only through management tables.
+
 ```elixir
 @spec grant(map(), keyword()) :: :ok | {:error, term()}
 @spec revoke(map(), keyword()) :: :ok | {:error, term()}
@@ -343,6 +365,7 @@ Important constraints:
 - Parrhesia must expose worker health and basic counters,
 - remote relay TLS pinning is required,
 - sync peer auth is bound to a server-auth pubkey, not inferred from event author pubkeys.
+- sync enforcement should reuse the same runtime policy stages as ordinary websocket traffic rather than inventing a parallel trust path.

 Server identity model:
