Parrhesia Nostr Relay Architecture
1) Goals
Build a robust, high-performance Nostr relay in Elixir/OTP with PostgreSQL as first adapter, while keeping a strict boundary so storage can be swapped later.
Primary targets:
- Broad relay feature support (core + modern relay-facing NIPs)
- Strong correctness around NIP-01 semantics
- Clear OTP supervision and failure isolation
- High fanout throughput and bounded resource usage
- Storage abstraction via behavior-driven ports/adapters
- Full test suite (unit, integration, conformance, perf, fault-injection)
- Support for experimental MLS flow (NIP-EE), behind feature flags
2) NIP support scope
Mandatory baseline
- NIP-01 (includes behavior moved from NIP-12/NIP-16/NIP-20/NIP-33)
- NIP-11 (relay info document)
Relay-facing features to include
- NIP-09 (deletion requests)
- NIP-13 (PoW gating)
- NIP-17 + NIP-44 + NIP-59 (private DMs / gift wraps)
- NIP-40 (expiration)
- NIP-42 (AUTH)
- NIP-43 (relay membership requests/metadata)
- NIP-45 (COUNT, optional HLL)
- NIP-50 (search)
- NIP-62 (request to vanish)
- NIP-66 (relay discovery events; store/serve as normal events)
- NIP-70 (protected events)
- NIP-77 (negentropy sync)
- NIP-86 + NIP-98 (relay management API auth)
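Several of these features reduce to small, testable predicates. As one example, NIP-13 PoW gating is just counting the leading zero bits of the event id; a minimal sketch (the `Parrhesia.Policy.PoW` module name is an assumption):

```elixir
defmodule Parrhesia.Policy.PoW do
  @doc "Counts leading zero bits of a lowercase-hex event id (NIP-13 difficulty)."
  def difficulty(hex_id) when is_binary(hex_id) do
    hex_id
    |> Base.decode16!(case: :lower)
    |> leading_zero_bits(0)
  end

  # Walk the bitstring until the first 1 bit.
  defp leading_zero_bits(<<0::1, rest::bitstring>>, acc), do: leading_zero_bits(rest, acc + 1)
  defp leading_zero_bits(_, acc), do: acc

  @doc "Policy predicate: reject events below the relay's minimum difficulty."
  def sufficient?(hex_id, min_difficulty), do: difficulty(hex_id) >= min_difficulty
end
```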
Experimental MLS
- NIP-EE (unrecommended/upstream-superseded, but requested):
  - kind `443` KeyPackage events
  - kind `445` group events (policy-controlled retention/ephemeral treatment)
  - kind `10051` keypackage relay lists
  - interop with wrapped delivery (kind `1059`) and auth/privacy policies
3) System architecture (high level)
WS/HTTP Edge (Bandit/Plug)
-> Protocol Decoder/Encoder
-> Command Router (EVENT/REQ/CLOSE/AUTH/COUNT/NEG-*)
-> Policy Pipeline (validation, auth, ACL, PoW, NIP-70)
-> Event Service / Query Service
-> Storage Port (behavior)
-> Postgres Adapter (Ecto)
-> Subscription Index (ETS)
-> Fanout Dispatcher
-> Telemetry + Metrics + Tracing
4) OTP supervision design
Parrhesia.Application children (top-level):
- `Parrhesia.Telemetry` – metric definitions/reporters
- `Parrhesia.Config` – runtime config cache (ETS-backed)
- `Parrhesia.Storage.Supervisor` – adapter processes (Repo, pools)
- `Parrhesia.Subscriptions.Supervisor` – subscription index + fanout workers
- `Parrhesia.Auth.Supervisor` – AUTH challenge/session tracking
- `Parrhesia.Policy.Supervisor` – rate limiters / ACL caches
- `Parrhesia.Web.Endpoint` – WS + HTTP ingress
- `Parrhesia.Tasks.Supervisor` – background jobs (expiry purge, maintenance)
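A minimal sketch of the top-level application wiring, assuming the child modules above expose standard child specs (the supervisor options are placeholders):

```elixir
defmodule Parrhesia.Application do
  use Application

  # Top-level children, in start order. Shown as bare module names here;
  # in a real app these would be child-spec tuples with options.
  def children do
    [
      Parrhesia.Telemetry,
      Parrhesia.Config,
      Parrhesia.Storage.Supervisor,
      Parrhesia.Subscriptions.Supervisor,
      Parrhesia.Auth.Supervisor,
      Parrhesia.Policy.Supervisor,
      Parrhesia.Web.Endpoint,
      Parrhesia.Tasks.Supervisor
    ]
  end

  @impl true
  def start(_type, _args) do
    # :one_for_one preserves failure isolation: a crashing subtree restarts alone.
    Supervisor.start_link(children(), strategy: :one_for_one, name: Parrhesia.Supervisor)
  end
end
```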
Failure model:
- Connection failures are isolated per socket process.
- Storage outages degrade with explicit `OK`/`CLOSED` error prefixes (`error:`) per NIP-01.
- Non-critical workers are `:transient`; core infra is `:permanent`.
5) Core runtime components
5.1 Connection process
Per websocket connection:
- Parse frames, enforce max frame/message limits
- Maintain authenticated pubkeys (NIP-42)
- Track active subscriptions (`sub_id` scoped to connection)
- Handle backpressure (bounded outbound queue + drop/close strategy)
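The backpressure bullet can be sketched as a bounded outbound queue; the `Parrhesia.Conn.Outbox` name and the `{:overflow, _}` contract are illustrative assumptions (on overflow the connection process picks drop or close):

```elixir
defmodule Parrhesia.Conn.Outbox do
  # Bounded FIFO of outbound frames, backed by Erlang's :queue.
  defstruct queue: :queue.new(), size: 0, max: 1_000

  def new(max), do: %__MODULE__{max: max}

  @doc "Enqueue a frame; returns {:ok, outbox} or {:overflow, outbox} when full."
  def push(%__MODULE__{size: s, max: m} = box, _frame) when s >= m, do: {:overflow, box}

  def push(%__MODULE__{} = box, frame) do
    {:ok, %{box | queue: :queue.in(frame, box.queue), size: box.size + 1}}
  end

  @doc "Dequeue the oldest frame, or :empty."
  def pop(%__MODULE__{} = box) do
    case :queue.out(box.queue) do
      {{:value, frame}, q} -> {frame, %{box | queue: q, size: box.size - 1}}
      {:empty, _} -> :empty
    end
  end
end
```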
5.2 Command router
Dispatches:
- `EVENT` -> ingest pipeline
- `REQ` -> initial DB query + live subscription
- `CLOSE` -> unsubscribe
- `AUTH` -> challenge validation, session update
- `COUNT` -> aggregate path
- `NEG-OPEN`/`NEG-MSG`/`NEG-CLOSE` -> negentropy session engine
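Since each inbound Nostr frame is a JSON array whose first element names the command, dispatch is a single pattern match; the module name and return shapes here are assumptions:

```elixir
defmodule Parrhesia.Router do
  # Maps a decoded frame (a list) to an internal dispatch tuple.
  def route(["EVENT", event]), do: {:ingest, event}
  def route(["REQ", sub_id | filters]), do: {:subscribe, sub_id, filters}
  def route(["CLOSE", sub_id]), do: {:unsubscribe, sub_id}
  def route(["AUTH", signed_event]), do: {:auth, signed_event}
  def route(["COUNT", sub_id | filters]), do: {:count, sub_id, filters}
  def route(["NEG-OPEN" | args]), do: {:neg_open, args}
  def route(["NEG-MSG" | args]), do: {:neg_msg, args}
  def route(["NEG-CLOSE" | args]), do: {:neg_close, args}
  # Anything else yields a NOTICE/closure decision upstream.
  def route(other), do: {:error, :unknown_command, other}
end
```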
5.3 Event ingest pipeline
Ordered stages:
- Decode + schema checks
- `id` recomputation and signature verification
- NIP semantic checks (timestamps, tag forms, size limits)
- Policy checks (banlists, kind allowlists, auth-required, NIP-70, PoW)
- Storage write (or no-store for ephemeral policy)
- Live fanout to matching subscriptions
- Return canonical `OK` response with machine prefix when needed
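The ordered stages map naturally onto a `with` chain; the stage functions below are stubs with assumed names, standing in for the real validators and policy checks:

```elixir
defmodule Parrhesia.Ingest do
  @doc "Runs the ordered ingest stages and builds the canonical NIP-01 OK response."
  def handle_event(event, conn) do
    with {:ok, event} <- validate_schema(event),
         :ok <- verify_id_and_sig(event),
         :ok <- semantic_checks(event),
         :ok <- policy_checks(event, conn),
         :ok <- store(event),
         :ok <- fanout(event) do
      {:ok, ["OK", event["id"], true, ""]}
    else
      # Failed stages carry a machine prefix (invalid:, blocked:, pow:, error:, ...).
      {:error, prefix, msg} -> {:ok, ["OK", event["id"], false, "#{prefix}: #{msg}"]}
    end
  end

  # Stage stubs; real implementations live in the modules described above.
  defp validate_schema(%{"id" => _, "sig" => _} = e), do: {:ok, e}
  defp validate_schema(_), do: {:error, "invalid", "malformed event"}
  defp verify_id_and_sig(_e), do: :ok
  defp semantic_checks(_e), do: :ok
  defp policy_checks(_e, _conn), do: :ok
  defp store(_e), do: :ok
  defp fanout(_e), do: :ok
end
```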
5.4 Subscription index + fanout
- ETS-backed inverted indices (`kind`, `author`, single-letter tags)
- Candidate narrowing before full filter evaluation
- OR semantics across filters, AND within filter
- `limit` only for initial query phase; ignored in live phase (NIP-01)
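The OR-across/AND-within semantics can be sketched as a pure evaluator (only `kinds`, `authors`, and `since` are shown; tag and `until` checks follow the same shape, and module names are assumptions):

```elixir
defmodule Parrhesia.Subscriptions.Match do
  @doc "OR across filters: the event matches if any single filter matches."
  def match_any?(event, filters), do: Enum.any?(filters, &match_one?(event, &1))

  # AND within one filter: every present constraint must hold.
  defp match_one?(event, filter) do
    in_list?(filter["kinds"], event["kind"]) and
      in_list?(filter["authors"], event["pubkey"]) and
      since_ok?(filter["since"], event["created_at"])
  end

  # An absent field means "no constraint".
  defp in_list?(nil, _value), do: true
  defp in_list?(list, value), do: value in list

  defp since_ok?(nil, _created_at), do: true
  defp since_ok?(since, created_at), do: created_at >= since
end
```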
5.5 Query service
- Compiles NIP filters into adapter-neutral query AST
- Pushes AST to storage adapter
- Deterministic ordering (`created_at` desc, `id` lexical tie-break)
- Emits `EOSE` exactly once per subscription initial catch-up
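The deterministic ordering is a single sort key, newest first with the lexically smaller `id` breaking ties; a sketch (module name assumed):

```elixir
defmodule Parrhesia.Query.Order do
  @doc "Sorts events created_at descending, id ascending (lexical) on ties."
  def sort(events) do
    # Negating created_at lets one ascending sort produce desc/asc ordering.
    Enum.sort_by(events, fn e -> {-e["created_at"], e["id"]} end)
  end
end
```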
6) Storage boundary (swap-friendly by design)
6.1 Port/adapter contract
Define behaviors under Parrhesia.Storage:
- `Parrhesia.Storage.Events` – `put_event/2`, `get_event/2`, `query/3`, `count/3`, `delete_by_request/2`, `vanish/2`, `purge_expired/1`
- `Parrhesia.Storage.Moderation` – pubkey/event bans, allowlists, blocked IPs
- `Parrhesia.Storage.Groups` – NIP-29/NIP-43 membership + role operations
- `Parrhesia.Storage.Admin` – backing for NIP-86 methods
All domain logic depends only on these behaviors.
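A sketch of the events port as an Elixir behaviour; the callback names come from the list above, while the argument and return types are assumptions:

```elixir
defmodule Parrhesia.Storage.Events do
  @type event :: map()
  @type filter_ast :: term()

  # Write path: idempotent insert, distinguishing duplicates from new rows.
  @callback put_event(event, opts :: keyword()) ::
              {:ok, :stored | :duplicate} | {:error, term()}

  # Read path.
  @callback get_event(id :: binary(), opts :: keyword()) ::
              {:ok, event} | {:error, :not_found}
  @callback query(filter_ast, limit :: pos_integer() | nil, opts :: keyword()) ::
              {:ok, [event]} | {:error, term()}
  @callback count(filter_ast, mode :: :exact | :approximate, opts :: keyword()) ::
              {:ok, non_neg_integer()} | {:error, term()}

  # Deletion paths (NIP-09, NIP-62, NIP-40).
  @callback delete_by_request(event, opts :: keyword()) :: :ok | {:error, term()}
  @callback vanish(pubkey :: binary(), opts :: keyword()) :: :ok | {:error, term()}
  @callback purge_expired(opts :: keyword()) :: {:ok, non_neg_integer()} | {:error, term()}
end
```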
6.2 Postgres adapter notes
Initial adapter: Parrhesia.Storage.Adapters.Postgres with Ecto.
Schema outline:
- `events` (id PK, pubkey, created_at, kind, content, sig, d_tag, deleted_at, expires_at)
- `event_tags` (event_id, name, value, idx)
- moderation tables (banned/allowed pubkeys, banned events, blocked IPs)
- relay/group membership tables
- optional count/HLL helper tables
Indexing strategy:
- `(kind, created_at DESC)`
- `(pubkey, created_at DESC)`
- `(created_at DESC)`
- `(name, value, created_at DESC)` on `event_tags`
- partial/unique indexes for replaceable and addressable semantics
Retention strategy:
- Optional table partitioning by time for hot pruning
- Periodic purge job for expired/deleted tombstoned rows
7) Feature-specific implementation notes
7.1 NIP-11
- Serve on WS URL with `Accept: application/nostr+json`
- Include accurate `supported_nips` and `limitation`
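An illustrative NIP-11 document as a plain map; the field names follow NIP-11, but every value here is a placeholder to be driven by `Parrhesia.Config`:

```elixir
defmodule Parrhesia.RelayInfo do
  @doc "Builds the NIP-11 relay information document (placeholder values)."
  def document do
    %{
      "name" => "parrhesia",
      "description" => "Parrhesia Nostr relay",
      # Must track what is actually enabled, including flagged features.
      "supported_nips" => [1, 9, 11, 13, 17, 40, 42, 45, 50, 62, 70, 77, 86],
      "limitation" => %{
        "max_message_length" => 65_536,
        "max_subscriptions" => 32,
        "auth_required" => false,
        "payment_required" => false
      }
    }
  end
end
```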
7.2 NIP-42 + NIP-70
- Connection-scoped challenge store
- Protected (`["-"]`) events rejected by default unless auth+pubkey match
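The NIP-70 rule is a small predicate: a protected event carries a `["-"]` tag and is accepted only when a NIP-42 authenticated pubkey equals the author. A sketch with assumed names:

```elixir
defmodule Parrhesia.Policy.Protected do
  @doc "True if the event carries the NIP-70 protected marker tag."
  def protected?(%{"tags" => tags}), do: Enum.any?(tags, &match?(["-" | _], &1))

  @doc "Accept unprotected events, or protected ones authored by an authed pubkey."
  def accept?(event, authed_pubkeys) do
    not protected?(event) or event["pubkey"] in authed_pubkeys
  end
end
```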
7.3 NIP-17/59 privacy guardrails
- Relay can enforce recipient-only reads for kind `1059` (AUTH required)
- Query path validates requester access for wrapped DM fetches
7.4 NIP-45 COUNT
- Exact count baseline
- Optional approximate mode and HLL payloads for common queries
7.5 NIP-50 search
- Use Postgres FTS (`tsvector`) with ranking
- Apply `limit` after ranking
7.6 NIP-77 negentropy
- Track per-negentropy-session state in dedicated GenServer
- Use bounded resources + inactivity timeout
7.7 NIP-62 vanish
- Hard-delete all events by pubkey up to `created_at`
- Also delete matching gift wraps where feasible (`#p` target)
- Persist minimal audit record if needed for operations/legal trace
7.8 NIP-EE MLS (feature-flagged)
- Accept/store kind `443` KeyPackage events
- Process kind `445` under configurable retention policy (default short TTL)
- Ensure kind `10051` replaceable semantics
- Keep relay MLS-agnostic cryptographically (no MLS decryption in relay path)
8) Performance model
- Bounded mailbox and queue limits on connections
- ETS-heavy hot path (subscription match, auth/session cache)
- DB writes batched where safe; reads via prepared plans
- Avoid global locks; prefer partitioned workers and sharded ETS tables
- Telemetry-first tuning: p50/p95/p99 for ingest, query, fanout
- Expose Prometheus-compatible `/metrics` endpoint for scraping
Targets (initial):
- p95 EVENT ack < 50ms under nominal load
- p95 REQ initial response start < 120ms on indexed queries
- predictable degradation under overload via rate-limit + backpressure
9) Testing strategy (full suite)
- Unit tests: parser, filter evaluator, policy predicates, NIP validators
- Property tests: filter semantics, replaceable/addressable conflict resolution
- Adapter contract tests: shared behavior tests run against Postgres adapter
- Integration tests: websocket protocol flows (`EVENT`/`REQ`/`CLOSE`/`AUTH`/`COUNT`/`NEG-*`)
- NIP conformance tests: machine-prefix responses, ordering, EOSE behavior
- MLS scenario tests: keypackage/group-event acceptance and policy handling
- Performance tests: soak + burst + large fanout profiles
- Fault-injection tests: DB outage, slow query, connection churn, node restart
10) Implementation principles
- Keep relay event-kind agnostic by default; special-case only where NIPs require
- Prefer explicit feature flags for expensive/experimental modules
- No direct Ecto usage outside Postgres adapter and migration layer
- Every feature lands with tests + telemetry hooks
Implementation task breakdown is tracked in ./PROGRESS.md.