feat: add Kobold dataset foundation
Add fixed Ash resources for datasets, resource definitions, dataset events, and local record projections. Expose a loopback smoke API for public synced datasets and private local-only datasets, with migrations and host-backed tests.
2026-05-28 00:48:57 +02:00


Kobold Plugin MVP PRD
Overview
Kobold is a distributed dataset and lightweight data-application plugin for Tribes OS.
It combines:
* spreadsheet-like usability
* structured CRUD data modeling
* event-sourced history
* fork/merge workflows
* composable views and forms
* lightweight deterministic formulas/macros
* distributed replication between tribes
The goal is not to recreate Excel, Airtable, or a traditional relational database directly.
The goal is to create a shared, replicable, semantically structured data layer for communities.
Kobold datasets should feel:
* approachable like spreadsheets
* structured like databases
* composable like HyperCard
* distributable like Git
* signed and lineage-aware like Nostr
The MVP focuses on:
* typed datasets
* generic CRUD
* distributed replication
* dataset forks
* deterministic formulas
* simple merge workflows
* generic UI rendering
Advanced automation, semantic federation, and collaborative realtime editing are explicitly out of scope for the MVP.
Goals
Primary Goals
1. Allow tribes to create structured datasets.
2. Allow datasets to replicate between tribes.
3. Allow datasets to be forked and modified independently.
4. Allow generic CRUD interaction without custom code.
5. Allow plugins to provide specialized dataset UIs.
6. Support deterministic computed fields and validations.
7. Preserve provenance and lineage.
8. Enable embedding datasets into CMS pages and applications.
Non-Goals
The MVP does NOT attempt to provide:
* CRDT-based collaborative editing
* fully automatic conflict resolution
* arbitrary user scripting with unrestricted capabilities
* arbitrary SQL execution
* distributed transactions
* realtime multi-user spreadsheet editing
* semantic web / RDF systems
* complex workflow automation
* distributed query execution
* cross-node transactional consistency
Conceptual Model
Dataset
A dataset is a replicable collection of typed resources.
Examples:
* seed catalogs
* member directories
* inventories
* local marketplaces
* event collections
* media metadata libraries
* mapping data
* recipe databases
Datasets are:
* versioned
* signed
* forkable
* lineage-aware
* independently replicable
A dataset is the primary unit of:
* replication
* permissions
* discovery
* forking
* merging
* plugin compatibility
Resource
A resource defines a typed entity within a dataset.
Examples:
* SeedVariety
* Supplier
* Member
* Event
* Product
Resources contain:
* fields
* relationships
* computed fields
* validations
* forms
* views
Record
A record is a concrete resource instance.
Example:
resource: SeedVariety
id: 01HX...
fields:
  species: Solanum lycopersicum
  variety_name: Black Krim
  germination_days: 8
Records are event-sourced and versioned.
Relationship
Relationships connect records.
Examples:
* SeedVariety -> Supplier
* Event -> Venue
* Product -> Category
Relationships may reference records in:
* the same dataset
* external datasets
* forks
Dataset Fork
A fork creates an independently writable lineage from an existing dataset.
Forks preserve:
* origin dataset reference
* record lineage
* version ancestry
Forks may later:
* diverge permanently
* partially merge
* submit merge proposals upstream
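A minimal sketch of how a fork might carry its lineage, assuming a manifest object. The fields `head_event_id` and `forked_from`, and the function `forkDataset`, are hypothetical names introduced for illustration; only `id`, `name`, and `owner_pubkey` appear in this document.

```javascript
// Hypothetical manifest shape: `head_event_id` and `forked_from` are
// assumptions made for this sketch, not part of the PRD.
function forkDataset(origin, { newId, ownerPubkey }) {
  return Object.freeze({
    ...origin,
    id: newId,
    owner_pubkey: ownerPubkey,
    // Lineage pointer back to the origin dataset and the version forked from.
    forked_from: Object.freeze({
      dataset_id: origin.id,
      head_event_id: origin.head_event_id,
    }),
  });
}

const upstream = {
  id: "dataset-a",
  name: "Community Seed Catalog",
  owner_pubkey: "npub-upstream",
  head_event_id: "evt-42",
};

const fork = forkDataset(upstream, { newId: "dataset-b", ownerPubkey: "npub-fork" });
console.log(fork.forked_from.dataset_id); // dataset-a
```

The origin manifest is left untouched, so the fork can diverge while still naming the exact point it branched from.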
MVP Architecture
First-Cut Architecture Decision
The first implementation should use fixed Ash resources as the durable plugin substrate and represent Kobold resources dynamically as dataset schema data.
In other words, a user-defined resource such as `SeedVariety` is not an Elixir module, an Ash resource module, or a dedicated Postgres table in the MVP. It is a `ResourceDefinition` inside a dataset, and its records are created by appending dataset events and updating a local projection.
This keeps the first cut focused on the core distributed artifact model:
* stable synced storage primitives
* append-only provenance
* generic schema-driven CRUD
* portable event envelopes
* tribe-to-tribe replication
* fork and merge lineage
Runtime-generated Ash resources, per-resource database tables, and runtime schema migrations are out of scope for the MVP unless a later optimization explicitly requires them.
Core Components
1. Dataset Registry
Stores:
* dataset metadata
* ownership
* replication configuration
* schema references
* plugin compatibility
* lineage metadata
2. Resource Engine
Provides:
* typed resource definitions
* CRUD operations
* validations
* relationships
* query abstraction
The MVP resource engine interprets schema metadata stored in `ResourceDefinition` records. It may use Ash for the fixed persistence resources, but it should not generate runtime Ash modules for user-defined dataset resources.
3. Event Store
Stores append-only dataset changes.
Examples:
* record created
* record updated
* relationship added
* schema migrated
* merge accepted
The event log is the source of provenance and synchronization. Current record state is a projection of the event log, not the authoritative source of truth.
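The projection idea can be sketched as a fold over the event log. The event type strings, the `record_deleted` event, and the field names `record_id` and `fields` are assumptions for illustration; the real `DatasetEvent` shape may differ.

```javascript
// Rebuild current record state from an ordered event log.
// Event types and field names are illustrative assumptions.
function projectRecords(events) {
  const records = new Map();
  for (const event of events) {
    switch (event.type) {
      case "record_created":
        records.set(event.record_id, { ...event.fields });
        break;
      case "record_updated": {
        const current = records.get(event.record_id);
        // Ignore updates for records this projection has never seen.
        if (current) records.set(event.record_id, { ...current, ...event.fields });
        break;
      }
      case "record_deleted":
        records.delete(event.record_id);
        break;
      default:
        // Unknown event types are ignored in this sketch.
        break;
    }
  }
  return records;
}

const log = [
  { type: "record_created", record_id: "r1", fields: { variety_name: "Black Krim", germination_days: 8 } },
  { type: "record_updated", record_id: "r1", fields: { germination_days: 7 } },
  { type: "record_created", record_id: "r2", fields: { variety_name: "Brandywine" } },
  { type: "record_deleted", record_id: "r2" },
];

const projection = projectRecords(log);
console.log(projection.get("r1")); // { variety_name: 'Black Krim', germination_days: 7 }
```

Because the function depends only on its input, replaying the same log on another node yields the same projection, which is what makes the projection safe to discard and rebuild.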
4. Replication Engine
Handles:
* dataset synchronization
* event distribution
* snapshot syncing
* lineage tracking
* version checkpoints
Replication is mediated through a small tribe-to-tribe interop layer built around signed dataset manifests, schema bundles, event envelopes, snapshots, and merge proposals. Nostr is the initial transport, but the envelope format should remain transport-neutral.
5. Formula Runtime
Executes deterministic formulas/macros.
Initial runtime target:
* JavaScript via QuickJS / QuickBEAM
Requirements:
* deterministic execution
* no network access
* no filesystem access
* no unrestricted host calls
* memory/time limits
* immutable inputs
6. Generic CRUD UI
Provides:
* table views
* detail views
* forms
* filtering
* sorting
* relationship navigation
This acts as the default interface for any compatible dataset.
7. Plugin Compatibility Layer
Allows plugins to:
* declare supported dataset types
* register specialized views/editors
* embed dataset blocks into CMS pages
Static MVP Storage Model
The first cut should use a small set of fixed plugin-owned resources:
* `Dataset` — dataset metadata, ownership, visibility, replication mode, and lineage pointers
* `ResourceDefinition` — schema-as-data for each dynamic resource in a dataset
* `DatasetEvent` — append-only signed change events and provenance metadata
* `RecordProjection` — local queryable current state derived from events
* `MergeProposal` — human-reviewed fork merge requests and decisions
`Dataset`, `ResourceDefinition`, `DatasetEvent`, and `MergeProposal` are the initial candidates for synced Ash resources. `RecordProjection` can be local-only and rebuildable at first, because it is derived from the event log.
Dataset Metadata
Example:
id: 018f2f2f...
name: Community Seed Catalog
owner_pubkey: npub1...
type: org.tribes.scv.seed-catalog
schema: scv.seed_catalog
schema_version: 1.0.0
visibility: public
replication:
  mode: public-read
preferred_plugins:
  - org.tribes.scv
  - org.tribes.kobold
Resource Definition Example
resource: SeedVariety
fields:
  species:
    type: string
    required: true
  variety_name:
    type: string
    required: true
  germination_days:
    type: integer
computed_fields:
  display_name:
    formula: |
      `${record.species} · ${record.variety_name}`
relationships:
  suppliers:
    type: many
    target: Supplier
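Since a resource definition is plain data, generic validation can interpret it directly. The sketch below mirrors the `SeedVariety` fields above; `validateRecord`, the `typeChecks` table, and the error wording are assumptions, and only the `string` and `integer` types from the example are covered.

```javascript
// Resource definition expressed as data, mirroring the SeedVariety example.
const seedVariety = {
  resource: "SeedVariety",
  fields: {
    species: { type: "string", required: true },
    variety_name: { type: "string", required: true },
    germination_days: { type: "integer" },
  },
};

// Hypothetical type table; only the types used in the example are listed.
const typeChecks = {
  string: (v) => typeof v === "string",
  integer: (v) => Number.isInteger(v),
};

const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Returns a list of human-readable errors; an empty list means valid.
function validateRecord(definition, fields) {
  const errors = [];
  for (const [name, spec] of Object.entries(definition.fields)) {
    const value = fields[name];
    if (value === undefined || value === null) {
      if (spec.required) errors.push(`${name} is required`);
      continue;
    }
    const check = typeChecks[spec.type];
    if (!check || !check(value)) errors.push(`${name} must be of type ${spec.type}`);
  }
  for (const name of Object.keys(fields)) {
    if (!hasOwn(definition.fields, name)) {
      errors.push(`${name} is not defined on ${definition.resource}`);
    }
  }
  return errors;
}

console.log(validateRecord(seedVariety, { species: "Solanum lycopersicum", germination_days: "8" }));
// [ 'variety_name is required', 'germination_days must be of type integer' ]
```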
Formula System
Goals
Provide lightweight spreadsheet-like behavior without exposing unrestricted scripting.
Supported Use Cases
* computed fields
* validations
* formatting
* derived values
* lightweight transforms
Explicitly Unsupported in MVP
* arbitrary IO
* external HTTP requests
* unrestricted DB access
* filesystem access
* timers/background jobs
* arbitrary host execution
Example
export function compute(record) {
  if (record.total === 0) return null
  return record.sprouted / record.total
}
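The "immutable inputs" requirement could be approximated on the host side by handing each formula a frozen copy of the record. This sketch covers only that one requirement; it is not a sandbox and does not stand in for the QuickJS isolation, memory limits, or time limits. `deepFreeze` and `evaluateFormula` are hypothetical helpers, and the record is assumed to be JSON-shaped.

```javascript
// Recursively freeze a JSON-shaped value so a formula cannot mutate it.
function deepFreeze(value) {
  if (value !== null && typeof value === "object") {
    Object.values(value).forEach(deepFreeze);
    Object.freeze(value);
  }
  return value;
}

// Same formula as the example above, without the module export.
function compute(record) {
  if (record.total === 0) return null;
  return record.sprouted / record.total;
}

// Hypothetical host-side wrapper: the formula sees a frozen copy, and any
// thrown error is reported as data instead of escaping to the caller.
function evaluateFormula(formula, record) {
  const input = deepFreeze(JSON.parse(JSON.stringify(record)));
  try {
    return { ok: true, value: formula(input) };
  } catch (error) {
    return { ok: false, error: error.message };
  }
}

console.log(evaluateFormula(compute, { sprouted: 18, total: 24 })); // { ok: true, value: 0.75 }
console.log(evaluateFormula(compute, { sprouted: 0, total: 0 })); // { ok: true, value: null }
```

A formula that tries to write to its input in strict mode throws a `TypeError`, which the wrapper reports as `ok: false`.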
Cross-Dataset References
Datasets may reference records from external datasets.
Example:
supplier_ref:
  dataset_id: 018f...
  resource: Supplier
  record_id: 01AB...
Reference modes:
* strong
* soft
* pinned-version
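This document names the three reference modes without defining them. The sketch below shows one possible reading, purely as an assumption: a strong reference fails when its target is missing, a soft reference resolves to `null`, and a pinned-version reference requires the target at an exact version. The `mode` and `version` fields and the `lookup` callback are hypothetical.

```javascript
// Assumed semantics (not defined by this document):
//   strong         - a missing target is an error
//   soft           - a missing target resolves to null
//   pinned-version - the target must exist at exactly `ref.version`
function resolveReference(ref, lookup) {
  const target = lookup(ref.dataset_id, ref.resource, ref.record_id);
  switch (ref.mode) {
    case "soft":
      return target ?? null;
    case "strong":
      if (!target) throw new Error(`unresolved strong reference ${ref.record_id}`);
      return target;
    case "pinned-version":
      if (!target || target.version !== ref.version) {
        throw new Error(`record ${ref.record_id} not available at version ${ref.version}`);
      }
      return target;
    default:
      throw new Error(`unknown reference mode ${ref.mode}`);
  }
}

// Hypothetical local store standing in for a replicated external dataset.
const suppliers = new Map([["rec-1", { id: "rec-1", name: "Example Supplier", version: 3 }]]);
const lookup = (_datasetId, _resource, recordId) => suppliers.get(recordId);

const base = { dataset_id: "ds-1", resource: "Supplier" };
console.log(resolveReference({ ...base, record_id: "rec-1", mode: "strong" }, lookup).name); // Example Supplier
console.log(resolveReference({ ...base, record_id: "rec-9", mode: "soft" }, lookup)); // null
```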
Replication Model
Tribe-to-Tribe Interop Layer
The MVP interop layer should be protocol-shaped and independent from UI code. It moves signed dataset artifacts between tribes and applies verified events to local projections.
Initial envelope types:
* `DatasetManifest`
* `SchemaBundle`
* `EventEnvelope`
* `Snapshot`
* `MergeProposal`
Initial transport adapter:
* Nostr-backed publish/subscribe and snapshot bootstrap
The interop layer is responsible for signature/hash verification, lineage checks, idempotent event application, and projection rebuild triggers. It should not know about LiveView pages or specialized dataset renderers.
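Idempotent application can be sketched as "verify, then skip anything already applied". The envelope fields `id` and `event`, the state shape, and the injected `verify` callback are assumptions; signature and hash checking is deliberately left to the caller, and lineage checks are omitted.

```javascript
// Apply one envelope to local state without mutating the previous state.
// `verify` is injected so the sketch stays independent of any signature scheme.
function applyEnvelope(state, envelope, verify) {
  if (!verify(envelope)) return { state, status: "rejected" };
  if (state.appliedIds.has(envelope.id)) return { state, status: "duplicate" };
  return {
    state: {
      appliedIds: new Set(state.appliedIds).add(envelope.id),
      events: [...state.events, envelope.event],
    },
    status: "applied",
  };
}

const emptyState = { appliedIds: new Set(), events: [] };
const envelope = { id: "env-1", event: { type: "record_created", record_id: "r1" } };
const alwaysValid = () => true; // placeholder for real verification

const first = applyEnvelope(emptyState, envelope, alwaysValid);
const second = applyEnvelope(first.state, envelope, alwaysValid);
console.log(first.status, second.status, second.state.events.length); // applied duplicate 1
```

Receiving the same envelope twice, for example once from a relay and once from a snapshot, leaves the event list unchanged.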
Initial Replication Strategy
The MVP uses:
* append-only event synchronization
* snapshot bootstrap
* last-write-wins conflict handling where appropriate
* explicit merge proposals for divergent forks
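A minimal sketch of timestamp-based conflict handling (commonly called last-write-wins), assuming each competing update carries a `created_at` timestamp and a unique `event_id`. Both field names and the tie-break rule are assumptions; the document does not specify them.

```javascript
// Pick the winning update for one field.
// Assumes `created_at` is a number and `event_id` is a unique string.
function pickWinner(a, b) {
  if (a.created_at !== b.created_at) return a.created_at > b.created_at ? a : b;
  // Tie-break on event id so every replica picks the same winner
  // regardless of the order in which it received the updates.
  return a.event_id > b.event_id ? a : b;
}

const fromTribeA = { event_id: "evt-a", created_at: 1700000000, value: 8 };
const fromTribeB = { event_id: "evt-b", created_at: 1700000050, value: 7 };

console.log(pickWinner(fromTribeA, fromTribeB).value); // 7
console.log(pickWinner(fromTribeB, fromTribeA).value); // 7
```

The deterministic tie-break matters more than the clock comparison: without it, two replicas that see equal timestamps in different orders could settle on different values.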
Initial Trust Model
Datasets are signed and associated with:
* owner pubkeys
* tribe identities
* optional trust metadata
Merge Workflow
MVP merge flow:
1. Dataset fork created.
2. Fork diverges independently.
3. Fork proposes merge upstream.
4. Upstream reviews changes.
5. Upstream accepts/rejects merge.
MVP merge resolution is intentionally human-centered.
No automatic semantic merge engine is required initially.
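The human-centered flow can be sketched as two small steps: compute the events the fork has that upstream lacks, then record a reviewer's decision. The proposal shape, the status strings, and both function names are assumptions for illustration.

```javascript
// Events present in the fork but not yet known upstream, in fork order.
function eventDelta(upstreamEvents, forkEvents) {
  const known = new Set(upstreamEvents.map((e) => e.id));
  return forkEvents.filter((e) => !known.has(e.id));
}

// Record a human decision; a proposal can only be decided once.
function reviewProposal(proposal, decision) {
  if (proposal.status !== "open") throw new Error("proposal already decided");
  if (decision !== "accepted" && decision !== "rejected") {
    throw new Error(`invalid decision ${decision}`);
  }
  return { ...proposal, status: decision };
}

const upstreamLog = [{ id: "e1" }, { id: "e2" }];
const forkLog = [{ id: "e1" }, { id: "e2" }, { id: "e3" }, { id: "e4" }];

const proposal = { status: "open", delta: eventDelta(upstreamLog, forkLog) };
const decided = reviewProposal(proposal, "accepted");
console.log(decided.delta.map((e) => e.id), decided.status); // [ 'e3', 'e4' ] accepted
```

Nothing here attempts to reconcile conflicting edits; the reviewer sees the delta and makes the call.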
Generic UI
Table View
Provides:
* sortable columns
* filtering
* pagination
* inline editing
* relationship links
Record View
Provides:
* structured field rendering
* relationship navigation
* change history
* provenance display
Form View
Provides:
* generated forms from schema
* validation support
* formula-assisted fields
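Form generation from schema could look roughly like this: each declared field becomes an input descriptor, and computed fields are surfaced read-only. The descriptor shape, the `INPUT_TYPES` mapping, and `formFields` are assumptions, not an existing API.

```javascript
// Hypothetical mapping from schema types to form input kinds.
const INPUT_TYPES = { string: "text", integer: "number" };

// Turn a resource definition into a flat list of form field descriptors.
function formFields(definition) {
  const editable = Object.entries(definition.fields ?? {}).map(([name, spec]) => ({
    name,
    input: INPUT_TYPES[spec.type] ?? "text",
    required: spec.required === true,
    readOnly: false,
  }));
  // Computed fields are shown but never edited directly.
  const computed = Object.keys(definition.computed_fields ?? {}).map((name) => ({
    name,
    input: "text",
    required: false,
    readOnly: true,
  }));
  return [...editable, ...computed];
}

const definition = {
  resource: "SeedVariety",
  fields: {
    species: { type: "string", required: true },
    variety_name: { type: "string", required: true },
    germination_days: { type: "integer" },
  },
  computed_fields: { display_name: { formula: "..." } },
};

const descriptors = formFields(definition);
console.log(descriptors.map((f) => `${f.name}:${f.input}`).join(", "));
// species:text, variety_name:text, germination_days:number, display_name:text
```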
CMS Integration
Datasets should integrate with CMS pages.
Example blocks:
* dataset table block
* record detail block
* query block
* generated forms
* charts/statistics
Example:
block:
  type: dataset_table
  dataset: community_seed_catalog
  resource: SeedVariety
  filter:
    species: Solanum lycopersicum
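One way a block like this might select its rows, assuming the `filter` keys are exact-match conditions on record fields and that projected records carry `resource` and `fields`. The function `blockRows` and the record shape are hypothetical.

```javascript
// Select the projected records a dataset_table block should display.
// Assumes every filter entry is an exact-match condition.
function blockRows(block, records) {
  const conditions = Object.entries(block.filter ?? {});
  return records
    .filter((record) => record.resource === block.resource)
    .filter((record) => conditions.every(([field, expected]) => record.fields[field] === expected));
}

const block = {
  type: "dataset_table",
  dataset: "community_seed_catalog",
  resource: "SeedVariety",
  filter: { species: "Solanum lycopersicum" },
};

const projected = [
  { resource: "SeedVariety", fields: { species: "Solanum lycopersicum", variety_name: "Black Krim" } },
  { resource: "SeedVariety", fields: { species: "Capsicum annuum", variety_name: "Jimmy Nardello" } },
  { resource: "Supplier", fields: { name: "Example Supplier" } },
];

const rows = blockRows(block, projected);
console.log(rows.map((r) => r.fields.variety_name)); // [ 'Black Krim' ]
```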
Permissions
Initial permission levels:
* public read
* tribe read
* contributor write
* maintainer write
* owner/admin
Permissions apply at:
* dataset level
* resource level
* potentially record level later
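A sketch of how dataset-level and resource-level permissions might combine, assuming the five levels form a strict order and that a resource-level rule, when present, overrides the dataset default. The level identifiers, the policy shape, and `isAllowed` are assumptions; the document does not define how the levels relate.

```javascript
// Assumed ordering, from least to most privileged.
const LEVELS = ["public_read", "tribe_read", "contributor_write", "maintainer_write", "owner_admin"];

// Resource-level rules override dataset-level rules when present.
function isAllowed(policy, actorLevel, resource, action) {
  const rules = policy.resources?.[resource] ?? policy.dataset;
  const actorRank = LEVELS.indexOf(actorLevel);
  const requiredRank = LEVELS.indexOf(rules[action]);
  // Unknown levels or actions are denied.
  if (actorRank < 0 || requiredRank < 0) return false;
  return actorRank >= requiredRank;
}

const policy = {
  dataset: { read: "public_read", write: "contributor_write" },
  resources: {
    Supplier: { read: "tribe_read", write: "maintainer_write" },
  },
};

console.log(isAllowed(policy, "contributor_write", "SeedVariety", "write")); // true
console.log(isAllowed(policy, "contributor_write", "Supplier", "write")); // false
```

Denying by default keeps a typo in a policy from silently granting access.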
Technical MVP Stack
Backend
* Elixir
* Phoenix
* LiveView
* Ash Framework
* PostgreSQL
* Oban
Formula Runtime
* QuickJS via QuickBEAM
Replication
* Nostr-based event distribution
* local SQL projections
Frontend
* LiveView-first
* plugin-rendered blocks/components
Out of Scope for MVP
The following are intentionally postponed:
* realtime collaborative editing
* CRDT synchronization
* offline-first conflict-free editing
* distributed transactions
* semantic query federation
* arbitrary plugin scripting APIs
* GraphQL federation
* AI-assisted merge resolution
* automatic ontology mapping
* advanced analytics engines
* spreadsheet compatibility import/export beyond basic CSV
Future Directions
Potential future features:
* dataset marketplaces
* trust/reputation overlays
* semantic schema mappings
* advanced merge tooling
* WASM runtimes
* visual formula builders
* AI-assisted data cleanup
* cross-dataset graph traversal
* deterministic workflow engines
* local-first mobile replicas
* dataset package registries
* public lineage explorers
First Implementation Milestones
1. Local generic dataset CRUD
* create a dataset
* define one dynamic resource with typed fields
* create, update, delete, list, and view records through generic UI
* append a `DatasetEvent` for every mutation
* maintain a local `RecordProjection`
First-cut status: implemented as plugin-owned Ash resources and a loopback JSON smoke API. Public mutations publish through `AshNostrSync`; private mutations are local-only. The LiveView remains a placeholder until the schema-driven generic UI is added.
2. Manual event portability
* export a dataset manifest, schema bundle, and event stream
* import them into another tribe
* rebuild the same local projection on the receiving tribe
3. Nostr-backed replication
* publish and subscribe to dataset events
* bootstrap from a snapshot when needed
* verify provenance before applying events
4. Fork and merge proposal flow
* fork a dataset lineage
* edit the fork independently
* propose a merge upstream
* review and accept or reject the event delta manually
MVP Success Criteria
The MVP is considered successful if:
1. A tribe can create a typed dataset.
2. Another tribe can replicate it.
3. Another tribe can fork it.
4. Generic CRUD works without custom code.
5. Computed fields and validations work safely.
6. Dataset views can be embedded into CMS pages.
7. Merge proposals can be reviewed manually.
8. Provenance and lineage remain inspectable.
9. Plugins can provide specialized UIs.
10. The system feels approachable to non-programmers.
Core Philosophy
Kobold is not merely a database.
It is a distributed, composable, lineage-aware data artifact system for communities.
The design should prioritize:
* transparency
* inspectability
* composability
* determinism
* portability
* human governance
* graceful forking
* approachable tooling
The system should feel closer to:
* shared knowledge artifacts
* living data books
* community-maintained repositories
than to traditional enterprise CRUD software.