data versioning + per-record universes + cross-instance schema-version gating #458
Merged
…-version gating
Renames data.sample/ → data.reference/ to reflect its actual role: the
canonical reference / seed source consumed by setup-data.js and by
migrations that look up shipped versions of files.
Adds server/lib/collectionStore.js — a per-type, per-record JSON storage
helper with explicit type-level schemaVersion stamping. Provides loadOne /
saveOne / listIds / loadAll / deleteOne / loadTypeIndex / saveTypeIndex /
verifySchemaVersion. Per-id write queue means writes to different records
run in parallel. server/index.js gains a boot-time verifyCollectionVersions
call after runMigrations so a missed migration produces a loud single-line
log per collection.
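A per-id write queue of the kind described above can be sketched as one promise chain per record id. This is a minimal illustration, not the actual collectionStore code: `createPerIdQueue` and `enqueue` are hypothetical names, and the real helper presumably also performs the atomic file write inside each task.

```javascript
// Hypothetical sketch: serialize tasks per record id, run different ids in parallel.
function createPerIdQueue() {
  const tails = new Map(); // id -> promise that settles when the last queued task is done

  return function enqueue(id, task) {
    const previous = tails.get(id) ?? Promise.resolve();
    // Start this task only after the previous task for the same id has settled.
    const run = previous.then(() => task());
    // The stored tail never rejects, so one failed write does not wedge the queue.
    const tail = run.catch(() => {}).finally(() => {
      // Drop the entry once the queue for this id is idle to avoid unbounded growth.
      if (tails.get(id) === tail) tails.delete(id);
    });
    tails.set(id, tail);
    return run; // the caller still sees the task's own result or error
  };
}

// Usage: the two writes to "u1" run in order; the write to "u2" does not wait on them.
const enqueueWrite = createPerIdQueue();
enqueueWrite("u1", async () => "first write to u1");
enqueueWrite("u1", async () => "second write to u1");
enqueueWrite("u2", async () => "independent write to u2");
```

Keying the queue by id is what lets an edit to one universe proceed while another universe's write is still in flight.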
Migration 034 splits the monolithic data/universe-builder.json (1.4MB / ~30
universes, previously rewritten in full on every mutation) into
data/universes/{id}/index.json with a type-level data/universes/index.json
carrying { schemaVersion: 5, type, updatedAt, config: { runs } }. The legacy
file is renamed to .bak-034 (not deleted) as a recovery path. The migration
is idempotent and handles partial-completion recovery. universeBuilder.js is
rewritten to consume collectionStore; per-record edits no longer serialize
against unrelated universe edits.
Introduces a schema-version contract for cross-instance sync via
server/lib/schemaVersions.js: PORTOS_SCHEMA_VERSIONS = { universes: 5 } is
the per-category storage-layout contract. Every outbound payload (federated
peer push, 60s snapshot sync, share-bucket manifest) carries a portosMeta
{ portosVersion, schemaVersions } envelope. Receivers run
compareSchemaVersions(sender, local); when the sender is ahead on any
category the receiver rejects:
- federated peer push returns 409 with structured details
- snapshot sync surfaces blockedBySchema on the peer record (schemaGaps)
- share-bucket import skips the manifest with reason 'portos-schema-ahead'
and emits a sharing:portos-schema-ahead socket event
Senders persist the gap (sub.blockedBySchema for per-record subs;
peer.schemaGaps for snapshot direction) and pause retries with a 5-minute
cooldown; peer:online re-probes by bypassing the cooldown. Legacy senders
without portosMeta pass through unchanged. The Instances page renders a
new SchemaGapBadge per peer card explaining the gap in both directions
('Peer X is on an older PortOS — they need to update before we can sync
universes' / 'Peer X is on a newer PortOS — update PortOS to receive their
universe updates'). Tombstone deletes are exempt from the gate so
soft-deletes always converge even across version mismatches.
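The receiver-side gate described above could look roughly like the following. Only the names `PORTOS_SCHEMA_VERSIONS`, `portosMeta`, and `compareSchemaVersions` come from this description; the return shapes, the `gateIncoming` wrapper, the `record.deleted` tombstone flag, and the choice to ignore categories the receiver does not know are assumptions made for illustration.

```javascript
// Per-category storage-layout contract (value taken from the description above).
const PORTOS_SCHEMA_VERSIONS = { universes: 5 };

// Hypothetical return shape: list every category where the sender is ahead.
function compareSchemaVersions(sender, local) {
  const ahead = [];
  for (const [category, senderVersion] of Object.entries(sender ?? {})) {
    const localVersion = local[category];
    // Assumption: categories unknown to the receiver are ignored, not rejected.
    if (typeof localVersion !== "number") continue;
    if (typeof senderVersion === "number" && senderVersion > localVersion) {
      ahead.push({ category, sender: senderVersion, local: localVersion });
    }
  }
  return { ok: ahead.length === 0, ahead };
}

// Hypothetical wrapper showing where the exemptions sit relative to the check.
function gateIncoming(payload, local = PORTOS_SCHEMA_VERSIONS) {
  // Legacy senders without portosMeta pass through unchanged.
  if (!payload.portosMeta) return { accepted: true };
  // Tombstone deletes are exempt so soft-deletes converge (flag name is assumed).
  if (payload.record?.deleted === true) return { accepted: true };
  const result = compareSchemaVersions(payload.portosMeta.schemaVersions, local);
  return result.ok
    ? { accepted: true }
    : { accepted: false, status: 409, details: result.ahead };
}
```

A sender on `{ universes: 6 }` pushing a live record to this receiver would be rejected with the structured gap, while the same sender pushing a tombstone would be accepted.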
Code-review-driven hardening before merge:
- deleteUniverse and mergeUniversesFromSync now do their runs[] cleanup
fully inside queueTypeIndexWrite (load + filter + atomicWrite) instead
of an outer-load → queued saveTypeIndex pattern that races concurrent
recordRun
- pruneTombstonedUniverses re-checks tombstone status inside each per-id
queue slot so a concurrent peer-sync un-delete can't be silently clobbered
- updatePeer accepts the new schemaGaps field; syncOrchestrator passes
peer.id (not peer.instanceId) so the write actually lands
- portosMetaSchema uses .passthrough() so future PortOS versions adding
meta fields don't get blanket-400'd before the schema-version gate runs
- POST /api/sync/:category/apply validates its body with Zod and forwards
portosMeta so external callers can't bypass the gate
- share-bucket import marks portos-schema-ahead manifests as processed
to prevent every chokidar fan-out from re-emitting the socket event
- migration 034's pre-flight uses readdir({withFileTypes: true}) +
isDirectory() so stray non-directory entries don't ENOTDIR-crash the
migration mid-loop
- collectionStore.listIds uses lstat so symlinks aren't accepted as
record dirs (closes the foot-gun where a future deleteOne would rm -rf
through the link)
- syncOrchestrator imports updatePeer statically (was an unnecessary
dynamic import — no circular dep)
- schema gate in applyIncomingPush runs AFTER identity + record-shape
validation so malformed callers don't receive the version-fingerprint
disclosure in the 409 body
- SchemaGapBadge normalizes the push-direction category to match the
snapshot direction's vocabulary so the same gap doesn't render twice
with inconsistent labels
- Removes dead STATIC_PORTOS_VERSION constant
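The `lstat` point in the list above is easy to get wrong, so here is a small sketch of the idea. It is not the real `collectionStore.listIds`: the function name `listRecordIds` and the sorted return value are assumptions, and CommonJS `require` is used only to keep the snippet standalone.

```javascript
const fs = require("node:fs/promises");
const path = require("node:path");

// Hypothetical sketch: return the names of real subdirectories under typeDir.
async function listRecordIds(typeDir) {
  const names = await fs.readdir(typeDir);
  const ids = [];
  for (const name of names) {
    // lstat does not follow symlinks: a link pointing at a directory reports
    // isSymbolicLink(), not isDirectory(), so it is never treated as a record.
    const info = await fs.lstat(path.join(typeDir, name));
    if (info.isDirectory()) ids.push(name);
  }
  return ids.sort();
}
```

With `stat` instead of `lstat`, a symlinked entry would be listed as a record id, and a later recursive delete of that "record" could remove files outside the data directory.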
…fresh
- peerSync.js: receiver includes its own PortOS version in 409 details (`receiverPortosVersion`) so the sender's SchemaGapBadge labels the rejecting peer correctly instead of round-tripping the sender's version
- peerSync.js: sender retries push without `portosMeta` envelope when the peer's strict push schema 400-rejects the unknown field (pre-version-gate peers during a federation rollout)
- Instances.jsx: PeerCard subscribes to peerSync:subscription-blocked / unblocked events to refresh its peerSubs so SchemaGapBadge picks up the new blockedBySchema field without a page reload
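The sender-side fallback for pre-version-gate peers can be sketched as a single retry with the envelope stripped. `pushWithFallback` and the injected `send` function are hypothetical, and treating any 400 as "unknown field" is a simplifying assumption; the real code may inspect the error body before retrying.

```javascript
// Hypothetical sketch: retry once without portosMeta when a strict peer answers 400.
async function pushWithFallback(send, payload) {
  const first = await send(payload);
  // Only a 400 on a payload that actually carried the envelope triggers the retry;
  // a 409 from the schema gate is returned to the caller untouched.
  if (first.status !== 400 || !("portosMeta" in payload)) return first;
  const { portosMeta, ...legacyPayload } = payload;
  return send(legacyPayload);
}
```

Because the retry drops `portosMeta` entirely, the old peer sees the same payload it accepted before the rollout.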
Five items surfaced by gemini reviewing files outside the data-versioning PR (cos.js, cosRunnerClient.js, cosAgents.js, subAgentSpawner.js, taskParser.js). Logged under "Trigger-gated / Blocked on decision" so they can be picked up in scope-appropriate PRs.
Pull request overview
This PR introduces per-type/per-record data versioning (universes first), renames the seed data directory from data.sample/ to data.reference/, and adds cross-instance schema-version gating across all sync transports (peer push, snapshot sync, share-bucket).
Changes:
- Rename seed/reference data folder and update all code/tests/docs to use data.reference/.
- Add schema-version envelopes + compatibility gating to peer push, snapshot sync, and share-bucket imports/exports, with UI surfacing via SchemaGapBadge.
- Refactor universe storage to a per-record layout (migration + service updates), plus boot-time collection schema verification.
Reviewed changes
Copilot reviewed 82 out of 169 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| server/services/universeCanon.test.js | Switch tests from in-memory file store mocks to tmpdir-backed real FS behavior (collectionStore compatible). |
| server/services/universeBuilderPromote.test.js | Update promote tests to tmpdir FS fixtures; tighten atomic-write assertion via spy filtering. |
| server/services/syncOrchestrator.js | Forward portosMeta into snapshot apply; persist/clear per-peer schema gaps via updatePeer. |
| server/services/sharing/manifest.js | Stamp share manifests with portosSchemaVersions for share-bucket schema gating. |
| server/services/sharing/integration.test.js | Add integration coverage for share-bucket schema-ahead gating behavior. |
| server/services/sharing/index.js | Emit new socket events for schema-ahead share imports and peer-sync subscription block/unblock. |
| server/services/sharing/importer.js | Enforce PortOS schema gating during manifest import and emit UI-facing events on block. |
| server/services/sharing/exporter.js | Stamp exported manifests with local PORTOS_SCHEMA_VERSIONS. |
| server/services/sharing/annotationsSync.js | Stamp annotation manifests with PORTOS_SCHEMA_VERSIONS. |
| server/services/pipeline/textStages.js | Update prompt-template path references to data.reference/. |
| server/services/instances.js | Extend updatePeer() allowlist to persist schemaGaps. |
| server/services/brain.js | Update seeded providers path reference to data.reference/. |
| server/services/autonomousJobs.syncSkillTemplates.test.js | Update test fixture paths to data.reference/ for skill template seeding. |
| server/services/autonomousJobs.js | Update skill-template seed path and comments to data.reference/. |
| server/routes/peerSync.test.js | Add route mapping test for schema-ahead service error → HTTP 409 with details in body. |
| server/routes/peerSync.js | Map schema-ahead errors to 409 and preserve service details in the error response context. |
| server/routes/dataSync.js | Add Zod validation for apply payload and forward portosMeta to applyRemote. |
| server/lib/validation.js | Add portosMeta Zod schema for peer-sync push payloads (.passthrough() for forward-compat). |
| server/lib/README.md | Document new collectionStore.js and schemaVersions.js modules. |
| server/lib/mediaModels.test.js | Update seed registry path + test naming to data.reference/. |
| server/lib/index.js | Export new collectionStore.js and schemaVersions.js from the lib barrel. |
| server/lib/fileUtils.test.js | Update template seed path in tests to data.reference/. |
| server/lib/fileUtils.js | Update comments to reference data.reference/ as the shipped template source. |
| server/lib/creativeDirectorPrompts.test.js | Update fallback seed directory to data.reference/. |
| server/lib/comicScriptParser.js | Update template path references to data.reference/. |
| server/index.js | Rename sample dir constant to reference dir; verify collection schema versions after migrations at boot. |
| scripts/setup-data.js | Rename sampleDir → referenceDir and update copy/merge/drift detection to data.reference/. |
| scripts/migrations/032-claude-default-opus-4-7.test.js | Update test wording to data.reference. |
| scripts/migrations/032-claude-default-opus-4-7.js | Update migration comments/logs to data.reference/. |
| scripts/migrations/027-text-stage-prompts-entities-summary.test.js | Update drift-catch fixture reads to data.reference/. |
| scripts/migrations/027-text-stage-prompts-entities-summary.js | Update manual-diff hint paths to data.reference/. |
| scripts/migrations/025-idea-stage-character-detail.js | Update comments and manual-diff hint paths to data.reference/. |
| scripts/migrations/023-resolve-prompt-episode-anchor.js | Update manual-diff hint paths to data.reference/. |
| scripts/migrations/022-rename-bible-setting-to-place.js | Update sampleRel paths and logs to data.reference/. |
| scripts/migrations/022-character-extended-fields.js | Update seed file paths + warnings to data.reference/. |
| scripts/migrations/020-importer-screenplay-user-requested-count.js | Update warnings/seed path references to data.reference/. |
| scripts/migrations/020-comic-cover-concepts-stage.js | Update seed/config path references and warnings to data.reference/. |
| scripts/migrations/019-arc-volume-prompts-canon-context.js | Update manual-diff hint paths to data.reference/. |
| scripts/migrations/018-rename-writers-room-settings-stage.js | Update seed/config path references and logs to data.reference/. |
| scripts/migrations/017-volume-cover-concepts-stage.js | Update seed/config path references and warnings to data.reference/. |
| scripts/migrations/016-importer-fence-source.js | Update warnings/seed path references to data.reference/. |
| scripts/migrations/015-importer-stage-prompts.js | Update seed/config path references and warnings to data.reference/. |
| scripts/migrations/013-comic-script-back-cover.js | Update manual-diff hint paths to data.reference/. |
| scripts/migrations/012-mark-flux1-dev-gated.js | Update comments/logs referencing fresh installs to data.reference/. |
| scripts/migrations/010-simplify-cos-agent-briefing.js | Update manual-diff hint paths to data.reference/. |
| scripts/migrations/008-default-to-claude-code-tui.js | Update logs referencing fresh installs to data.reference/. |
| scripts/migrations/007-places-int-ext-time-of-day.js | Update seed path references and manual-diff hints to data.reference/. |
| scripts/migrations/006-extract-scenes-shots.js | Update manual-diff hint paths to data.reference/. |
| scripts/migrations/005-shape-aware-arc-prompts.js | Update comments/manual-diff hints to data.reference/. |
| scripts/migrations/004-augment-idea-expansion-context.js | Update manual-diff hint paths to data.reference/. |
| scripts/migrations/003-update-pipeline-stage-prompts.test.js | Update comments referencing seed counterpart to data.reference/. |
| scripts/migrations/003-update-pipeline-stage-prompts.js | Update comments/manual-diff hints to data.reference/. |
| scripts/migrations/002-cd-evaluate-image-strength.js | Update logs/manual-merge hints to data.reference/. |
| scripts/migrations/_testHelpers.js | Update fixture reads and drift-catch wording to data.reference/. |
| scripts/migrations/_lib.test.js | Update test fixture seed dir path to data.reference/. |
| scripts/migrations/_lib.js | Update migration helper seed dir and warnings to data.reference/. |
| README.md | Rename the repo tree entry from data.sample/ to data.reference/. |
| PLAN.md | Add planning items (captured review findings) related to CoS robustness; unrelated to the data-versioning core. |
| docs/TROUBLESHOOTING.md | Update manual recovery copy command to data.reference/. |
| docs/features/writers-room.md | Update docs paths to data.reference/. |
| docs/features/openclaw-operator-chat-audit.md | Update sample config path to data.reference/. |
| docs/features/brain/prompt.md | Update sample brain data path to data.reference/. |
| docs/features/brain/plan.md | Update sample brain data path references to data.reference/. |
| data.reference/usage.json | New seed/reference usage file. |
| data.reference/TASKS.md | New seed/reference tasks file. |
| data.reference/settings.json | New seed/reference settings file. |
| data.reference/runs/.gitkeep | Ensure runs dir exists in seed/reference. |
| data.reference/prompts/variables.json | New seed/reference prompt variables file. |
| data.reference/prompts/stages/writers-room-script.md | New seed/reference prompt template. |
| data.reference/prompts/stages/writers-room-places.md | New seed/reference prompt template. |
| data.reference/prompts/stages/writers-room-objects.md | New seed/reference prompt template. |
| data.reference/prompts/stages/writers-room-format.md | New seed/reference prompt template. |
| data.reference/prompts/stages/writers-room-evaluate.md | New seed/reference prompt template. |
| data.reference/prompts/stages/writers-room-characters.md | New seed/reference prompt template. |
| data.reference/prompts/stages/universe-character-expand.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-volume-cover-concepts.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-teleplay.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-storyboard-image-prompt.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-series-title-logo.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-season-episodes.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-prose.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-extract-scenes.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-comic-script.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-comic-panel-image-prompt.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-comic-cover-concepts.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-character-refine.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-character-differentiate-cast.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-arc-verify.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-arc-resolve.md | New seed/reference prompt template. |
| data.reference/prompts/stages/pipeline-arc-overview.md | New seed/reference prompt template. |
| data.reference/prompts/stages/memory-evaluate.md | New seed/reference prompt template. |
| data.reference/prompts/stages/importer-issue-proposal.md | New seed/reference prompt template. |
| data.reference/prompts/stages/importer-classify.md | New seed/reference prompt template. |
| data.reference/prompts/stages/importer-canon-extract.md | New seed/reference prompt template. |
| data.reference/prompts/stages/importer-arc-extract.md | New seed/reference prompt template. |
| data.reference/prompts/stages/cos-self-improvement.md | New seed/reference prompt template. |
| data.reference/prompts/stages/cos-report-summary.md | New seed/reference prompt template. |
| data.reference/prompts/stages/cos-evaluate.md | New seed/reference prompt template. |
| data.reference/prompts/stages/cos-agent-briefing.md | New seed/reference prompt template. |
| data.reference/prompts/stages/cd-treatment.md | New seed/reference prompt template. |
| data.reference/prompts/stages/brain-weekly-review.md | New seed/reference prompt template. |
| data.reference/prompts/stages/brain-daily-digest.md | New seed/reference prompt template. |
| data.reference/prompts/stages/brain-classifier.md | New seed/reference prompt template. |
| data.reference/prompts/stages/app-detection.md | New seed/reference prompt template. |
| data.reference/prompts/skills/security-audit.md | New seed/reference skill template. |
| data.reference/prompts/skills/refactor.md | New seed/reference skill template. |
| data.reference/prompts/skills/mobile-responsive.md | New seed/reference skill template. |
| data.reference/prompts/skills/jobs/jira-sprint-manager.md | New seed/reference job template. |
| data.reference/prompts/skills/jobs/datadog-error-monitor.md | New seed/reference job template. |
| data.reference/prompts/skills/feature.md | New seed/reference skill template. |
| data.reference/prompts/skills/feature-agent.md | New seed/reference skill template. |
| data.reference/prompts/skills/documentation.md | New seed/reference skill template. |
| data.reference/prompts/skills/bug-fix.md | New seed/reference skill template. |
| data.reference/prompts/_partials/scene-output-contract.md | New seed/reference prompt partial. |
| data.reference/prompts/_partials/bible-deference.md | New seed/reference prompt partial. |
| data.reference/openclaw/config.json | New seed/reference OpenClaw config placeholder. |
| data.reference/memory-classifier-config.json | New seed/reference memory classifier config. |
| data.reference/history.json | New seed/reference history file. |
| data.reference/digital-twin/meta.json | New seed/reference digital twin metadata. |
| data.reference/digital-twin/DIGITAL_TWIN.md | New seed/reference digital twin scaffold document. |
| data.reference/cos/state.json | New seed/reference CoS initial state. |
| data.reference/cos/scripts/.gitkeep | Ensure CoS scripts dir exists in seed/reference. |
| data.reference/cos/reports/.gitkeep | Ensure CoS reports dir exists in seed/reference. |
| data.reference/cos/memory/index.json | New seed/reference CoS memory index. |
| data.reference/cos/memory/embeddings.json | New seed/reference CoS embeddings placeholder. |
| data.reference/cos/agents/.gitkeep | Ensure CoS agents dir exists in seed/reference. |
| data.reference/COS-TASKS.md | New seed/reference CoS tasks file. |
| data.reference/browser-config.json | New seed/reference browser config. |
| data.reference/brain/reviews.jsonl | New seed/reference brain reviews log. |
| data.reference/brain/projects.json | New seed/reference brain projects store. |
| data.reference/brain/people.json | New seed/reference brain people store. |
| data.reference/brain/meta.json | New seed/reference brain metadata. |
| data.reference/brain/inbox_log.jsonl | New seed/reference brain inbox log. |
| data.reference/brain/ideas.json | New seed/reference brain ideas store. |
| data.reference/brain/digests.jsonl | New seed/reference brain digests log. |
| data.reference/brain/admin.json | New seed/reference brain admin store. |
| data.reference/autofixer/sessions/.gitkeep | Ensure autofixer sessions dir exists in seed/reference. |
| data.reference/autofixer/index.json | New seed/reference autofixer index. |
| data.reference/apps.json | New seed/reference apps config with __PORTOS_ROOT__ placeholder. |
| client/src/pages/Instances.jsx | Render schema-gap badge and refresh peer subs/peers on schema block/unblock socket events. |
| client/src/components/instances/SchemaGapBadge.test.jsx | Unit tests for schema gap badge rendering and dedup behavior. |
| client/src/components/instances/SchemaGapBadge.jsx | New UI component to display peer schema version mismatches from both snapshot and push transports. |
| .planning/codebase/STRUCTURE.md | Update planning docs to refer to data.reference/. |
| .planning/codebase/ARCHITECTURE.md | Update planning docs to refer to data.reference/. |
| .gitignore | Update ignore exceptions to match data.reference/ templates. |
| .changelog/NEXT.md | Document new storage layout, schema gating behavior, and the data.reference/ rename. |
Six items deferred from PR #458 (universe-builder per-record POC) that weren't moved into PLAN.md when the work landed:
- Split pipeline-issues.json (biggest concurrency win: comic pipeline)
- Split pipeline-series.json (consistency once issues is split)
- history.json JSONL or date-partitioning when it grows
- Centralize cosAgents + settings on atomicWrite + createFileWriteQueue
- Cache invalidation hooks when Brain / Messages collections split
- Pick a convention for the index.json config slot
…t trigger-gated
- Move pipeline-issues / pipeline-series splits + history JSONL conversion to Next Up (they're shape-mismatches, not perf problems waiting to happen)
- Keep cache-invalidation and typeindex-config conventions in Blocked-on-decision since they genuinely need their first consumer or a dependent split to land first
…d event refetch
- importer.js: fix stale comment that claimed 'reject without marking processed'; the schema-ahead branch DOES markProcessed (chokidar dedup), and retry-after-upgrade is manual via cursor clear, now stated correctly
- manifest.js + importer.js: use shared isPlainObject for portosSchemaVersions so arrays can't spread into numeric-keyed garbage (typeof==='object' alone accepted arrays); regression test added
- dataSync.js: reject only missing/null data (data == null), not falsy-but-valid payloads (0/false/'') that the z.unknown() schema allows
- peerSync.js: emit peerId alongside subId on subscription-blocked/unblocked so each Instances PeerCard refetches only for its own peer instead of all cards refetching on every event
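The `isPlainObject` and `data == null` fixes in this commit are both about guards that were slightly too loose or too strict. The sketch below illustrates each; the bodies of `isPlainObject`, `readSchemaVersions`, and `hasData` are assumptions written for this example, not the shared helpers themselves.

```javascript
// Hypothetical sketch of a plain-object check that rejects arrays and null.
function isPlainObject(value) {
  if (value === null || typeof value !== "object") return false;
  const proto = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

// typeof [] === "object", so a bare typeof check would let an array through,
// and spreading it would produce keys "0", "1", ... instead of category names.
function readSchemaVersions(manifest) {
  const raw = manifest.portosSchemaVersions;
  return isPlainObject(raw) ? { ...raw } : {};
}

// Loose equality with null matches only null and undefined, so 0, false, and ""
// still count as present data.
function hasData(body) {
  return body.data != null;
}
```

A truthiness check such as `!body.data` would have rejected `0`, `false`, and the empty string, which is the behavior this commit removes.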
…age listener
- mockPathsDataRoot: add wrapExports + makeSpy so tests count fileUtils writes via a delegating vi.fn instead of vi.spyOn on a read-only ESM namespace export (which throws in Vitest). promote atomicity test now uses spies.atomicWrite; removed the dead fileUtils import
- Instances.jsx: drop the page-level peerSync:subscription-blocked/unblocked listener; those events only touch per-record subscription state, already refetched per-card, so the page-level fetchData() was a redundant all-peers refetch on every block/unblock