v1.2.0 (2026-06-12)

Compounding confidence, full dynamic schema, per-object privacy, in-repo benchmark

With this release, everything on the original launch roadmap has shipped.

Compounding confidence: the full engine

A fact’s confidence now rises with independent corroboration and falls on contradiction, and contradicted facts are superseded, never silently overwritten.

Every relation sighting is recorded as evidence, one row per source document per edge; confidence is recomputed with a damped noisy-OR anchored on the strongest source.
On single-valued (“functional”) predicates like works-for, a confident conflicting observation closes the old edge, weakens it, and records the supersession in the claims ledger; history stays queryable.
brain_why gains independent-source counts, an audited confidence trend (e.g. “0.8 → 0.86”), and a superseded-relations list. brain_stats gains an evidence section. All of this is additive, with no tool contract changes.
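
The release notes name the technique (a damped noisy-OR anchored on the strongest source) but do not spell out the formula. The following is a minimal sketch of one plausible reading, not the project's actual implementation: the strongest source counts at full weight, and every additional independent source is scaled by a damping factor before entering the noisy-OR product. The function name compoundConfidence and the damping value 0.5 are assumptions made for illustration.

```typescript
// Hypothetical sketch of a damped noisy-OR. The name and the default damping
// factor are illustrative assumptions, not taken from the project.
function compoundConfidence(sourceConfidences: number[], damping = 0.5): number {
  if (sourceConfidences.length === 0) return 0;

  // Clamp every per-source confidence into [0, 1] and sort strongest first.
  const sorted = sourceConfidences
    .map((c) => Math.min(1, Math.max(0, c)))
    .sort((a, b) => b - a);
  const [strongest, ...rest] = sorted;

  // Anchor: the strongest source contributes at full weight, so the result
  // can never drop below it.
  let disbelief = 1 - strongest;

  // Each further source shrinks the remaining disbelief, but only by a
  // damped share of its own confidence.
  for (const c of rest) {
    disbelief *= 1 - damping * c;
  }
  return 1 - disbelief;
}
```

Under these assumed parameters, one source at 0.8 yields 0.8, and adding a second independent source at 0.6 raises the result to about 0.86. Damping keeps many weak sightings from saturating the score as quickly as a plain noisy-OR would.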

Verified by npm run test:compounding, which runs the full lifecycle end-to-end against a live database, plus 17 unit tests. Proven live with llama3.2:3b: two documents asserting different employers produce:

Rhea Calloway -[works for]-> Halcyon Labs        conf=0.600  [SUPERSEDED]
Rhea Calloway -[works for]-> Driftwood Analytics conf=1.000  [ACTIVE]
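
The supersession behaviour on functional predicates can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the project's code: the Edge and LedgerEntry shapes, the observe function, the 0.7 threshold for a "confident" observation, and the 0.5 weakening factor are all hypothetical.

```typescript
// Hypothetical in-memory model; the real engine works against a database.
interface Edge {
  subject: string;
  predicate: string;
  object: string;
  confidence: number;
  status: "ACTIVE" | "SUPERSEDED";
}

interface LedgerEntry {
  superseded: Edge;
  by: Edge;
}

// Assumed values, chosen only for illustration.
const FUNCTIONAL_PREDICATES = new Set(["works for"]);
const SUPERSEDE_THRESHOLD = 0.7;
const WEAKEN_FACTOR = 0.5;

function observe(edges: Edge[], ledger: LedgerEntry[], seen: Omit<Edge, "status">): void {
  const sameTriple = (e: Edge) =>
    e.subject === seen.subject && e.predicate === seen.predicate && e.object === seen.object;

  // A repeat sighting of an active edge is corroboration, which this sketch
  // does not model.
  if (edges.some((e) => e.status === "ACTIVE" && sameTriple(e))) return;

  const incoming: Edge = { ...seen, status: "ACTIVE" };
  const confident = seen.confidence >= SUPERSEDE_THRESHOLD;

  if (FUNCTIONAL_PREDICATES.has(seen.predicate) && confident) {
    for (const old of edges) {
      const conflicts =
        old.status === "ACTIVE" &&
        old.subject === seen.subject &&
        old.predicate === seen.predicate &&
        old.object !== seen.object;
      if (conflicts) {
        // Close and weaken the old edge, but keep it so history stays queryable.
        old.status = "SUPERSEDED";
        old.confidence *= WEAKEN_FACTOR;
        ledger.push({ superseded: old, by: incoming });
      }
    }
  }
  edges.push(incoming);
}
```

The old edge is never deleted or rewritten in place: it stays in the list with a SUPERSEDED status, and the ledger entry links it to the edge that replaced it.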

Full dynamic schema: gated auto-promotion

The propose-and-surface loop from v1.1.0 now completes: proposals corroborated by enough distinct documents at high confidence auto-promote into the live catalogs, and the promoted type is immediately usable by the next extraction batch. Auto-promotion is strictly opt-in (BRAIN_SCHEMA_AUTO_PROMOTE=1; defaults: 3 independent documents at ≥0.8 confidence) and leaves a full audit trail on the proposal row; strict curation mode always wins. Verified by npm run test:schema-promotion.
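
The gating rule above can be read as a small predicate. The sketch below is illustrative only: the environment variable name and the default thresholds come from these notes, while the Proposal and PromotionGate shapes and the function names gateFromEnv and shouldPromote are assumptions.

```typescript
// Hypothetical shapes; only BRAIN_SCHEMA_AUTO_PROMOTE and the defaults
// (3 documents, 0.8 confidence) come from the release notes.
interface Proposal {
  kind: string;
  confidence: number;
  sourceDocumentIds: string[];
}

interface PromotionGate {
  autoPromote: boolean;
  strictCuration: boolean;
  minDocuments: number;
  minConfidence: number;
}

// The environment is passed in explicitly so the sketch has no runtime dependencies.
function gateFromEnv(env: Record<string, string | undefined>, strictCuration: boolean): PromotionGate {
  return {
    autoPromote: env["BRAIN_SCHEMA_AUTO_PROMOTE"] === "1", // off unless explicitly enabled
    strictCuration,
    minDocuments: 3,
    minConfidence: 0.8,
  };
}

function shouldPromote(proposal: Proposal, gate: PromotionGate): boolean {
  // Strict curation always wins, and the feature is off by default.
  if (gate.strictCuration || !gate.autoPromote) return false;

  // Count distinct documents so repeated sightings in one document do not
  // count as independent corroboration.
  const distinctDocuments = new Set(proposal.sourceDocumentIds).size;
  return distinctDocuments >= gate.minDocuments && proposal.confidence >= gate.minConfidence;
}
```

For example, a proposal seen in documents a, b, and c at confidence 0.8 would pass with the flag set to 1, while the same proposal seen three times in document a would not.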

Per-object privacy

Documents marked private are readable only by the agent that created them (plus service-role callers) across all six read tools: brain_search, brain_context_pack, brain_recall_memory, brain_why, brain_neighbors, and brain_get_related. Private rows with no recorded creator stay hidden from non-service callers, which is conservative by design. Workspace, org, and public documents behave exactly as before. Verified by a two-agent visibility-matrix check (npm run test:sharing).
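
As a rough illustration of the rule above, the private-document check reduces to a short predicate. The Doc and Caller shapes and the function name passesPrivacyCheck are hypothetical; the sketch models only the new private rule, not the existing workspace, org, or public scoping.

```typescript
// Hypothetical shapes for illustration; not the project's schema.
type Visibility = "private" | "workspace" | "org" | "public";

interface Doc {
  id: string;
  visibility: Visibility;
  createdBy: string | null; // null when no creator was recorded
}

interface Caller {
  agentId: string;
  isServiceRole: boolean;
}

// Returns true when the privacy rule does not block the read. Non-private
// documents pass through here and remain subject to their existing rules.
function passesPrivacyCheck(doc: Doc, caller: Caller): boolean {
  if (doc.visibility !== "private") return true;
  if (caller.isServiceRole) return true;
  // Conservative default: a private row with no recorded creator stays hidden.
  if (doc.createdBy === null) return false;
  return doc.createdBy === caller.agentId;
}
```

Applying one shared predicate in every read path is one way to keep the six tools consistent with each other.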

LongMemEval benchmark harness, in the repo

The headline number is reproducible by anyone, not asserted: 73.6% end-to-end QA accuracy on the complete 500-question LongMemEval oracle subset (no sampling) with 100% evidence-retrieval recall, using gpt-4o-mini as the reader and gpt-4o as the judge. The harness ships in-repo at evals/longmemeval. It is self-contained, isolates each question in its own workspace, and documents its methodology in its README. Run it yourself.

Run the checks yourself

npm run test:compounding: full corroboration → contradiction → supersession lifecycle; live DB, no LLM
npm run test:schema-promotion: default-off, gated auto-promotion; promoted kind becomes usable
npm run test:sharing: two-agent visibility matrix for private documents
evals/longmemeval/: the benchmark harness itself; reproduce the headline number yourself

Full details are in the repository changelog. All changes are additive, with no breaking changes to any brain_* tool contract.