HestiaOS

Public Evidence

Evidence & Benchmarks

Each evidence item states what it demonstrates and, critically, its limitations. No claims are made beyond what is publicly verifiable.

CEG Benchmark Mini-Run

Public Summary

2026-05-31

Governed mode produced replayable decision evidence for selected benchmark tasks.

Demonstrates

  • auditability
  • duplicate handling
  • stale-context handling
  • governance blocking
  • replayable traces

Limitations

  • mini-run only
  • not a general safety guarantee
  • public summary redacted

Execution Commit Gate Validation

Synthetic

2026-06-01

Side effects are blocked when ExecutionCommit is missing from the decision trace.

Demonstrates

  • execution boundary enforcement
  • missing ExecutionCommit detection
  • trace integrity

Limitations

  • synthetic test scenario
  • production conditions may differ
  • does not cover all side effect types
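
The gate behaviour described for this item can be illustrated with a minimal sketch. It is not the HestiaOS implementation: only the ExecutionCommit name comes from the summary, while TraceEvent, DecisionTrace, MissingExecutionCommit, run_side_effect, and the "Blocked" event kind are hypothetical names chosen for illustration. The sketch assumes a decision trace is an ordered list of events and that a side effect is a plain callable.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class TraceEvent:
    """One entry in a decision trace (hypothetical shape)."""
    kind: str
    detail: str = ""


@dataclass
class DecisionTrace:
    """Ordered events recorded for one decision (hypothetical shape)."""
    events: List[TraceEvent] = field(default_factory=list)

    def has_execution_commit(self) -> bool:
        return any(e.kind == "ExecutionCommit" for e in self.events)


class MissingExecutionCommit(Exception):
    """Raised when a side effect is attempted without an ExecutionCommit."""


def run_side_effect(trace: DecisionTrace, effect: Callable[[], str]) -> str:
    """Run `effect` only if the trace contains an ExecutionCommit event."""
    if not trace.has_execution_commit():
        # Record the refusal so the block itself stays auditable.
        trace.events.append(TraceEvent("Blocked", "ExecutionCommit missing"))
        raise MissingExecutionCommit("side effect blocked: no ExecutionCommit in trace")
    return effect()


# Usage: the first trace lacks an ExecutionCommit, so the effect never runs.
sent: List[str] = []
uncommitted = DecisionTrace([TraceEvent("Proposal"), TraceEvent("PolicyDecision")])
try:
    run_side_effect(uncommitted, lambda: sent.append("email") or "sent")
except MissingExecutionCommit:
    pass  # sent is still empty here

committed = DecisionTrace([TraceEvent("Proposal"), TraceEvent("ExecutionCommit")])
run_side_effect(committed, lambda: sent.append("email") or "sent")
```

Placing the check in a single function that every side effect must pass through is one way to get a single execution boundary. As with the validation itself, this covers only the side effects routed through that function.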

Kernel Structural Invariant Checks

Public Summary

2026-06-02

Critical kernel tests pass for DTO contracts, policy authority path, and structural invariants.

Demonstrates

  • DTO contract stability
  • single policy authority path
  • structural invariants separated from policy

Limitations

  • kernel v0.1 freeze candidate only
  • not all invariants formalized yet
  • public summary only

Duplicate Proposal Handling

Public Summary

2026-06-03

Duplicate intent proposals are detected and prevented from creating duplicate side effects.

Demonstrates

  • idempotency guard
  • intent deduplication
  • audit trail consistency

Limitations

  • content-based dedup only
  • not timing-attack resistant
  • public summary only
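
A content-based idempotency guard of the kind this item describes might look like the sketch below. It is an illustration under stated assumptions, not the HestiaOS code: intent_key, IdempotencyGuard, submit, and the audit record fields are hypothetical names, intents are assumed to be JSON-serializable dictionaries, and state is kept in memory for a single thread.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List


def intent_key(intent: Dict[str, Any]) -> str:
    """Content hash of an intent; key order does not change the result."""
    canonical = json.dumps(intent, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdempotencyGuard:
    """Runs the side effect at most once per distinct intent content."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}
        self.audit: List[Dict[str, str]] = []

    def submit(self, intent: Dict[str, Any],
               effect: Callable[[Dict[str, Any]], Any]) -> Any:
        key = intent_key(intent)
        if key in self._results:
            # Duplicate: skip the effect, log it, and return the first result.
            self.audit.append({"key": key, "outcome": "duplicate_suppressed"})
            return self._results[key]
        result = effect(intent)
        self._results[key] = result
        self.audit.append({"key": key, "outcome": "executed"})
        return result


# Usage: the second proposal has the same content in a different key order.
executed: List[str] = []

def pay(intent: Dict[str, Any]) -> int:
    executed.append(intent["action"])
    return len(executed)

guard = IdempotencyGuard()
first = guard.submit({"action": "pay", "amount": 10}, pay)
second = guard.submit({"amount": 10, "action": "pay"}, pay)
```

Because the key is derived purely from content, two proposals that differ in any field are treated as distinct, which matches the "content-based dedup only" limitation. The check-then-execute step here is not atomic, so concurrent submissions would need additional coordination that this sketch does not attempt.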
