
Rosetta AI Safety Adapter

Compiled Physics for the AI Stack

The AI Safety adapter translates the universal stability equation into the forces governing frontier model deployments. Lambda (Λ) transforms into Model Capability, representing the optimizer’s drive toward its objective. Gamma (Γ) becomes Alignment Constraint, serving as the structural buffer that absorbs this drive.

This translation converts abstract alignment requirements into deterministic physics. The adapter establishes the definition of “safe” for AI systems in the same language that Substrate can enforce, enabling the implementation of gate chains, policy layers, and audit trails tailored to the specific dynamics of autonomous agents.

Guardrails Are Not Governance

Frontier AI deployments rely on prompt engineering, RLHF, and output filters to manage risk. These are behavioral suggestions. They fail under distributional shift and remain vulnerable to jailbreaks.

Unguarded deployments execute risky tool calls and exhaust compute budgets on unsafe actions. No structural floor exists to prevent collapse. Systems rely entirely on statistical expectation.

Kairos Substrate closes this gap with compiled physics. The engine evaluates every action against a deterministic stability equation before execution. It is a law of the deployment.

A Stability Equation, Not a Classifier

Substrate models agent interaction as a dynamical system governed by two forces.
Lambda (Λ) represents agency, mapping the optimizer's force against constraints.
Gamma (Γ) represents structural constraint, providing the stabilizing buffer.

The engine computes a stability score at every tick. The system intervenes when 𝒮 drops below a defined threshold. This intervention follows the demands of the stability equation rather than the output of a classifier.

𝒮 = (Γ_A + Γ_B) / (Λ_A + Λ_B)

Equation: the stability score for a two-agent system (A, B), the ratio of total constraint to total agency. The system intervenes when the score drops below the threshold.
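As a concrete sketch of this check, assuming illustrative names and an arbitrary threshold (the page does not state the actual value or the Substrate API), the score and intervention test might look like:

```python
# Hypothetical sketch of the stability score S = (Γ_A + Γ_B) / (Λ_A + Λ_B).
# Names and the threshold value are illustrative, not the Substrate API.

THRESHOLD = 1.0  # assumed intervention threshold


def stability_score(gamma_a: float, gamma_b: float,
                    lambda_a: float, lambda_b: float) -> float:
    """Ratio of total constraint to total agency for agents A and B."""
    return (gamma_a + gamma_b) / (lambda_a + lambda_b)


def should_intervene(score: float, threshold: float = THRESHOLD) -> bool:
    """Intervene when the score drops below the threshold."""
    return score < threshold


score = stability_score(0.4, 0.5, 0.6, 0.6)  # ≈ 0.75: agency exceeds constraint
```

With agency (Λ) summing above constraint (Γ), the score falls below 1 and the system intervenes.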

Deterministic

Identical inputs produce identical outputs (ϵ = 10⁻⁶).

Model-agnostic

The engine evaluates actions independently of model architecture.

Zero-dependency

The self-contained Rust binary requires no API calls or network access.

Memory-safe

The core engine contains zero unsafe blocks.

Three Gates. Zero Gaps.

Every proposed action passes through a layered gate chain. Any single gate can reject an action that violates structural integrity.

01

State Gate

Evaluates structural health before the engine considers an action. If gamma (Γ) falls below the deployment floor, the engine rejects all actions. This mechanism remains immune to prompt sensitivity.

02

Action Gate

Previews proposed actions against the reachability field. Safe tools map to stabilizing directions. Risky tools move toward the repulsor boundary and trigger a rejection.

03

Hazard Gate

Detects basin collapse and multi-agent paradoxes. These are hard stops. The physics of the system permit no operator override or retry budget.
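A minimal sketch of the three-gate ordering, under assumed names and thresholds (the real Substrate gate API and floor values are not shown on this page):

```python
# Illustrative three-gate chain (State -> Action -> Hazard). All names,
# thresholds, and the risk metric are assumptions for this sketch.
from dataclasses import dataclass


@dataclass
class Verdict:
    allowed: bool
    gate: str = ""          # which gate rejected, if any
    hard_stop: bool = False  # hazard rejections permit no override or retry


def evaluate(gamma: float, action_risk: float, basin_collapsed: bool,
             gamma_floor: float = 0.2, risk_bound: float = 0.8) -> Verdict:
    # 01 State Gate: if structural health is below the floor, reject all actions.
    if gamma < gamma_floor:
        return Verdict(False, gate="state")
    # 02 Action Gate: reject actions that move toward the repulsor boundary.
    if action_risk > risk_bound:
        return Verdict(False, gate="action")
    # 03 Hazard Gate: basin collapse is a hard stop.
    if basin_collapsed:
        return Verdict(False, gate="hazard", hard_stop=True)
    return Verdict(True)
```

Note the ordering: structural health is checked before any individual action is considered, which is why the State Gate is immune to prompt content.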

Intervention That Learns, Then Escalates

Substrate manages the state following a rejection. The system applies proportional intervention based on calculated risk.

01

Reformulation

When gamma headroom is moderate (≥ 0.1), Substrate signals the model to attempt a different approach. Empirical testing shows models find safe paths 100% of the time when guided by this signal.

02

Budget Depletion

Rejected actions consume a retry budget. Novelty scoring penalizes repetitive, low-effort attempts. Stall detection terminates oscillation loops to preserve compute resources.

03

Human Escalation

Substrate routes to a human operator when the budget is exhausted or gamma drops below 0.1. The model cannot proceed. The decision requires human judgment.
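The escalation ladder above can be sketched as a single decision, assuming illustrative names and budget costs (only the 0.1 gamma-headroom threshold comes from the text):

```python
# Sketch of the post-rejection escalation ladder. The 0.1 headroom threshold
# is stated in the text; retry-budget costs and novelty scoring are assumed.

def after_rejection(gamma_headroom: float, retry_budget: int,
                    novelty: float) -> tuple:
    """Return (intervention, remaining_budget) after a rejected action."""
    if gamma_headroom < 0.1 or retry_budget <= 0:
        return "HUMAN_ESCALATION", retry_budget  # model cannot proceed
    # Novelty scoring: repetitive, low-effort retries deplete the budget faster.
    cost = 1 if novelty >= 0.5 else 2
    return "REFORMULATE", retry_budget - cost
```

Repetitive attempts drain the budget quickly, so oscillation loops terminate and the decision reaches a human operator instead of burning compute.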

Cryptographic Operator Authority

The handoff to a human operator is cryptographically secured. This ensures every override is tamper-proof and auditable.

  1. Substrate signals HUMAN_ESCALATION and halts.
  2. The operator reviews the rejection context and stability state.
  3. Authorization requires an RSA-PSS signed override token.
  4. The token binds to the specific evaluation request via SHA-256 digest.
  5. The system verifies the token and records the action in a durable audit trail.

The system fails closed. If the coordinator is unreachable, the engine blocks the action.
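The binding step (4) is the load-bearing part of this flow. A dependency-free sketch follows; it substitutes an HMAC for the RSA-PSS signature so the request-binding logic is visible with only the standard library, and all names are illustrative:

```python
# Simplified sketch of request-bound override tokens. The real flow uses
# RSA-PSS signatures; HMAC stands in here so the SHA-256 binding of the token
# to one specific evaluation request can be shown without dependencies.
import hashlib
import hmac
import json

OPERATOR_KEY = b"demo-key"  # stand-in for the operator's signing key


def request_digest(request: dict) -> bytes:
    # Canonicalize so identical requests hash identically.
    blob = json.dumps(request, sort_keys=True).encode()
    return hashlib.sha256(blob).digest()


def issue_token(request: dict) -> bytes:
    return hmac.new(OPERATOR_KEY, request_digest(request), hashlib.sha256).digest()


def verify_token(request: dict, token: bytes) -> bool:
    expected = hmac.new(OPERATOR_KEY, request_digest(request), hashlib.sha256).digest()
    return hmac.compare_digest(expected, token)


req = {"action": "delete_index", "tick": 1042}
tok = issue_token(req)
assert verify_token(req, tok)
# A token issued for one request does not authorize a different one.
assert not verify_token({"action": "delete_index", "tick": 1043}, tok)
```

Because the token commits to a digest of the exact request, replaying it against any other evaluation fails verification, which is what makes the override auditable per-action.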

Two-Layer Policy Architecture

A dual-layer system separates platform authority from operator customization.

Base Policy

The platform provider sets the structural floor. This policy defines minimum gamma thresholds and enforcement modes. It is signed with RSA-PSS and remains immutable for downstream operators.

Operator Overrides

Operators tighten policy within base-layer bounds. They may raise the gamma floor or restrict enforcement modes, but they cannot lower safety thresholds.

Enforcement Modes

Observe
Evaluates actions without rejection for baselining.
State Gate
Rejects if Γ falls below the floor.
State + Action Gate
Full preview of actions against the reachability field.
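A tighten-only merge of the two layers might look like the following sketch, with assumed field names and a strictness ordering over the three enforcement modes:

```python
# Sketch of tighten-only policy merging. Field names and mode keys are
# illustrative; the rule is that an operator override may raise the gamma
# floor or move to a stricter mode, never relax either.
MODE_ORDER = {"observe": 0, "state_gate": 1, "state_action_gate": 2}


def merge_policy(base: dict, override: dict) -> dict:
    # Gamma floor can only rise.
    gamma_floor = max(base["gamma_floor"],
                      override.get("gamma_floor", base["gamma_floor"]))
    # Enforcement mode can only become stricter.
    mode = override.get("mode", base["mode"])
    if MODE_ORDER[mode] < MODE_ORDER[base["mode"]]:
        mode = base["mode"]  # overrides cannot weaken enforcement
    return {"gamma_floor": gamma_floor, "mode": mode}


merged = merge_policy({"gamma_floor": 0.2, "mode": "state_gate"},
                      {"gamma_floor": 0.3, "mode": "observe"})
# The floor rises to 0.3; the weaker "observe" request is ignored.
```

The base layer stays authoritative: an override that asks for a laxer mode or a lower floor is silently clamped to the platform's minimums.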

Tested Against Real Models

Validation involves frontier LLMs executing tool-use tasks in sandboxed environments. The data reflects mechanical reality.

Boundary Study v1 Results

Risky tool rejection: 100% (48/48)
State gate rejection: 100% (20/20)
Safe task completion: 100% (20/20)
False negatives: 0
False positives: 0

Resource Efficiency

KAIROS-enabled runs consume fewer tokens than unguarded baselines. The engine removes waste by terminating unsafe paths early. State-gate tasks terminate in 2.4 seconds.

Read the Boundary Study →

One Engine. Four Surfaces.

The Rust codebase compiles to four specific deployment targets.

Native Library

Embeds into hypervisors and robotics controllers via C FFI.

CLI Binary

Provides trace analysis and policy linting for CI/CD pipelines.

Python SDK

Offers direct access to evaluation via PyO3 bindings.

WASM Module

Enables browser-based advisory evaluations and visualizations.

Use Cases for AI Organizations

Substrate provides structural guarantees across diverse deployment environments.

  • Frontier Deployment: Blocks destructive tool calls before they reach execution.
  • Multi-Agent Governance: Prevents resource depletion by modeling collective agency against shared constraints.
  • Certification: Generates deterministic, reproducible evidence for regulatory compliance.
  • Pre-Deployment CI/CD: Gates releases on physics-verified safety metrics.
  • Cost Optimization: Reduces token waste by cutting unsafe execution paths at the root.

Read the Fly-by-Wire Documentation →

Not Another Guardrail

Substrate provides a structural guarantee that persists when behavioral training fails.

Approach    Mechanism     Deterministic?   Bypassable?
Prompting   Instruction   No               Yes
RLHF        Preference    No               Yes
Substrate   Physics       Yes              No

RLHF trains the pilot. Substrate defines the flight envelope. Both are necessary, but only Substrate prevents the airframe from exceeding structural limits.

Technical Specifications

Engine

Language
Rust (Stable)
Latency
Sub-millisecond
Determinism
ϵ = 10⁻⁶

Security & Safety

Security
RSA-PSS Signing
Safety
Zero unsafe in core
Dependencies
Zero external

Request Early Access to KAIROS

KAIROS Substrate is shipping to design partners ahead of general availability. Active pilots include the cybersecurity adapter (redacted telemetry) and the AI safety adapter (agent trajectories); see the partner briefs for what a contribution looks like and what comes back.

Compliance and regulatory teams, agent-eval researchers, and investors are also welcome to reach out. Submit your details or use the Contact tab.


Privacy Policy

1. Data We Collect

When you sign up for early access or our newsletter, we collect your email address. We do not collect personal data beyond what you voluntarily provide.

2. How We Use Your Data

Your email is used solely to send product updates, early-access invitations, and research announcements from AnankeLabs. We do not sell, rent, or share your data with third parties.

3. Cookies & Analytics

This site does not use tracking cookies or third-party analytics. We may use server-side request logs for basic traffic monitoring.

4. Data Storage & Security

Submitted data is stored on secure, encrypted infrastructure. We retain your information only as long as necessary to provide the services you requested.

5. Your Rights

You may request deletion of your data at any time by contacting us. We will process deletion requests within 30 days.

6. Contact

For privacy inquiries, email [email protected].

Terms of Use

1. Acceptance

By accessing this site, you agree to these terms. If you do not agree, discontinue use immediately.

2. Intellectual Property

All content, software, research, and materials on this site are the property of AnankeLabs. The KAIROS engine, Rosetta adapter layer, Spindle simulation framework, and Serious Gaming SDK are proprietary technologies. No license is granted except as explicitly stated in a signed agreement.

3. Early Access Program

Early access is provided on an as-is basis. AnankeLabs reserves the right to modify, suspend, or terminate early access at any time without notice.

4. Limitation of Liability

AnankeLabs provides this site and its materials "as is" without warranty of any kind. We are not liable for any damages arising from your use of this site or reliance on its content.

5. Simulation Outputs

KAIROS simulation outputs are analytical tools, not predictions. They should not be used as the sole basis for financial, military, policy, or safety-critical decisions.

6. Governing Law

These terms are governed by the laws of Sweden.

7. Contact

For legal inquiries, email [email protected].