Cybersecurity Calibration: Partner Brief

What a design-partner contribution looks like, and what comes back

Scope: Cybersecurity adapter, real-world calibration
Updated: April 2026

About the Cybersecurity Adapter

KAIROS Substrate is a deterministic Rust kernel that computes the structural margin of a system under load. The cybersecurity adapter routes that engine into a defended-zone evaluation surface. It reads normalized telemetry per zone, per tick, and returns a stability score: how much pressure a defended zone can still absorb before its weakest control collapses. The verdict is hash-bound to the calibration anchors and replay-deterministic across processes. Two operators replaying the same incident reach the same answer.

This is structural early warning, not pattern matching. It does not replace a SIEM, an EDR, or an XDR. It complements those tools with a reading they were not designed to produce: reachability of the safe set per defended zone. The Mythos-shaped sandbox-escape fixture in our smoke corpus walks the exact sequence: privilege pressure rises, segmentation collapses, exfiltration arrives. The zone reaches active intrusion before the exfiltration jump.

The public methodology debrief lives on the Spindle: Calibrating the Cybersecurity Adapter. Background framing for the structural-margin approach lives on the Cybersecurity Rosetta page.

Why Real-World Data Matters

The v1 synthetic baseline is calibrated against public references: Verizon DBIR, NIST SP 800-53 and SP 800-207, CIS Controls v8, OCSF 1.x, the Los Alamos Unified Host & Network corpus, and the DARPA Operationally Transparent Cyber dataset. The corpus is calibration-honest: every cell carries a confidence tag, and roughly 46% of cells are tagged synthesised because no public reference exists at the resolution required.

That synthesised fraction is the load-bearing item we want to retire. Real telemetry from a design-partner environment lets us replace the synthetic policy-positive rate per zone per hour with one calibrated against a real enterprise baseline. The headline shifts from defensible-but-dismissible to “calibrated against partner tenants, with X% drift from public anchors.” That drift number is itself a research result, citable in workshop and conference write-ups.

The metric, plainly: the policy-positive rate per zone per hour is the count of windows the configured gate policy flags on reviewer-labeled confirmed-benign traffic, divided by the zone-hours observed. SOC teams reason about alarm budgets in this dimensional form. This is distinct from a true false-positive rate, which would require an independent ground-truth labelling pass — the rate we publish is what the policy does on traffic the partner has confirmed clean. A partner contribution closes the gap that currently sits between the public references and a real production environment.
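The arithmetic behind the published number can be made concrete. The sketch below is illustrative, not the harness's actual code; the function names (`policy_positive_rate`, `wilson_95`) are assumptions, but the rate definition (flagged windows over zone-hours) and the Wilson 95% interval match what the brief describes.

```rust
/// Policy-positive rate: windows the gate policy flags on
/// confirmed-benign traffic, per zone-hour observed.
fn policy_positive_rate(flagged_windows: u64, zone_hours: f64) -> f64 {
    flagged_windows as f64 / zone_hours
}

/// Wilson 95% score interval on the per-window flag proportion k/n.
/// Returns (lower, upper) bounds; robust at the small counts typical
/// of a quiet-zone baseline, unlike the naive normal approximation.
fn wilson_95(flagged: u64, windows: u64) -> (f64, f64) {
    let z = 1.959_963_985_f64; // 97.5th percentile of the standard normal
    let n = windows as f64;
    let p = flagged as f64 / n;
    let denom = 1.0 + z * z / n;
    let center = (p + z * z / (2.0 * n)) / denom;
    let half = (z / denom) * (p * (1.0 - p) / n + z * z / (4.0 * n * n)).sqrt();
    (center - half, center + half)
}

fn main() {
    // Example: 12 flagged windows out of 4_320 one-minute windows
    // (= 72 zone-hours) in a single-zone contribution.
    let (lo, hi) = wilson_95(12, 4_320);
    println!(
        "rate = {:.3} per zone-hour, Wilson 95% CI on proportion = [{:.5}, {:.5}]",
        policy_positive_rate(12, 72.0), lo, hi
    );
}
```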

What Partners Receive

  • A calibrated policy-positive rate per zone per hour measured against your own environment, with Wilson 95% confidence intervals per zone archetype.
  • Visibility into where your structural margin sits per defended zone, against a deterministic policy floor you control.
  • A right-to-recall mechanism. If an incident is later identified inside a contributed window, the aggregate is re-run and the correction is disclosed to you, and to any audience the original number reached.
  • Co-authorship credit on any public methodology output (workshop note, paper, blog post) where the partner contribution is load-bearing.
  • Early access to the cybersecurity adapter ahead of general pilot availability.

The Minimum-Viable Ask

In order:

  1. 30 to 90 days of OCSF export covering one or two zone archetypes (whichever you have telemetry depth in).
  2. SOC-confirmed three-class labels on the contributed windows: confirmed-benign, suspected-benign-not-investigated, excluded-known-incident.
  3. Incident disclosure for the contributed window, including minor and suspected events.
  4. Permission to publish the aggregate policy-positive rate per zone per hour (not raw events) in pilot collateral.
  5. Right-to-recall agreement as described above.

The sections below detail what each of those means in practice.

Scope of This Partnership

Three calibration experiments back the adapter pilot. Only one of them genuinely requires real-world data:

  • Benign-baseline calibration. The experiment that produces the headline policy-positive rate per zone per hour. This is the experiment a partner contribution unblocks.
  • Evasion sweep. Synthetic variants of credential-breakout (slow-exfil, session-splitting, decoy-traffic, control-cycling). Sufficient for v1 with an honest call-out. Real-world adversary telemetry, if a partner can offer it, raises evasion-sweep credibility but is not a blocker.
  • Sensitivity sweep. Programmatically varies one input metric at a time over its valid range against a frozen scenario. Stays synthetic by construction; real-world data adds no information at that grid layer.
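The sensitivity sweep's grid shape can be sketched in a few lines. This is an illustration, not the harness implementation; the function name and the six-component array are assumptions, and components are assumed normalized to [0, 1].

```rust
// Vary one metric component over a normalized [0, 1] grid while the
// frozen scenario holds every other component fixed.
fn sweep_component(frozen: [f64; 6], idx: usize, steps: usize) -> Vec<[f64; 6]> {
    (0..=steps)
        .map(|i| {
            let mut scenario = frozen;
            scenario[idx] = i as f64 / steps as f64; // grid point in [0, 1]
            scenario
        })
        .collect()
}

fn main() {
    let frozen = [0.2, 0.5, 0.1, 0.3, 0.0, 0.4];
    let grid = sweep_component(frozen, 1, 10);
    assert_eq!(grid.len(), 11);       // steps + 1 grid points
    assert_eq!(grid[0][1], 0.0);      // swept component spans the range
    assert_eq!(grid[10][1], 1.0);
    assert_eq!(grid[3][0], 0.2);      // other components stay frozen
}
```

Because the grid is exhaustive at its resolution, real-world data genuinely adds nothing at this layer, which is why the sweep stays synthetic by construction.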

The detail below is for the benign-baseline experiment.

What the Data Has to Feed

A CyberMetricSnapshot per zone, per tick, with all 16 CyberMetricComponents derivable from the source telemetry:

  • λ · Attack-surface pressures (6): connectionRateAnomaly, authFailurePressure, privilegeEscalationPressure, exploitSophistication, lateralMovementRate, exfiltrationVelocity
  • γ · Defense controls (6): defenseDepth, networkSegmentation, monitoringCoverage, patchCoverage, detectionLatencyMs, controlHealth
  • Aggregates and metadata (4): attackSurfacePressure (derivable as max-of-λ), defensePosture (derivable as min-of-γ), sourceCount, confidence

The harness owns the windowed aggregation that turns raw events into per-tick metric components. The mapping from raw-event-type → metric-component is the load-bearing calibration knob and is documented explicitly with the corpus on the partner side.
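The derivable aggregates above have simple definitions. The struct below is an illustrative sketch only: the real CyberMetricSnapshot lives in the shipping adapter, and this shape, with components assumed normalized to [0, 1], is inferred from the field list, not copied from it.

```rust
// Hypothetical shape, assumed from the field list above.
struct CyberMetricSnapshot {
    lambda: [f64; 6], // λ · attack-surface pressures, normalized [0, 1]
    gamma: [f64; 6],  // γ · defense controls, normalized [0, 1]
    source_count: u32,
    confidence: f64,
}

impl CyberMetricSnapshot {
    /// attackSurfacePressure: derivable as the max over the λ components.
    fn attack_surface_pressure(&self) -> f64 {
        self.lambda.iter().cloned().fold(f64::MIN, f64::max)
    }
    /// defensePosture: derivable as the min over the γ components.
    fn defense_posture(&self) -> f64 {
        self.gamma.iter().cloned().fold(f64::MAX, f64::min)
    }
}

fn main() {
    let snap = CyberMetricSnapshot {
        lambda: [0.1, 0.7, 0.2, 0.3, 0.05, 0.4],
        gamma: [0.9, 0.6, 0.8, 0.95, 0.7, 0.85],
        source_count: 3,
        confidence: 0.8,
    };
    // Max-of-λ and min-of-γ: the worst pressure meets the weakest control.
    assert_eq!(snap.attack_surface_pressure(), 0.7);
    assert_eq!(snap.defense_posture(), 0.6);
}
```

Max-of-λ and min-of-γ reflect the weakest-control framing earlier in the brief: one collapsing control drags the posture down regardless of the other five.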

Coverage Matrix

Four zone archetypes, three benignness profiles per archetype:

  • Identity plane. Source telemetry: IdP and SSO logs (Okta, Azure AD, Auth0); AD security events 4624 / 4625 / 4768 / 4769 / 4776; MFA challenge logs; privilege-grant events; CASB. Profiles: quiet; noisy-but-benign (onboarding/offboarding bursts, password-rotation cycles); near-miss (legitimate privilege-escalation review windows).
  • Edge device. Source telemetry: EDR raw events (CrowdStrike, SentinelOne, Defender); DNS resolver logs; USB and peripheral events; outbound flow summaries; Zeek or Suricata where available. Profiles: quiet; noisy-but-benign (patch-deploy windows, vuln-scan cycles); near-miss (sanctioned penetration tests, internal red-team).
  • Internal segment. Source telemetry: NetFlow or IPFIX; east-west firewall deny logs; Zeek or Suricata; segmentation enforcement events. Profiles: quiet; noisy-but-benign (backup windows, replication, deploy traffic); near-miss (legitimate cross-segment admin work).
  • Data plane. Source telemetry: cloud audit logs (CloudTrail, GCP audit, Azure activity); object-store access logs; DLP product events; sensitive-store query logs. Profiles: quiet; noisy-but-benign (ETL windows, bulk export jobs); near-miss (analyst pulling sensitive datasets for sanctioned work).

A partner does not need to cover all four. Whichever one or two archetypes match your telemetry depth is the right contribution.

Time and Volume

  • Minimum: 30 days continuous per archetype.
  • Credible target: 60 to 90 days per archetype.
  • Why: the report characterises the policy-positive rate with a confidence interval. Rare-but-real events (quarterly pen-tests, DR drills, end-of-month batch jobs, holiday traffic patterns) only show up across multiple weeks. A 7-day window can pass clean and silently understate the long-run rate, overstating the strength of the claim.

Granularity

  • Per-tick aggregation at 60-second resolution is sufficient and matches the existing fixture cadence.
  • Source telemetry can be raw events; the harness performs windowed aggregation.
  • The raw-event-type → metric-component mapping must be documented and deterministic. The same input record set must produce byte-identical metric components on re-run.
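One way to satisfy the byte-identical requirement, sketched under assumptions (fixed-point representation and a sum aggregator are illustrative choices, not the harness's documented mapping): hold components as fixed-point integers and aggregate with order-independent arithmetic, so a re-run, or a differently ordered export of the same records, yields identical bytes.

```rust
/// Aggregate raw event values (fixed-point, scaled by 1000) into one
/// per-tick metric component. Integer summation is exact and
/// order-independent, unlike naive f64 accumulation, whose rounding
/// can depend on event order.
fn tick_component(values_millis: &[i64]) -> i64 {
    values_millis.iter().sum()
}

fn main() {
    let export_a = vec![1_500_i64, 250, 250, 4_000];
    let mut export_b = export_a.clone();
    export_b.reverse(); // same records, different on-disk order
    assert_eq!(tick_component(&export_a), tick_component(&export_b));
}
```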

Format

  • Strong preference: OCSF 1.x export (Open Cybersecurity Schema Framework). Already normalises across vendors and is becoming the de facto standard for telemetry sharing under NDA.
  • Acceptable fallback: raw vendor JSON with documented schema. The translation cost is ours to absorb.

Bridge note on the ingest path. The shipping Rust adapter currently ingests an OCSF-lite projection — a documented subset of OCSF 1.x covering the fields the harness aggregates into the 16 CyberMetricComponents. A partner OCSF 1.x export is not consumed natively end-to-end; it is projected to OCSF-lite at ingest, either by us as part of the calibration replay or by an explicit translator we ship to the partner. The projection is one-way (1.x → lite) and documented per field, so the partner can audit which OCSF 1.x attributes are load-bearing for the published rate and which are dropped. Native OCSF 1.x ingest is a planned follow-on; it is not required for offline calibration.
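The one-way projection can be pictured as a function that keeps only the load-bearing fields. This is a hypothetical sketch: the field names (`time`, `class_uid`, `actor_hash`, `severity`) and the record shape are illustrative, not the shipped OCSF-lite schema.

```rust
#[derive(Debug, PartialEq)]
struct OcsfLiteRecord {
    time: i64,          // event timestamp, load-bearing for windowing
    class_uid: u32,     // OCSF event class uid, drives the metric mapping
    actor_hash: String, // pseudonymized entity identifier
    severity: u8,
}

/// Project a (simplified) OCSF 1.x record down to the lite subset.
/// Anything not listed in the return value is dropped; the projection
/// is one-way by construction.
fn project_to_lite(
    time: i64,
    class_uid: u32,
    actor_hash: &str,
    severity: u8,
    _raw_payload: &str, // free text, command lines, paths: discarded here
) -> OcsfLiteRecord {
    OcsfLiteRecord {
        time,
        class_uid,
        actor_hash: actor_hash.to_string(),
        severity,
    }
}

fn main() {
    let rec = project_to_lite(
        1_700_000_000, 3002, "ent-9f2c", 3,
        "free-text payload present at 1.x, gone after projection",
    );
    assert_eq!(rec.class_uid, 3002);
    assert_eq!(rec.actor_hash, "ent-9f2c");
}
```

Because the kept fields are enumerated in one place, a partner can audit exactly which OCSF 1.x attributes survive into the published rate.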

Labeling

SOC analyst confirmation that contributed windows are clean. Three-class label per window:

  • confirmed-benign. Analyst reviewed and confirmed no incident.
  • suspected-benign-not-investigated. No alerts fired; not specifically reviewed.
  • excluded-known-incident. Known incident or under-investigation period; excluded from the corpus with timestamp ranges.

Without this label discipline, an undetected incident in the “benign” set silently invalidates the rate claim. The exclusion timestamps are the audit trail for the published number.
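The label discipline maps naturally onto a closed enum, so nothing unlabeled can slip into the denominator. The type and function names below are illustrative, not the harness's actual API.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum WindowLabel {
    ConfirmedBenign,
    SuspectedBenignNotInvestigated,
    ExcludedKnownIncident,
}

/// Only confirmed-benign windows may enter the published-rate
/// denominator; excluded windows keep their timestamp ranges as the
/// audit trail for the number.
fn rate_eligible(label: WindowLabel) -> bool {
    label == WindowLabel::ConfirmedBenign
}

fn main() {
    let labels = [
        WindowLabel::ConfirmedBenign,
        WindowLabel::SuspectedBenignNotInvestigated,
        WindowLabel::ExcludedKnownIncident,
        WindowLabel::ConfirmedBenign,
    ];
    let eligible = labels.iter().filter(|l| rate_eligible(**l)).count();
    assert_eq!(eligible, 2); // the other two classes never feed the rate
}
```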

Redaction

  • Order-preserving hashes for entity identifiers (user IDs, hostnames, source IPs). Multi-event sequences must still link correctly within the contributed window.
  • Drop: free-text payloads, command lines (or hash them), document names, query strings, file paths.
  • Preserve: timestamps with timezone information. Timezone matters for noisy-but-benign windows such as backup cycles and business-hours patterns.
  • Aggregate-only publication: raw events stay private to the harness; only the aggregate rate number lands in pilot collateral.

We will share a redaction-tooling reference and a round-trip-fidelity test harness so the partner team can run the redaction before any data leaves the environment.
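The linkage requirement in the first bullet, the same entity must map to the same token everywhere in the contributed window, is what a consistent keyed hash provides. The sketch below is illustrative only: a real deployment would use a keyed cryptographic hash (e.g. HMAC-SHA-256) with a partner-held secret; std's `DefaultHasher` stands in so the example compiles without external crates and is not cryptographically suitable.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministic keyed pseudonymization: same (salt, entity) pair
/// always yields the same token, so multi-event sequences still link
/// within the window while the raw identifier never leaves the
/// partner environment.
fn pseudonymize(salt: &str, entity_id: &str) -> String {
    let mut h = DefaultHasher::new();
    (salt, entity_id).hash(&mut h);
    format!("ent-{:016x}", h.finish())
}

fn main() {
    let salt = "partner-held-secret"; // stays inside the partner env
    let a1 = pseudonymize(salt, "alice@corp.example");
    let a2 = pseudonymize(salt, "alice@corp.example");
    let b = pseudonymize(salt, "bob@corp.example");
    assert_eq!(a1, a2); // same entity links across events
    assert_ne!(a1, b);  // distinct entities stay distinct
}
```

Running the same function inside the redaction harness and inside the round-trip-fidelity test is what lets the partner verify linkage before any data leaves.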

Provenance

  • Per-record source platform identifier (which EDR, which IdP).
  • Cross-tenant disclosure threshold for any published aggregate:
    • v1 acceptable: at least one consenting partner per archetype. Partner is named only with explicit permission. No claim of k-anonymity is made at this threshold.
    • v1.1 target: three or more partners per archetype, aggregated together (k ≥ 3) so no single org’s idiosyncrasies dominate the published rate and individual-tenant attribution is not recoverable from the aggregate.

Negative Controls

The partner discloses any incidents, even minor ones, that occurred in the contributed window so the windows can be excluded. A “we don’t think anything happened” assurance is not enough; absence of evidence-of-investigation is not evidence of benignness.

How a Partnership Unfolds

  1. First conversation. We walk through the fit between your telemetry depth and the four archetypes. No data leaves your environment at this stage.
  2. NDA. Standard mutual NDA. Redacted exports are the default; raw partner telemetry never leaves the partner environment in identifiable form.
  3. Pilot scope. We agree the archetype, window length, label provenance, and right-to-recall mechanics together. The redaction harness ships before any data extraction.
  4. Data extraction and round-trip. The partner team runs the redaction in their environment, validates the round-trip-fidelity test, then ships the redacted corpus.
  5. Calibration run. We run the corpus through the harness and produce the per-archetype rate numbers with Wilson 95% confidence intervals.
  6. Co-authored output. Joint review of the publishable aggregate, methodology notes, and any public output before release. Partner attribution per the agreed terms.

The partner shape that makes this work land is two complementary tenants: one identity-heavy (a tech company with strong IdP telemetry such as Okta or Azure AD) and one infrastructure-heavy (a regulated organisation with NetFlow and EDR depth, such as financial services or healthcare). Together those two cover three of the four archetypes; data-plane coverage comes via either if the partner has cloud audit log depth.

Reach Out

If you are running infrastructure where this kind of contribution is feasible, we would value the conversation. Contact us and reference this brief in your message; we will respond directly with the redaction harness, the round-trip-fidelity test, and a draft mutual NDA.
