Cybersecurity Calibration: Partner Brief

What a design-partner contribution looks like, and what comes back

Scope: Cybersecurity adapter, real-world calibration
Updated: April 2026

About the Cybersecurity Adapter

KAIROS Substrate is a deterministic Rust kernel that computes the structural margin of a system under load. The cybersecurity adapter routes that engine into a defended-zone evaluation surface. It reads normalized telemetry per zone, per tick, and returns a stability score: how much pressure a defended zone can still absorb before its weakest control collapses. The verdict is hash-bound to the calibration anchors and replay-deterministic across processes. Two operators replaying the same incident reach the same answer.

This is structural early warning, not pattern matching. It does not replace a SIEM, an EDR, or an XDR. It complements those tools with a reading they were not designed to produce: reachability of the safe set per defended zone. The Mythos-shaped sandbox-escape fixture in our smoke corpus walks the exact sequence: privilege pressure rises, segmentation collapses, exfiltration arrives. The zone reaches active intrusion before the exfiltration jump.

The public methodology debrief lives on the Spindle: Calibrating the Cybersecurity Adapter. Background framing for the structural-margin approach lives on the Cybersecurity Rosetta page.

Why Real-World Data Matters

The v1 synthetic baseline is calibrated against public references: Verizon DBIR, NIST SP 800-53 and SP 800-207, CIS Controls v8, OCSF 1.x, the Los Alamos Unified Host & Network corpus, and the DARPA Operationally Transparent Cyber dataset. The corpus is calibration-honest: every cell carries a confidence tag, and roughly 46% of cells are tagged synthesised because no public reference exists at the resolution required.

That synthesised fraction is the load-bearing item we want to retire. Real telemetry from a design-partner environment lets us replace the synthetic policy-positive rate per zone per hour with one calibrated against a real enterprise baseline. The headline shifts from defensible-but-dismissible to “calibrated against partner tenants, with X% drift from public anchors.” That drift number is itself a research result, citable in workshop and conference write-ups.

The metric, plainly: the policy-positive rate per zone per hour is the count of windows the configured gate policy flags on reviewer-labeled confirmed-benign traffic, divided by the zone-hours observed. SOC teams reason about alarm budgets in this dimensional form. This is distinct from a true false-positive rate, which would require an independent ground-truth labelling pass — the rate we publish is what the policy does on traffic the partner has confirmed clean. A partner contribution closes the gap that currently sits between the public references and a real production environment.
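The arithmetic behind the published number can be made concrete. The sketch below is illustrative, not the harness's actual code; the function names (`policy_positive_rate`, `wilson_95`) are assumptions, but the rate definition (flagged windows over zone-hours) and the Wilson 95% interval match what the brief describes.

```rust
/// Policy-positive rate: windows the gate policy flags on
/// confirmed-benign traffic, per zone-hour observed.
fn policy_positive_rate(flagged_windows: u64, zone_hours: f64) -> f64 {
    flagged_windows as f64 / zone_hours
}

/// Wilson 95% score interval on the per-window flag proportion k/n.
/// Returns (lower, upper) bounds; robust at the small counts typical
/// of a quiet-zone baseline, unlike the naive normal approximation.
fn wilson_95(flagged: u64, windows: u64) -> (f64, f64) {
    let z = 1.959_963_985_f64; // 97.5th percentile of the standard normal
    let n = windows as f64;
    let p = flagged as f64 / n;
    let denom = 1.0 + z * z / n;
    let center = (p + z * z / (2.0 * n)) / denom;
    let half = (z / denom) * (p * (1.0 - p) / n + z * z / (4.0 * n * n)).sqrt();
    (center - half, center + half)
}

fn main() {
    // Example: 12 flagged windows out of 4_320 one-minute windows
    // (= 72 zone-hours) in a single-zone contribution.
    let (lo, hi) = wilson_95(12, 4_320);
    println!(
        "rate = {:.3} per zone-hour, Wilson 95% CI on proportion = [{:.5}, {:.5}]",
        policy_positive_rate(12, 72.0), lo, hi
    );
}
```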

What Partners Receive

  • A calibrated policy-positive rate per zone per hour measured against your own environment, with Wilson 95% confidence intervals per zone archetype.
  • Visibility into where your structural margin sits per defended zone, against a deterministic policy floor you control.
  • A right-to-recall mechanism. If an incident is later identified inside a contributed window, the aggregate is re-run and the correction is disclosed to you, and to any audience the original number reached.
  • Co-authorship credit on any public methodology output (workshop note, paper, blog post) where the partner contribution is load-bearing.
  • Early access to the cybersecurity adapter ahead of general pilot availability.

The Minimum-Viable Ask

In order:

  1. 30 to 90 days of OCSF export covering one or two zone archetypes (whichever you have telemetry depth in).
  2. SOC-confirmed three-class labels on the contributed windows: confirmed-benign, suspected-benign-not-investigated, excluded-known-incident.
  3. Incident disclosure for the contributed window, including minor and suspected events.
  4. Permission to publish the aggregate policy-positive rate per zone per hour (not raw events) in pilot collateral.
  5. Right-to-recall agreement as described above.

The sections below detail what each of those means in practice.

Scope of This Partnership

Three calibration experiments back the adapter pilot. Only one of them genuinely requires real-world data:

  • Benign-baseline calibration. The experiment that produces the headline policy-positive rate per zone per hour. This is the experiment a partner contribution unblocks.
  • Evasion sweep. Synthetic variants of credential-breakout (slow-exfil, session-splitting, decoy-traffic, control-cycling). Sufficient for v1 with an honest call-out. Real-world adversary telemetry, if a partner can offer it, raises evasion-sweep credibility but is not a blocker.
  • Sensitivity sweep. Programmatically varies one input metric at a time over its valid range against a frozen scenario. Stays synthetic by construction; real-world data adds no information at that grid layer.
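The sensitivity sweep's grid shape can be sketched in a few lines. This is an illustration, not the harness implementation; the function name and the six-component array are assumptions, and components are assumed normalized to [0, 1].

```rust
// Vary one metric component over a normalized [0, 1] grid while the
// frozen scenario holds every other component fixed.
fn sweep_component(frozen: [f64; 6], idx: usize, steps: usize) -> Vec<[f64; 6]> {
    (0..=steps)
        .map(|i| {
            let mut scenario = frozen;
            scenario[idx] = i as f64 / steps as f64; // grid point in [0, 1]
            scenario
        })
        .collect()
}

fn main() {
    let frozen = [0.2, 0.5, 0.1, 0.3, 0.0, 0.4];
    let grid = sweep_component(frozen, 1, 10);
    assert_eq!(grid.len(), 11);       // steps + 1 grid points
    assert_eq!(grid[0][1], 0.0);      // swept component spans the range
    assert_eq!(grid[10][1], 1.0);
    assert_eq!(grid[3][0], 0.2);      // other components stay frozen
}
```

Because the grid is exhaustive at its resolution, real-world data genuinely adds nothing at this layer, which is why the sweep stays synthetic by construction.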

The detail below is for the benign-baseline experiment.

What the Data Has to Feed

A CyberMetricSnapshot per zone, per tick, with all 16 CyberMetricComponents derivable from the source telemetry:

  • λ · Attack-surface pressures (6): connectionRateAnomaly, authFailurePressure, privilegeEscalationPressure, exploitSophistication, lateralMovementRate, exfiltrationVelocity
  • γ · Defense controls (6): defenseDepth, networkSegmentation, monitoringCoverage, patchCoverage, detectionLatencyMs, controlHealth
  • Aggregates and metadata (4): attackSurfacePressure (derivable as max-of-λ), defensePosture (derivable as min-of-γ), sourceCount, confidence

The harness owns the windowed aggregation that turns raw events into per-tick metric components. The mapping from raw-event-type → metric-component is the load-bearing calibration knob and is documented explicitly with the corpus on the partner side.
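The derivable aggregates above have simple definitions. The struct below is an illustrative sketch only: the real CyberMetricSnapshot lives in the shipping adapter, and this shape, with components assumed normalized to [0, 1], is inferred from the field list, not copied from it.

```rust
// Hypothetical shape, assumed from the field list above.
struct CyberMetricSnapshot {
    lambda: [f64; 6], // λ · attack-surface pressures, normalized [0, 1]
    gamma: [f64; 6],  // γ · defense controls, normalized [0, 1]
    source_count: u32,
    confidence: f64,
}

impl CyberMetricSnapshot {
    /// attackSurfacePressure: derivable as the max over the λ components.
    fn attack_surface_pressure(&self) -> f64 {
        self.lambda.iter().cloned().fold(f64::MIN, f64::max)
    }
    /// defensePosture: derivable as the min over the γ components.
    fn defense_posture(&self) -> f64 {
        self.gamma.iter().cloned().fold(f64::MAX, f64::min)
    }
}

fn main() {
    let snap = CyberMetricSnapshot {
        lambda: [0.1, 0.7, 0.2, 0.3, 0.05, 0.4],
        gamma: [0.9, 0.6, 0.8, 0.95, 0.7, 0.85],
        source_count: 3,
        confidence: 0.8,
    };
    // Max-of-λ and min-of-γ: the worst pressure meets the weakest control.
    assert_eq!(snap.attack_surface_pressure(), 0.7);
    assert_eq!(snap.defense_posture(), 0.6);
}
```

Max-of-λ and min-of-γ reflect the weakest-control framing earlier in the brief: one collapsing control drags the posture down regardless of the other five.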

Coverage Matrix

Four zone archetypes, three benignness profiles per archetype:

  • Identity plane. Source telemetry: IdP and SSO logs (Okta, Azure AD, Auth0); AD security events 4624 / 4625 / 4768 / 4769 / 4776; MFA challenge logs; privilege-grant events; CASB. Profiles: quiet; noisy-but-benign (onboarding/offboarding bursts, password-rotation cycles); near-miss (legitimate privilege-escalation review windows).
  • Edge device. Source telemetry: EDR raw events (CrowdStrike, SentinelOne, Defender); DNS resolver logs; USB and peripheral events; outbound flow summaries; Zeek or Suricata where available. Profiles: quiet; noisy-but-benign (patch-deploy windows, vuln-scan cycles); near-miss (sanctioned penetration tests, internal red-team).
  • Internal segment. Source telemetry: NetFlow or IPFIX; east-west firewall deny logs; Zeek or Suricata; segmentation enforcement events. Profiles: quiet; noisy-but-benign (backup windows, replication, deploy traffic); near-miss (legitimate cross-segment admin work).
  • Data plane. Source telemetry: cloud audit logs (CloudTrail, GCP audit, Azure activity); object-store access logs; DLP product events; sensitive-store query logs. Profiles: quiet; noisy-but-benign (ETL windows, bulk export jobs); near-miss (analyst pulling sensitive datasets for sanctioned work).

A partner does not need to cover all four. Whichever one or two archetypes match your telemetry depth is the right contribution.

Time and Volume

  • Minimum: 30 days continuous per archetype.
  • Credible target: 60 to 90 days per archetype.
  • Why: the report characterises the policy-positive rate with a confidence interval. Rare-but-real events (quarterly pen-tests, DR drills, end-of-month batch jobs, holiday traffic patterns) only show up across multiple weeks. A 7-day window can pass clean and silently understate the long-run rate, overstating the strength of the claim.

Granularity

  • Per-tick aggregation at 60-second resolution is sufficient and matches the existing fixture cadence.
  • Source telemetry can be raw events; the harness performs windowed aggregation.
  • The raw-event-type → metric-component mapping must be documented and deterministic. The same input record set must produce byte-identical metric components on re-run.
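One way to satisfy the byte-identical requirement, sketched under assumptions (fixed-point representation and a sum aggregator are illustrative choices, not the harness's documented mapping): hold components as fixed-point integers and aggregate with order-independent arithmetic, so a re-run, or a differently ordered export of the same records, yields identical bytes.

```rust
/// Aggregate raw event values (fixed-point, scaled by 1000) into one
/// per-tick metric component. Integer summation is exact and
/// order-independent, unlike naive f64 accumulation, whose rounding
/// can depend on event order.
fn tick_component(values_millis: &[i64]) -> i64 {
    values_millis.iter().sum()
}

fn main() {
    let export_a = vec![1_500_i64, 250, 250, 4_000];
    let mut export_b = export_a.clone();
    export_b.reverse(); // same records, different on-disk order
    assert_eq!(tick_component(&export_a), tick_component(&export_b));
}
```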

Format

  • Strong preference: OCSF 1.x export (Open Cybersecurity Schema Framework). Already normalises across vendors and is becoming the de facto standard for telemetry sharing under NDA.
  • Acceptable fallback: raw vendor JSON with documented schema. The translation cost is ours to absorb.

Bridge note on the ingest path. The shipping Rust adapter currently ingests an OCSF-lite projection — a documented subset of OCSF 1.x covering the fields the harness aggregates into the 16 CyberMetricComponents. A partner OCSF 1.x export is not consumed natively end-to-end; it is projected to OCSF-lite at ingest, either by us as part of the calibration replay or by an explicit translator we ship to the partner. The projection is one-way (1.x → lite) and documented per field, so the partner can audit which OCSF 1.x attributes are load-bearing for the published rate and which are dropped. Native OCSF 1.x ingest is a planned follow-on; it is not required for offline calibration.
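The one-way projection can be pictured as a function that keeps only the load-bearing fields. This is a hypothetical sketch: the field names (`time`, `class_uid`, `actor_hash`, `severity`) and the record shape are illustrative, not the shipped OCSF-lite schema.

```rust
#[derive(Debug, PartialEq)]
struct OcsfLiteRecord {
    time: i64,          // event timestamp, load-bearing for windowing
    class_uid: u32,     // OCSF event class uid, drives the metric mapping
    actor_hash: String, // pseudonymized entity identifier
    severity: u8,
}

/// Project a (simplified) OCSF 1.x record down to the lite subset.
/// Anything not listed in the return value is dropped; the projection
/// is one-way by construction.
fn project_to_lite(
    time: i64,
    class_uid: u32,
    actor_hash: &str,
    severity: u8,
    _raw_payload: &str, // free text, command lines, paths: discarded here
) -> OcsfLiteRecord {
    OcsfLiteRecord {
        time,
        class_uid,
        actor_hash: actor_hash.to_string(),
        severity,
    }
}

fn main() {
    let rec = project_to_lite(
        1_700_000_000, 3002, "ent-9f2c", 3,
        "free-text payload present at 1.x, gone after projection",
    );
    assert_eq!(rec.class_uid, 3002);
    assert_eq!(rec.actor_hash, "ent-9f2c");
}
```

Because the kept fields are enumerated in one place, a partner can audit exactly which OCSF 1.x attributes survive into the published rate.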

Labeling

SOC analyst confirmation that contributed windows are clean. Three-class label per window:

  • confirmed-benign. Analyst reviewed and confirmed no incident.
  • suspected-benign-not-investigated. No alerts fired; not specifically reviewed.
  • excluded-known-incident. Known incident or under-investigation period; excluded from the corpus with timestamp ranges.

Without this label discipline, an undetected incident in the “benign” set silently invalidates the rate claim. The exclusion timestamps are the audit trail for the published number.
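The label discipline maps naturally onto a closed enum, so nothing unlabeled can slip into the denominator. The type and function names below are illustrative, not the harness's actual API.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum WindowLabel {
    ConfirmedBenign,
    SuspectedBenignNotInvestigated,
    ExcludedKnownIncident,
}

/// Only confirmed-benign windows may enter the published-rate
/// denominator; excluded windows keep their timestamp ranges as the
/// audit trail for the number.
fn rate_eligible(label: WindowLabel) -> bool {
    label == WindowLabel::ConfirmedBenign
}

fn main() {
    let labels = [
        WindowLabel::ConfirmedBenign,
        WindowLabel::SuspectedBenignNotInvestigated,
        WindowLabel::ExcludedKnownIncident,
        WindowLabel::ConfirmedBenign,
    ];
    let eligible = labels.iter().filter(|l| rate_eligible(**l)).count();
    assert_eq!(eligible, 2); // the other two classes never feed the rate
}
```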

Redaction

  • Order-preserving hashes for entity identifiers (user IDs, hostnames, source IPs). Multi-event sequences must still link correctly within the contributed window.
  • Drop: free-text payloads, command lines (or hash them), document names, query strings, file paths.
  • Preserve: timestamps with timezone information. Timezone matters for noisy-but-benign windows such as backup cycles and business-hours patterns.
  • Aggregate-only publication: raw events stay private to the harness; only the aggregate rate number lands in pilot collateral.

We will share a redaction-tooling reference and a round-trip-fidelity test harness so the partner team can run the redaction before any data leaves the environment.
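The linkage requirement in the first bullet, the same entity must map to the same token everywhere in the contributed window, is what a consistent keyed hash provides. The sketch below is illustrative only: a real deployment would use a keyed cryptographic hash (e.g. HMAC-SHA-256) with a partner-held secret; std's `DefaultHasher` stands in so the example compiles without external crates and is not cryptographically suitable.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministic keyed pseudonymization: same (salt, entity) pair
/// always yields the same token, so multi-event sequences still link
/// within the window while the raw identifier never leaves the
/// partner environment.
fn pseudonymize(salt: &str, entity_id: &str) -> String {
    let mut h = DefaultHasher::new();
    (salt, entity_id).hash(&mut h);
    format!("ent-{:016x}", h.finish())
}

fn main() {
    let salt = "partner-held-secret"; // stays inside the partner env
    let a1 = pseudonymize(salt, "alice@corp.example");
    let a2 = pseudonymize(salt, "alice@corp.example");
    let b = pseudonymize(salt, "bob@corp.example");
    assert_eq!(a1, a2); // same entity links across events
    assert_ne!(a1, b);  // distinct entities stay distinct
}
```

Running the same function inside the redaction harness and inside the round-trip-fidelity test is what lets the partner verify linkage before any data leaves.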

Provenance

  • Per-record source platform identifier (which EDR, which IdP).
  • Cross-tenant disclosure threshold for any published aggregate:
    • v1 acceptable: at least one consenting partner per archetype. Partner is named only with explicit permission. No claim of k-anonymity is made at this threshold.
    • v1.1 target: three or more partners per archetype, aggregated together (k ≥ 3) so no single org’s idiosyncrasies dominate the published rate and individual-tenant attribution is not recoverable from the aggregate.

Negative Controls

The partner discloses any incidents, even minor ones, that occurred in the contributed window so the windows can be excluded. A “we don’t think anything happened” assurance is not enough; absence of evidence-of-investigation is not evidence of benignness.

How a Partnership Unfolds

  1. First conversation. We walk through the fit between your telemetry depth and the four archetypes. No data leaves your environment at this stage.
  2. NDA. Standard mutual NDA. Redacted exports are the default; raw partner telemetry never leaves the partner environment in identifiable form.
  3. Pilot scope. We agree the archetype, window length, label provenance, and right-to-recall mechanics together. The redaction harness ships before any data extraction.
  4. Data extraction and round-trip. The partner team runs the redaction in their environment, validates the round-trip-fidelity test, then ships the redacted corpus.
  5. Calibration run. We run the corpus through the harness and produce the per-archetype rate numbers with Wilson 95% confidence intervals.
  6. Co-authored output. Joint review of the publishable aggregate, methodology notes, and any public output before release. Partner attribution per the agreed terms.

The partner shape that makes this work land is two complementary tenants: one identity-heavy (a tech company with strong IdP telemetry such as Okta or Azure AD) and one infrastructure-heavy (a regulated organisation with NetFlow and EDR depth, such as financial services or healthcare). Together those two cover three of the four archetypes; data-plane coverage comes via either if the partner has cloud audit log depth.

Reach Out

If you are running infrastructure where this kind of contribution is feasible, we would value the conversation. Contact us and reference this brief in your message; we will respond directly with the redaction harness, the round-trip-fidelity test, and a draft mutual NDA.
