Forensic Copilot & Counterfactual Replay

Two features, designed around each other and documented together, because the Copilot's most valuable action is running counterfactual replays. Together they are the killer demo of the security-wedge roadmap.

The investigative question

A market-defining forensics product needs to answer a question that observability tools and gateways cannot:

What would have happened if X had been different at the moment of the incident?

That is the counterfactual — and it is what auditors, CISOs, and AI governance leads actually ask.

Counterfactual Replay re-executes a captured trace with exactly one variable changed, as deterministically as the model allows, and produces a signed diff bundle. Changeable variables (v1): model, model version, system prompt, guardrail state, RAG context, tool registry, sampling params. Each replay records a signed replay event in the ledger and returns a diff bundle (token-level diff, cost delta, IOC delta, narrative summary).
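To make the shape of a replay concrete, here is a minimal sketch of a request that changes one variable and of the diff bundle it yields. Everything named here is an assumption for illustration: the identifier spellings in CHANGEABLE, the ReplayRequest class, the dictionary keys, the whitespace tokenizer, and the use of HMAC-SHA256 as a stand-in for whatever signing scheme the product actually uses. The narrative summary is omitted.

```python
import difflib
import hashlib
import hmac
import json
from dataclasses import dataclass

# The v1 changeable variables from the text; these spellings are assumed.
CHANGEABLE = frozenset({
    "model", "model_version", "system_prompt", "guardrail_state",
    "rag_context", "tool_registry", "sampling_params",
})


@dataclass(frozen=True)
class ReplayRequest:
    """Hypothetical request: one captured trace, exactly one changed variable."""
    trace_id: str
    variable: str
    new_value: object

    def __post_init__(self) -> None:
        if self.variable not in CHANGEABLE:
            raise ValueError(f"not a changeable variable: {self.variable}")


def token_diff(original: str, replayed: str) -> list[str]:
    """Diff on whitespace tokens; a real tokenizer would be model-specific."""
    a, b = original.split(), replayed.split()
    changes = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(a=a, b=b).get_opcodes():
        if tag != "equal":
            changes.append(f"{tag}: {' '.join(a[i1:i2])!r} -> {' '.join(b[j1:j2])!r}")
    return changes


def diff_bundle(original: dict, replayed: dict) -> dict:
    """Build the token diff, cost delta, and IOC delta for two runs."""
    old_iocs, new_iocs = set(original["iocs"]), set(replayed["iocs"])
    return {
        "token_diff": token_diff(original["output"], replayed["output"]),
        "cost_delta": round(replayed["cost_usd"] - original["cost_usd"], 6),
        "ioc_delta": {
            "added": sorted(new_iocs - old_iocs),
            "removed": sorted(old_iocs - new_iocs),
        },
    }


def sign_bundle(bundle: dict, key: bytes) -> str:
    """Sign a canonical JSON form of the bundle (HMAC is a stand-in here)."""
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()
```

A caller would build a ReplayRequest such as ReplayRequest("trace-1", "guardrail_state", "enabled"), run the replay, and pass the original and replayed results to diff_bundle before signing. Rejecting unknown variables at construction time keeps the "one variable changed" rule enforceable rather than conventional.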

Honest about determinism

This is a forensics product, so it cannot overclaim what a replay proves. Bit-exact replay is only possible for seed-pinned, self-hosted models. Every other replay is best-effort: it runs N times, reports the distribution of outcomes, and is explicitly marked non-deterministic in both the workbench UI and the exported bundle. A non-deterministic replay is still evidence; it just is not bit-exact, and the bundle must say so.
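The rule above can be sketched as a small function: take the bit-exact path only when the model is both seed-pinned and self-hosted, otherwise run N times and report the distribution with an explicit flag. The names ReplayResult, replay, and run_once are hypothetical, and comparing whole output strings is a simplification of whatever outcome measure the product reports.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReplayResult:
    deterministic: bool   # False must be surfaced in the UI and the bundle
    runs: int
    distribution: dict    # output text -> fraction of runs that produced it


def replay(run_once: Callable[[], str], *, seed_pinned: bool,
           self_hosted: bool, n: int = 5) -> ReplayResult:
    """Run a replay once if bit-exact is possible, else N times."""
    if seed_pinned and self_hosted:
        # A single run suffices: the same seed on the same weights
        # is assumed to reproduce the output exactly.
        return ReplayResult(True, 1, {run_once(): 1.0})
    # Best-effort path: sample N runs and report how often each output occurred.
    counts = Counter(run_once() for _ in range(n))
    return ReplayResult(False, n, {out: c / n for out, c in counts.items()})
```

Carrying the deterministic flag on the result object, rather than inferring it later from the run count, makes it harder for an exported bundle to omit the label.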

Forensic Copilot

An agent that runs investigations on the chain-of-custody ledger. It is grounded in the IOC library, the case model, and TFEF semantics, rather than being a general LLM with a system prompt. It can triage, generate hypotheses, orchestrate counterfactual replays, draft evidence, and match IOCs across history.

It explicitly cannot close cases autonomously, modify ledger events, or make claims beyond its evidence. Every Copilot suggestion is itself appended to the ledger as a case_action event, so a reviewer can replay the investigation the Copilot ran.
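One way to picture "append-only, and every suggestion is an event" is a hash-chained log. The sketch below is an assumption, not the product's ledger: the Ledger class, its field names, and the SHA-256 chaining are illustrative. Only the case_action event type comes from the text.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first event


def _digest(body: dict) -> str:
    """Hash a canonical JSON encoding of an event body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class Ledger:
    """Append-only log: there is deliberately no update or delete method."""

    def __init__(self) -> None:
        self._events: list[dict] = []

    def append(self, event_type: str, payload: dict) -> dict:
        prev = self._events[-1]["hash"] if self._events else GENESIS
        body = {"type": event_type, "payload": payload, "prev": prev}
        event = {**body, "hash": _digest(body)}
        self._events.append(event)
        return event

    def verify(self) -> bool:
        """Recompute every hash and link; False if any event was altered."""
        prev = GENESIS
        for e in self._events:
            body = {"type": e["type"], "payload": e["payload"], "prev": e["prev"]}
            if e["prev"] != prev or _digest(body) != e["hash"]:
                return False
            prev = e["hash"]
        return True


ledger = Ledger()
ledger.append("case_action", {"actor": "copilot", "suggestion": "replay with guardrails on"})
ledger.append("case_action", {"actor": "copilot", "suggestion": "match IOC across history"})
```

Because each event commits to the hash of the one before it, a reviewer walking the chain sees the Copilot's suggestions in order, and any edit to an earlier event causes verify to fail.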

Why this isn't 'ChatGPT for logs'

The action layer consists of forensic primitives (run_replay, match_ioc, query_chain, draft_evidence), not a general-purpose execute_python. Every action is a chain-of-custody event, and the reasoning is auditable.
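A closed action layer of this kind can be sketched as an allow-list dispatcher. The four action names come from the text; the ActionLayer class, the handler signatures, the event fields, and the use of a plain list as the ledger are assumptions made for illustration.

```python
from typing import Any, Callable

# The forensic primitives named above; nothing else may be registered.
FORENSIC_ACTIONS = frozenset({"run_replay", "match_ioc", "query_chain", "draft_evidence"})


class ActionLayer:
    """Dispatch only allow-listed actions and record each attempt."""

    def __init__(self, handlers: dict[str, Callable[..., Any]], ledger: list) -> None:
        unknown = set(handlers) - FORENSIC_ACTIONS
        if unknown:
            raise ValueError(f"not a forensic primitive: {sorted(unknown)}")
        self._handlers = dict(handlers)
        self._ledger = ledger  # stand-in for the chain-of-custody ledger

    def invoke(self, action: str, **args: Any) -> Any:
        if action not in self._handlers:
            # Rejected attempts are recorded too, so the trail stays complete.
            self._ledger.append({"type": "case_action", "action": action,
                                 "status": "rejected"})
            raise PermissionError(f"action not available: {action}")
        result = self._handlers[action](**args)
        self._ledger.append({"type": "case_action", "action": action,
                             "args": args, "status": "ok"})
        return result


events: list = []
layer = ActionLayer({"match_ioc": lambda pattern: [pattern.upper()]}, events)
matches = layer.invoke("match_ioc", pattern="abc")
```

With this shape, a request for execute_python fails at the dispatcher rather than relying on the model to decline, and the attempt itself is visible in the event trail.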

Last updated 14 May 2026