Pillar 1 — Tamper-evident capture

The foundational capability pillar. It hardens the scaffold-era forensic ledger into a per-tenant, signed, tamper-evident substrate and ships the SDKs that feed it.

Walkthrough — Pillar 1 — Tamper-evident capture

What it is

Cryptographically signed, chain-of-custody token-level capture across model providers (OpenAI, Anthropic, then Bedrock / Vertex / Azure / open-weight) and agent frameworks (LangChain, LlamaIndex, AutoGen, Pydantic AI, raw SDK).

  • Surfaces — a capture SDK (TypeScript first, then Python), a capture-ingest API endpoint, and the hardened ledger schema + append path.

How it works — the split hash

Integrity needs a single serialization point; authenticity needs the signing key in the customer's process. A single hash cannot do both, so the event hash is split:

  • contentHash = sha256(JCS(event content)), producer-signed by the SDK with the tenant's key (ed25519, BYOK).
  • chainHash = sha256(prevHash ‖ contentHash), computed by the server inside the append transaction.

Neither party can corrupt the record alone

The signature proves content authenticity; the chain hash proves chain position. The customer cannot reorder history; the vendor cannot forge content.
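
A minimal sketch of the split, using Node's built-in crypto and an RFC 8785 canonicalizer such as the canonicalize npm package (illustrative only; whether prevHash and contentHash are concatenated as hex strings or raw bytes is an assumption here, not the spec):

// Illustrative sketch, not the SDK's internals.
import { createHash, generateKeyPairSync, sign } from "crypto";
import canonicalize from "canonicalize";

// Producer side (customer process): canonicalize, hash, sign.
const content = {
  eventId: "evt_example",
  eventType: "LLM_COMPLETION",
  occurredAt: "2026-05-14T00:00:00.000Z",
  payload: { model: "gpt-4o", tokens: 128 },
};
const contentHash = createHash("sha256")
  .update(canonicalize(content)!)
  .digest("hex");

const { privateKey } = generateKeyPairSync("ed25519"); // stands in for the tenant's BYOK key
const signature = sign(null, Buffer.from(contentHash), privateKey).toString("base64");

// Server side (inside the append transaction): link the new event to the chain.
const prevHash = "0".repeat(64); // genesis placeholder for the example
const chainHash = createHash("sha256")
  .update(prevHash + contentHash) // hex-string concatenation assumed for illustration
  .digest("hex");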

The SDK ships events fire-and-forget — under a 5ms overhead target, through an async queue, with a bounded offline buffer. The pillar-1-capture migration truncates the scaffold-era chain (a restart, not a backfill — so no tamper-evidence property is ever fabricated); it is destructive and needs explicit approval.

Capturing with the SDK

There are two capture SDKs — TypeScript (sdk/typescript/, @tokenforensics/capture) and Python (sdk/python/, tokenforensics). Each wraps an OpenAI or Anthropic client in one line; every call is captured, with no change to how you use the client:

// TypeScript
import { instrument } from "@tokenforensics/capture";
import OpenAI from "openai";

const openai = instrument(new OpenAI(), {
  apiKey: process.env.TF_API_KEY!,        // transport auth
  tenantId: process.env.TF_TENANT_ID!,
  signingKey: loadPrivateKeyPem(),        // BYOK — never leaves the process
  signingKeyId: process.env.TF_SIGNING_KEY_ID!,
  endpoint: "https://tokenforensics.ai/api/ingest",
});

# Python
import os
from tokenforensics import instrument, InstrumentConfig
from openai import OpenAI

client = instrument(OpenAI(), InstrumentConfig(
    api_key=os.environ["TF_API_KEY"],
    tenant_id=os.environ["TF_TENANT_ID"],
    signing_key=load_private_key_pem(),   # BYOK — never leaves the process
    signing_key_id=os.environ["TF_SIGNING_KEY_ID"],
    endpoint="https://tokenforensics.ai/api/ingest",
))

Walkthrough — TypeScript capture SDK — instrument, call, sealed event
Walkthrough — Python capture SDK — same model, same wire, same hashes

The wrapped call pays almost nothing

The hot path only builds a plain event object and enqueues it — capture adds <5ms p99 to the call. JCS canonicalization, ed25519 signing, batching, and the network ship all run on an async worker. A network outage buffers events to a bounded file and replays them in order on reconnect; the producer-generated eventId makes every re-send a server-side no-op. Capture is fire-and-forget — it never throws into, alters, or measurably delays the wrapped call. Streamed responses are teed and roll up into one event.
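
A hedged sketch of that hot-path shape; the queue bound, batch size, interval, and helper names are illustrative, not the SDK's actual internals:

// Hedged sketch only: names, sizes, and intervals are illustrative, not the SDK's.
type CaptureEvent = {
  eventId: string;
  eventType: string;
  occurredAt: string;
  payload: unknown;
};

const queue: CaptureEvent[] = [];
const MAX_QUEUE = 10_000; // bounded, so capture can never backpressure the wrapped call

// Hot path: build a plain object and enqueue it. Never throws, never awaits.
function enqueue(event: CaptureEvent): void {
  if (queue.length < MAX_QUEUE) queue.push(event);
}

// Background worker: canonicalization, signing, batching, and the POST live behind shipBatch.
async function worker(shipBatch: (events: CaptureEvent[]) => Promise<void>): Promise<void> {
  for (;;) {
    const batch = queue.splice(0, 100);
    if (batch.length > 0) {
      try {
        await shipBatch(batch);
      } catch {
        queue.unshift(...batch); // keep order; producer eventIds make any re-send a no-op
      }
    }
    await new Promise((resolve) => setTimeout(resolve, 250));
  }
}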

Both SDKs hash identically — proven, not promised

The Python SDK's JCS canonicalizer and hash helpers are verified against the same sdk/conformance/corpus.json the TypeScript reference produced — it reproduces every contentHash and chainHash byte-for-byte. That is the first real cross-language proof: a mixed TypeScript + Python fleet builds one consistent, verifiable chain.

Sending an event — POST /api/ingest

The SDK wraps this, but the wire contract is plain HTTP — a signed event (or an { events: [...] } batch) POSTed with the tenant's API key:

curl -X POST https://tokenforensics.ai/api/ingest \
  -H "Authorization: Bearer tf_live_…" \
  -H "Content-Type: application/json" \
  -d '{
    "eventId": "evt_…",
    "eventType": "LLM_COMPLETION",
    "occurredAt": "2026-05-14T00:00:00.000Z",
    "payload": { "model": "claude-opus-4-7", "tokens": 128 },
    "contentHash": "…",
    "signature": "…",
    "signingKeyId": "sk_…"
  }'

The API key authenticates transport only — the tenantId is taken from the key, never the body, so a key can only append to its own tenant's chain. The server recomputes contentHash from the event content and rejects a mismatch: that catches a tampered payload and cross-language hash drift at ingest time rather than in a customer's audit. A batch ingests in one all-or-nothing transaction. The response returns the sealed { eventId, chainSeq, chainHash } per event; re-POSTing the same eventId is a no-op that returns the already-sealed link, so offline-buffer replay and retries are safe.
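
The ingest-side check reads roughly like the sketch below. The helper name is hypothetical, and two details are assumptions: that "event content" means the event minus its contentHash / signature / signingKeyId envelope, and that the ed25519 signature covers the contentHash itself.

import { createHash, createPublicKey, verify } from "crypto";
import canonicalize from "canonicalize";

interface IngestEvent {
  eventId: string;
  eventType: string;
  occurredAt: string;
  payload: unknown;
  contentHash: string;
  signature: string; // assumed: base64 ed25519 signature over contentHash
  signingKeyId: string;
}

// Hypothetical helper: recompute the content hash, then verify the producer signature.
function verifyIngest(event: IngestEvent, publicKeyPem: string): boolean {
  // Assumed envelope split: everything except the hash/signature fields is "event content".
  const { contentHash, signature, signingKeyId: _keyId, ...content } = event;

  const recomputed = createHash("sha256")
    .update(canonicalize(content)!)
    .digest("hex");
  if (recomputed !== contentHash) return false; // tampered payload or cross-language drift

  return verify(
    null, // Ed25519 uses its built-in digest
    Buffer.from(contentHash),
    createPublicKey(publicKeyPem),
    Buffer.from(signature, "base64"),
  );
}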

Walkthrough — POST /api/ingest — the wire contract for capture

Proving the chain holds — the conformance suite

A forensic chain is only as trustworthy as the proof that every implementation builds it identically. sdk/conformance/ is that proof: a frozen, language-neutral corpus.json — event contents with their expected contentHash, a hash-linked chain, and a signature vector — that every TFEF implementation must reproduce byte-for-byte.

Cross-language hash equality, proven not promised

The TypeScript conformance suite asserts the corpus; the Python SDK loads the same corpus.json and reproduces every hash byte-for-byte. A one-byte disagreement in canonical JSON between two implementations would break chain verification silently across a mixed-language fleet — the test that catches it runs in CI on every PR.
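
For flavour, a corpus assertion of the pure-function kind might look like this sketch; the corpus field names (events, content, expectedContentHash) are assumptions about the schema, not the frozen format:

import { readFileSync } from "fs";
import { strict as assert } from "assert";
import { createHash } from "crypto";
import canonicalize from "canonicalize";

// Field names (events, content, expectedContentHash) are assumed for illustration.
const corpus = JSON.parse(readFileSync("sdk/conformance/corpus.json", "utf8"));

for (const vector of corpus.events) {
  const recomputed = createHash("sha256")
    .update(canonicalize(vector.content)!)
    .digest("hex");
  // A single byte of canonical-JSON drift fails here, in CI, not in a customer audit.
  assert.equal(recomputed, vector.expectedContentHash);
}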

All seven assertion classes are wired and green. Four pure-function classes — cross-language hash equality, chain integrity, signature validity, tamper detection — assert against corpus.json in both the TypeScript suite and the Python suite. Three DB-bound classes — tenant isolation, idempotency, ordering — run against a Postgres-service test database in CI's db-tests job. Together they are the contract every TFEF implementation must satisfy.

The conformance suite is intentionally separate from the substrate's own DB tests (tests/ledger.db.test.ts, which covers extra substrate-specific behaviour like batch all-or-nothing and exact error shapes); shared fixture helpers in tests/helpers/append-fixtures.ts keep "what a valid signed event looks like" defined exactly once. .github/workflows/ci.yml is the project's first continuous-integration pipeline.

Walkthrough — The TFEF conformance suite — byte-identical hashes across languages

BYOK signing keys — registration and rotation

Forensic capture is only as defensible as the chain-of-custody around the signing key. Pillar 1's BYOK flow is the substrate side of that story: the customer generates the keypair offline, registers only the public half, and signs every capture event inside their own process. The platform never holds the private key — so the customer can prove, to an auditor, that nothing they did not authorize could have been written to their chain.

Two surfaces:

# Register a public key (per-tenant; bearer auth → tenantId).
POST /api/signing-keys
{ "publicKey": "<pem>", "algorithm": "ed25519", "mode": "byok" }
→ 201 { signingKeyId, tenantId, publicKey, algorithm, mode, createdAt }

# List the tenant's keys — active first, then rotated.
GET  /api/signing-keys
→ 200 { keys: [{ signingKeyId, publicKey, mode, createdAt, rotatedAt }, …] }
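
A hedged sketch of the customer side of registration: generate the keypair offline, keep the private half, POST only the public PEM. The fetch call shape is illustrative; the endpoint and body follow the contract above.

import { generateKeyPairSync } from "crypto";

// Generate the keypair offline; only the public half ever leaves the process.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const publicPem = publicKey.export({ type: "spki", format: "pem" }) as string;
// privateKey stays with the customer (process env, KMS, HSM); it is never registered.

const res = await fetch("https://tokenforensics.ai/api/signing-keys", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.TF_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ publicKey: publicPem, algorithm: "ed25519", mode: "byok" }),
});
const { signingKeyId } = await res.json(); // goes on every capture event signed with this key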

Rotation is itself a signed forensic event

The rotation endpoint accepts a SIGNATURE_ROTATION event signed by the outgoing key and runs three writes in one transaction: append the event to the chain, set rotatedAt on the outgoing SigningKey, and insert the replacement. If the outgoing key's signature does not verify — or it is already rotated — everything rolls back. The chain contains a tamper-evident proof of every rotation that ever happened, authorized by the previous key all the way back to genesis.

# Rotate. The body is the signed event the SDK would otherwise POST
# to /api/ingest — same shape, eventType = SIGNATURE_ROTATION.
POST /api/signing-keys/<OLD_KEY_ID>/rotate
{ "eventId": "evt_rot_…", "eventType": "SIGNATURE_ROTATION",
  "occurredAt": "...", "payload": { "newPublicKey": "...", "newMode": "byok" },
  "contentHash": "...", "signature": "<signed by OLD key>",
  "signingKeyId": "<OLD_KEY_ID>" }
→ 200 { rotationEvent: { eventId, chainSeq, chainHash },
        deprecatedKey: { id, rotatedAt },
        newKey: { signingKeyId, publicKey, mode, createdAt, … } }
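
Client-side, building that body is the same canonicalize, hash, sign sequence as any capture event, only signed with the outgoing key. A sketch, with the same assumptions about hashing and what the signature covers as earlier:

import { createHash, generateKeyPairSync, sign } from "crypto";
import canonicalize from "canonicalize";

// Stand-ins for the real keys: the outgoing BYOK key and the incoming replacement.
const { privateKey: oldPrivateKey } = generateKeyPairSync("ed25519");
const { publicKey: newPublicKey } = generateKeyPairSync("ed25519");
const newPublicPem = newPublicKey.export({ type: "spki", format: "pem" }) as string;
const oldKeyId = "sk_old_example"; // the registered id of the outgoing key

const rotationContent = {
  eventId: "evt_rot_example",
  eventType: "SIGNATURE_ROTATION",
  occurredAt: new Date().toISOString(),
  payload: { newPublicKey: newPublicPem, newMode: "byok" },
};

// Same canonicalize, hash, sign sequence as a capture event, signed by the OLD key.
const contentHash = createHash("sha256")
  .update(canonicalize(rotationContent)!)
  .digest("hex");

const rotationEvent = {
  ...rotationContent,
  contentHash,
  signature: sign(null, Buffer.from(contentHash), oldPrivateKey).toString("base64"),
  signingKeyId: oldKeyId,
};
// POST rotationEvent to /api/signing-keys/<OLD_KEY_ID>/rotate as in the example above.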

The substrate logic lives in src/lib/signing-keys.ts (registerSigningKey / rotateSigningKey / listSigningKeys); the rotation transaction composes with the ledger via the exported appendWithinTransaction so the event-append and the key-row writes land atomically.
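
For orientation, the rotation transaction's shape might look roughly like the sketch below. Everything here is a hypothetical reconstruction: the Prisma-style transaction client, the model field names, and the appendWithinTransaction signature are assumptions, not the real src/lib/signing-keys.ts.

// Hypothetical reconstruction only; the real logic lives in src/lib/signing-keys.ts.
declare function appendWithinTransaction(
  tx: unknown,
  tenantId: string,
  event: unknown,
): Promise<{ eventId: string; chainSeq: number; chainHash: string }>;

async function rotateSketch(db: any, tenantId: string, oldKeyId: string, event: any) {
  return db.$transaction(async (tx: any) => {
    // 1. Append the SIGNATURE_ROTATION event to the tenant's chain (signature checked inside).
    const rotationEvent = await appendWithinTransaction(tx, tenantId, event);
    // 2. Mark the outgoing key as rotated.
    const deprecatedKey = await tx.signingKey.update({
      where: { id: oldKeyId },
      data: { rotatedAt: new Date() },
    });
    // 3. Insert the replacement key from the event payload.
    const newKey = await tx.signingKey.create({
      data: { tenantId, publicKey: event.payload.newPublicKey, mode: event.payload.newMode },
    });
    // Any throw above (bad signature, already-rotated key) rolls all three writes back.
    return { rotationEvent, deprecatedKey, newKey };
  });
}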

Walkthrough — BYOK — register and rotate a signing key

Build status

Built: the substrate core, the capture-ingest endpoint, the conformance suite + CI, and both capture SDKs (TypeScript + Python). Concretely:

  • the pillar-1-capture schema migration (Tenant / ApiKey / SigningKey + the per-tenant signed-chain TokenAuditEvent), the RFC 8785 JCS canonicalizer, the content/chain hash helpers, the ed25519 sign/verify primitives, and the rewritten signed appendAuditEvent (signature verification, per-tenant advisory locking, idempotent dedup);
  • POST /api/ingest with bearer API-key auth, batch support, and a per-key rate limit;
  • the sdk/conformance/ corpus + suite wired into CI — the full seven-class suite is now green: four pure-function classes (cross-language hash equality, chain integrity, signature validity, tamper detection) plus three DB-bound (tenant isolation, idempotency, ordering);
  • the sdk/typescript/ and sdk/python/ capture SDKs (instrument() for OpenAI + Anthropic, async ship-and-forget worker, offline buffer, streaming roll-up), with the Python SDK reproducing the shared conformance corpus byte-for-byte;
  • BYOK signing-key registration and the signed rotation flow — POST /api/signing-keys, GET /api/signing-keys, POST /api/signing-keys/{id}/rotate — with the rotation event appended to the chain through the same substrate writer as capture.

Still ahead: framework adapters (LangChain, LlamaIndex, AutoGen, Pydantic AI) and the capture workbench UI.

Why it matters

It is the precondition for everything else. An engineer or security analyst gets token-level capture they can prove was not altered after the fact. Without forensic-grade capture, nothing downstream — attribution, incident replay, compliance evidence — holds up in an audit.
