
feat(evidence): export policy-mining session rows #3

Merged
drewstone merged 3 commits into main from feat/policy-evidence-export on Jun 26, 2026


@drewstone

Contributor

What changed

  • Adds traces evidence, a CLI export that writes one normalized JSONL row per real agent session.
  • Adds SDK exports: buildPolicyEvidenceRecord and collectPolicyEvidence, plus serialization and file-writing helpers.
  • Stamps repo/git labels, span/tool/token summaries, stuck-loop signals, and explicit provenance: notCampaignCell: true.
  • Documents the boundary: traces exports session evidence for downstream mining; it does not claim benchmark campaign wins.
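
As a rough illustration of the "one normalized JSONL row per session" idea, a row might look something like the sketch below. Only notCampaignCell: true is named in this description; every other field name, the EvidenceRowSketch type, and the toJsonl helper are hypothetical and are not the actual buildPolicyEvidenceRecord or serialization API.

```typescript
// Hypothetical row shape. Only `notCampaignCell` comes from the PR description;
// all other field names are assumptions made for illustration.
interface EvidenceRowSketch {
  sessionId: string;
  harness: string;
  repo: { name: string; branch: string; commit: string };
  spanCount: number;
  toolCalls: Record<string, number>; // tool name -> number of calls
  tokens: { input: number; output: number };
  stuckLoops: { total: number; examples: string[] };
  notCampaignCell: true; // explicit provenance: session evidence, not a benchmark campaign cell
}

// Serialize rows as JSONL: one JSON object per line, newline-terminated.
function toJsonl(rows: EvidenceRowSketch[]): string {
  if (rows.length === 0) return "";
  return rows.map((row) => JSON.stringify(row)).join("\n") + "\n";
}

// Illustrative values only; not taken from a real session.
const exampleRow: EvidenceRowSketch = {
  sessionId: "session-0001",
  harness: "codex",
  repo: { name: "example-repo", branch: "main", commit: "0000000" },
  spanCount: 12,
  toolCalls: { shell: 4, read_file: 3 },
  tokens: { input: 1500, output: 400 },
  stuckLoops: { total: 0, examples: [] },
  notCampaignCell: true,
};

const jsonl = toJsonl([exampleRow]);
```

Typing notCampaignCell as the literal true (rather than boolean) means a row cannot be constructed without the provenance stamp, which is one way to keep the "session evidence, not campaign result" boundary visible to downstream consumers.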

Why

This makes /home/drew/code/traces useful to agent-lab without pretending it owns the experiment runner. The PR shows the adapter layer I meant: trace sessions become compact policy-mining evidence rows, then agent-lab can cluster/validate policies separately.

Verification

  • git merge-tree --write-tree origin/main HEAD => clean merge tree
  • pnpm typecheck
  • pnpm test => 10 files / 67 tests passed
  • pnpm build
  • Live smoke: pnpm dev evidence --harness codex --last 1 --out /tmp/traces-policy-evidence.jsonl --otlp /tmp/traces-policy-evidence.otlp.jsonl => wrote 1 row from 38,085 spans, with repo labels, tool summaries, 708 stuck-loop findings (capped to 25 examples), and notCampaignCell: true.
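
The "708 stuck-loop findings capped to 25 examples" behavior in the smoke run can be pictured as keeping the full count while truncating the stored examples. The sketch below is an assumption about the general approach, not the code in this PR; capExamples and its return shape are hypothetical, and the real exporter may select examples differently than taking the first N.

```typescript
// Hypothetical helper: keep the total number of findings, but store only
// the first `maxExamples` of them so each evidence row stays compact.
function capExamples<T>(
  findings: T[],
  maxExamples: number,
): { total: number; examples: T[] } {
  return {
    total: findings.length,
    examples: findings.slice(0, Math.max(0, maxExamples)),
  };
}

// Synthetic findings sized to match the smoke-run numbers quoted above.
const findings = Array.from({ length: 708 }, (_, i) => `finding-${i}`);
const summary = capExamples(findings, 25);
```

Recording the total alongside the capped list lets a downstream miner see how severe the looping was without the row growing with every finding.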

Included commits

This branch also includes the two local commits already ahead of origin/main: repo/git grouping labels and the agent-eval dependency bump.

@drewstone drewstone merged commit e5d75bf into main Jun 26, 2026
1 check passed
