Merged
4 changes: 3 additions & 1 deletion docs/api/primitive-catalog.md
@@ -337,7 +337,7 @@ Import from `@tangle-network/agent-runtime/intelligence` — 60 exports.

### Recursive atom + loop kernel (alias of ./runtime)

Import from `@tangle-network/agent-runtime/loops` — 379 exports.
Import from `@tangle-network/agent-runtime/loops` — 381 exports.

| Symbol | Kind | Summary |
|---|---|---|
@@ -651,6 +651,7 @@ Import from `@tangle-network/agent-runtime/loops` — 379 exports.
| `SupervisorOpts` | interface | _(no summary — add a TSDoc line at the declaration)_ |
| `SupervisorProfile` | interface | The supervisor's profile — the subset of an `AgentProfile` that selects + shapes its brain. |
| `SurfaceScore` | interface | _(no summary — add a TSDoc line at the declaration)_ |
| `ToolLoopCompaction` | interface | Self-compaction — bound the loop's OWN context window the way a fresh-respawn (dumb-Ralph) loop |
| `ToolSpec` | interface | _(no summary — add a TSDoc line at the declaration)_ |
| `TraceSource` | interface | _(no summary — add a TSDoc line at the declaration)_ |
| `TrajectoryAnalysis` | interface | _(no summary — add a TSDoc line at the declaration)_ |
@@ -712,6 +713,7 @@ Import from `@tangle-network/agent-runtime/loops` — 379 exports.
| `SteeringDecision` | type | Terminal-or-continue decision shared by all three steering drivers. The |
| `SupervisedResult` | type | Typed terminal result (M2) — a no-winner is NEVER coerced to a best-effort output. |
| `ToolLoopChat` | type | One inference turn over the running conversation + the tool specs → the model's text, any |
| `ToolLoopCompactionOptions` | type | Public supervisor-facing compaction config: same knobs as the primitive, but `distill` is optional |
| `TrajectoryReportFn` | type | `trajectoryReport(...)` — the tree+cost reconstructor. Async (reads journal + optionally blobs). |
| `UsageEvent` | type | Normalized usage event — the single channel every executor reports through, so the |
| `Verify` | type | `verify(spec)` — build the 2-node implement→verifier-gate combinator. |
226 changes: 190 additions & 36 deletions docs/api/runtime.md

Large diffs are not rendered by default.

5 changes: 3 additions & 2 deletions src/runtime/index.ts
@@ -482,8 +482,9 @@ export {
worktreeFanout,
} from './supervise/worktree-fanout'
// The driver-brain seam type a consumer scripts (a mock) or passes (`routerBrain`) into
// `DriverAgentOptions.brain` — the canonical one-inference-turn tool-loop chat.
export type { ToolLoopChat } from './tool-loop'
// `DriverAgentOptions.brain` — the canonical one-inference-turn tool-loop chat. `ToolLoopCompaction`
// is the self-compaction config that bounds the brain's own context window (the supervisor chapter-close).
export type { ToolLoopChat, ToolLoopCompaction, ToolLoopCompactionOptions } from './tool-loop'
export type {
AgentRunSpec,
DefaultVerdict,
107 changes: 100 additions & 7 deletions src/runtime/supervise/coordination-driver.ts
@@ -31,10 +31,16 @@ import {
coordinationVerbNames,
createCoordinationTools,
type MakeWorkerAgent,
type SettledWorker,
} from '../../mcp/tools/coordination'
import type { ToolSpec } from '../router-client'
import { runBrainLoop, type ToolLoopChat } from '../tool-loop'
import type { Agent, Budget, ResultBlobStore, Scope, Spend } from './types'
import {
runBrainLoop,
type ToolLoopChat,
type ToolLoopCompaction,
type ToolLoopCompactionOptions,
} from '../tool-loop'
import type { Agent, Budget, NodeSnapshot, ResultBlobStore, Scope, Spend, TreeView } from './types'

export interface DriverAgentOptions {
readonly name: string
@@ -77,6 +83,44 @@ export interface DriverAgentOptions {
/** Injected clock for the in-loop absolute-deadline guard — keeps the deadline check
* deterministic in tests. Defaults to `Date.now`. */
readonly now?: () => number
/** Give the driver brain a chapter-lifecycle on its OWN context window. The LLM-brain front doors
* lose to a dumb-Ralph respawn because the brain re-bills its whole coordination transcript every
* turn — the same context overflow a single steered agent suffers, one level up. With this set,
* once the brain's running conversation exceeds `thresholdTokens` it distills the accumulated
* history to a compact progress note and continues fresh: the supervisor analog of respawning
* against external tracking state, except the live `Scope` roster IS the durable state. Default
* off (no behavior change). `distill` defaults to a self-summary authored by the brain combined
* with the factual settled-worker roster; override to supply your own. */
readonly compaction?: ToolLoopCompactionOptions
}

/** The default chapter-close prompt: the brain summarizes its OWN progress for its future self before
* the detailed history is dropped. Emphasis on PENDING work — the part a too-eager chapter-close
* loses (the coding-burn counter-finding: closing after one fix leaves integration bugs uncircled). */
const distillInstruction =
'CONTEXT COMPACTION. Your detailed turn-by-turn history is about to be discarded to free your context window. Write a COMPLETE, compact handoff note for your future self so you can keep going without it. Cover: (1) what you have accomplished; (2) every worker you spawned and its current status/result; (3) what subtasks remain unfinished, failing, or unverified — be specific and exhaustive here, this is the part you must not lose; (4) your immediate next action. Do not call any tools; respond with the note only.'

/** Factual ground truth for the digest — the live worker roster from Scope plus the delivered-result
* ledger, independent of whatever the brain's prose summary captures. */
function summarizeRoster(view: TreeView, settled: ReadonlyArray<SettledWorker>): string {
if (view.nodes.length === 0) return 'Workers in current live scope: none yet.'
const settledById = new Map(settled.map((w) => [w.id, w]))
const lines = view.nodes.map((node) => formatRosterNode(node, settledById.get(node.id)))
return `Workers in current live scope (ground truth from the run, ${view.nodes.length} total, ${view.inFlight} in flight):\n${lines.join('\n')}`
}

function formatRosterNode(node: NodeSnapshot, settled?: SettledWorker): string {
const result =
settled?.status === 'done'
? `, delivered=${settled.valid ?? false}${
settled.score !== undefined ? `, score=${settled.score}` : ''
}${settled.outRef ? `, outRef=${settled.outRef}` : ''}`
: settled?.status === 'down'
? `, reason=${settled.reason ?? 'unknown'}`
: node.outRef
? `, outRef=${node.outRef}`
: ''
return `- ${node.id}: ${node.status}, label=${node.label}, runtime=${node.runtime}${result}`
}
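
To make the roster digest concrete, the sketch below re-implements `formatRosterNode` against simplified stand-in types and prints three sample lines. `NodeSketch` and `SettledSketch` are assumptions that carry only the fields the function reads; the real `NodeSnapshot` and `SettledWorker` types are not shown in this diff, and the sample ids, labels, and status strings are invented for illustration.

```typescript
// Stand-in types (assumptions): only the fields formatRosterNode reads in the diff.
type NodeSketch = { id: string; status: string; label: string; runtime: string; outRef?: string }
type SettledSketch = {
  id: string
  status: 'done' | 'down'
  valid?: boolean
  score?: number
  outRef?: string
  reason?: string
}

function formatRosterNode(node: NodeSketch, settled?: SettledSketch): string {
  // Settled facts win over the node snapshot; an unsettled node falls back to its own outRef.
  const result =
    settled?.status === 'done'
      ? `, delivered=${settled.valid ?? false}${
          settled.score !== undefined ? `, score=${settled.score}` : ''
        }${settled.outRef ? `, outRef=${settled.outRef}` : ''}`
      : settled?.status === 'down'
        ? `, reason=${settled.reason ?? 'unknown'}`
        : node.outRef
          ? `, outRef=${node.outRef}`
          : ''
  return `- ${node.id}: ${node.status}, label=${node.label}, runtime=${node.runtime}${result}`
}

const rosterLines = [
  formatRosterNode(
    { id: 'w1', status: 'done', label: 'fix-parser', runtime: 'router' },
    { id: 'w1', status: 'done', valid: true, score: 0.9, outRef: 'blob:1' },
  ),
  formatRosterNode({ id: 'w2', status: 'running', label: 'add-tests', runtime: 'router' }),
  formatRosterNode(
    { id: 'w3', status: 'down', label: 'lint', runtime: 'router' },
    { id: 'w3', status: 'down' },
  ),
]
console.log(rosterLines.join('\n'))
// - w1: done, label=fix-parser, runtime=router, delivered=true, score=0.9, outRef=blob:1
// - w2: running, label=add-tests, runtime=router
// - w3: down, label=lint, runtime=router, reason=unknown
```

The three branches correspond to a delivered worker, a worker with no settled record yet, and a worker that went down without reporting a reason.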

/** maxTurns=0 anti-runaway tripwire: a finite ceiling for the ONE case the conserved pool can't
@@ -176,8 +220,12 @@ export function driverAgent(opts: DriverAgentOptions): Agent<unknown, unknown> {
// drains the pool → poolStarved). Wrapping the brain keeps the debit exactly where it was; a
// scripted/mock turn reports no usage and meters nothing, so offline equal-k stays exact.
// iterations:0 — the conserved iteration channel budgets CHILD rounds, not driver turns.
let turn = 0
const chat: ToolLoopChat = async (messages, tools) => {
let driverTurn = 0
const meteredBrain = async (
messages: ReadonlyArray<Record<string, unknown>>,
tools: ReadonlyArray<ToolSpec>,
detail: Record<string, unknown>,
) => {
const res = await opts.brain(messages, tools)
if (res.usage || res.costUsd !== undefined) {
const turnSpend: Spend = {
@@ -187,19 +235,60 @@ export function driverAgent(opts: DriverAgentOptions): Agent<unknown, unknown> {
ms: 0,
}
await scope.meter(turnSpend, {
kind: 'driver-inference',
driver: opts.name,
turn,
toolCalls: (res.toolCalls ?? []).map((c) => c.name),
...detail,
})
}
turn += 1
return res
}
const chat: ToolLoopChat = async (messages, tools) => {
const turn = driverTurn
const res = await meteredBrain(messages, tools, {
kind: 'driver-inference',
turn,
})
driverTurn += 1
return res
}

// Chapter-close on the brain's own window. The default distiller pairs the factual settled-worker
// roster (from the live scope) with a brain-authored progress note. The distill call goes through
// `meteredBrain`, so its one-time O(history) cost debits the conserved pool like any turn, in
// exchange for the per-turn O(history) re-billing that compaction removes.
const compaction: ToolLoopCompaction | undefined = opts.compaction
? {
thresholdTokens: opts.compaction.thresholdTokens,
distill:
opts.compaction.distill ??
(async (msgs) => {
const roster = summarizeRoster(scope.view, coord.settled())
try {
const res = await meteredBrain(
[...msgs, { role: 'user', content: distillInstruction }],
[],
{ kind: 'driver-compaction', compactingTurn: driverTurn },
)
const narrative = (res.content ?? '').trim()
return narrative ? `${roster}\n\n## Progress notes\n${narrative}` : roster
} catch (e) {
return `${roster}\n\n## Progress notes\nSummary unavailable: ${errMessage(e)}`
}
}),
...(opts.compaction.onCompact ? { onCompact: opts.compaction.onCompact } : {}),
...(opts.compaction.preserveHead !== undefined
? { preserveHead: opts.compaction.preserveHead }
: {}),
...(opts.compaction.estimateTokens
? { estimateTokens: opts.compaction.estimateTokens }
: {}),
}
: undefined

await runBrainLoop({
chat,
tools: toolSpecs,
...(compaction ? { compaction } : {}),
execute: async (name, args) => {
// WORK FIRST: a work tool the driver runs itself (act). A non-null return is handled here;
// null/undefined means "not mine" → fall through to the coordination dispatch (spawn/await/…).
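
The metering split earlier in this hunk can be sketched in isolation: one wrapper meters every brain call, and the caller's label decides whether the call counts as a driver turn or as a compaction. Everything here is a simplified assumption: `Brain`, `MeterEntry`, `makeMeteredDriver`, and the in-memory `ledger` are hypothetical stand-ins for `opts.brain`, `Spend`, and `scope.meter`, whose real shapes are not shown in full.

```typescript
type Usage = { inputTokens: number; outputTokens: number }
type BrainResult = { content?: string; usage?: Usage }
type Brain = (messages: ReadonlyArray<Record<string, unknown>>) => Promise<BrainResult>
type MeterEntry = { tokens: number; detail: Record<string, unknown> }

function makeMeteredDriver(brain: Brain, ledger: MeterEntry[]) {
  let driverTurn = 0

  // Single metering point: every brain call is debited here, tagged with the caller's detail.
  const meteredBrain = async (
    messages: ReadonlyArray<Record<string, unknown>>,
    detail: Record<string, unknown>,
  ): Promise<BrainResult> => {
    const res = await brain(messages)
    // A scripted/mock brain that reports no usage meters nothing.
    if (res.usage) {
      ledger.push({ tokens: res.usage.inputTokens + res.usage.outputTokens, detail })
    }
    return res
  }

  // A regular driver turn: tagged with the current turn number, then the counter advances.
  const chat = async (messages: ReadonlyArray<Record<string, unknown>>) => {
    const res = await meteredBrain(messages, { kind: 'driver-inference', turn: driverTurn })
    driverTurn += 1
    return res
  }

  // A compaction call: metered like a turn, but it does not advance the turn counter.
  const distill = async (messages: ReadonlyArray<Record<string, unknown>>) => {
    const res = await meteredBrain(messages, {
      kind: 'driver-compaction',
      compactingTurn: driverTurn,
    })
    return (res.content ?? '').trim()
  }

  return { chat, distill }
}
```

Because the compaction call leaves `driverTurn` untouched, turn numbering in the ledger stays contiguous across a chapter close while the distill cost is still debited.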
@@ -295,3 +384,7 @@ function safeJson(v: unknown): string {
return String(v)
}
}

function errMessage(e: unknown): string {
return e instanceof Error ? e.message : String(e)
}
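
The `compaction` object built in this file is handed to `runBrainLoop`, which lives outside this diff, so how the loop consumes it is not visible here. The following is a minimal sketch of threshold-triggered compaction under three stated assumptions: `preserveHead` counts leading messages to keep verbatim, `estimateTokens` defaults to a rough four-characters-per-token heuristic, and the digest re-enters the conversation as a single user message. `Message`, `CompactionSketch`, and `compactIfNeeded` are hypothetical names, `distill` is synchronous only to keep the sketch short (the diff's distiller is async), and `onCompact` is not modeled.

```typescript
// Hypothetical message shape; the diff passes ReadonlyArray<Record<string, unknown>>.
type Message = { role: string; content: string }

interface CompactionSketch {
  thresholdTokens: number
  // Assumption: number of leading messages (for example the system prompt) kept verbatim.
  preserveHead?: number
  // Assumption: the real default estimator is not shown; this sketch uses chars / 4.
  estimateTokens?: (messages: ReadonlyArray<Message>) => number
  distill: (messages: ReadonlyArray<Message>) => string
}

const roughTokens = (messages: ReadonlyArray<Message>): number =>
  Math.ceil(messages.reduce((total, m) => total + m.content.length, 0) / 4)

function compactIfNeeded(messages: ReadonlyArray<Message>, opts: CompactionSketch): Message[] {
  const estimate = opts.estimateTokens ?? roughTokens
  // Only a conversation that EXCEEDS the threshold is compacted.
  if (estimate(messages) <= opts.thresholdTokens) return [...messages]
  const head = messages.slice(0, opts.preserveHead ?? 1)
  // Distill over the full history, then continue from head + digest.
  const digest = opts.distill(messages)
  return [...head, { role: 'user', content: `Progress so far:\n${digest}` }]
}

const history: Message[] = [
  { role: 'system', content: 'You coordinate workers.' },
  { role: 'assistant', content: 'a'.repeat(400) },
  { role: 'user', content: 'b'.repeat(400) },
  { role: 'assistant', content: 'c'.repeat(400) },
]
const sketchOpts: CompactionSketch = {
  thresholdTokens: 100,
  distill: (m) => `${m.length} messages summarized`,
}
const compacted = compactIfNeeded(history, sketchOpts)
console.log(compacted.length) // 2
```

With the default distiller from this diff, the digest would be the roster text followed by the brain's progress note, so the worker ledger survives even when the note is empty or the summary call fails.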
9 changes: 8 additions & 1 deletion src/runtime/supervise/supervise.ts
@@ -11,7 +11,7 @@ import type { AgentProfile } from '@tangle-network/sandbox'
import { ValidationError } from '../../errors'
import type { MakeWorkerAgent } from '../../mcp/tools/coordination'
import type { RouterConfig } from '../router-client'
import type { ToolLoopChat } from '../tool-loop'
import type { ToolLoopChat, ToolLoopCompactionOptions } from '../tool-loop'
import { type DeliverableSpec, gateOnDeliverable } from './completion-gate'
import { assertModelAllowed } from './model-policy'
import { createInMemoryRunContext } from './run-context'
@@ -83,6 +83,12 @@ export interface SuperviseOptions {
readonly blobs?: ResultBlobStore
readonly maxDepth?: number
readonly maxTurns?: number
/** Give the supervisor brain a chapter-lifecycle on its OWN context window (router arm only): once
* its coordination transcript exceeds `thresholdTokens` it distills to a compact progress note and
* continues, instead of re-billing the whole transcript every turn (the cost that makes the LLM-brain
* front door lose to a dumb-Ralph respawn). The live `Scope` roster is the durable state across
* chapters. Default off. `distill` defaults to a brain self-summary + the settled-worker roster. */
readonly compaction?: ToolLoopCompactionOptions
readonly runId?: string
readonly now?: () => number
/** Restrict the run to this subset of models. When set, every configured model — the
@@ -135,6 +141,7 @@ export function supervise(profile: SupervisorProfile, task: unknown, opts: Super
...(opts.extraTools ? { extraTools: opts.extraTools } : {}),
...(opts.executeExtraTool ? { executeExtraTool: opts.executeExtraTool } : {}),
...(opts.maxTurns !== undefined ? { maxTurns: opts.maxTurns } : {}),
...(opts.compaction ? { compaction: opts.compaction } : {}),
})

return createSupervisor<unknown, unknown>().run(agent, task, {
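
The `...(opts.compaction ? { compaction: opts.compaction } : {})` forwarding used in this file leaves the key absent when the option is unset, rather than present with the value `undefined`. A small sketch with hypothetical `OuterOpts` and `InnerOpts` types (not the library's) shows the difference. The pattern also type-checks under TypeScript's `exactOptionalPropertyTypes`, though whether this repository enables that flag is not stated in the diff.

```typescript
type CompactionOpts = { thresholdTokens: number }

// Hypothetical option bags standing in for SuperviseOptions and the inner deps.
interface OuterOpts {
  maxTurns?: number
  compaction?: CompactionOpts
}
interface InnerOpts {
  readonly maxTurns?: number
  readonly compaction?: CompactionOpts
}

// Forward only the options the caller actually set.
function forwardOpts(opts: OuterOpts): InnerOpts {
  return {
    ...(opts.maxTurns !== undefined ? { maxTurns: opts.maxTurns } : {}),
    ...(opts.compaction ? { compaction: opts.compaction } : {}),
  }
}

const unset = forwardOpts({})
const set = forwardOpts({ compaction: { thresholdTokens: 8000 } })
console.log('compaction' in unset) // false
console.log(set.compaction?.thresholdTokens) // 8000
```

Keeping the key absent means a downstream `opts.compaction ? ... : undefined` check, like the one in `driverAgent`, sees the same "default off" state whether the caller omitted the option at the outer or the inner layer.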
13 changes: 12 additions & 1 deletion src/runtime/supervise/supervisor-agent.ts
@@ -16,7 +16,7 @@
import { ValidationError } from '../../errors'
import type { MakeWorkerAgent } from '../../mcp/tools/coordination'
import { type RouterConfig, routerBrain } from '../router-client'
import type { ToolLoopChat } from '../tool-loop'
import type { ToolLoopChat, ToolLoopCompactionOptions } from '../tool-loop'
import { driverAgent, finalizeBestDelivered } from './coordination-driver'
import { serveCoordinationMcp } from './coordination-mcp'
import type { Agent, Budget, ResultBlobStore, Scope } from './types'
@@ -95,6 +95,10 @@ export interface SupervisorAgentDeps {
args: Record<string, unknown>,
) => Promise<string | null | undefined>
readonly maxTurns?: number
/** Give the supervisor brain a chapter-lifecycle on its OWN context window (router arm only) — it
* distills its coordination transcript to a compact progress note once it exceeds the threshold,
* instead of re-billing the whole thing every turn. See `DriverAgentOptions.compaction`. */
readonly compaction?: ToolLoopCompactionOptions
}

export function supervisorAgent(
@@ -105,6 +109,12 @@ export function supervisorAgent(
const systemPrompt = profile.systemPrompt ?? defaultSupervisorPrompt
const harness = profile.harness ?? null

if (harness !== null && deps.compaction) {
throw new ValidationError(
'supervisorAgent: compaction is only supported for router-brained supervisors (profile.harness null)',
)
}

if (harness === null) {
// ROUTER arm: the in-process tool-loop. `routerBrain` is now an internal detail — the caller
// passes a profile, not a hand-built brain (a test may still inject `deps.brain`).
@@ -120,6 +130,7 @@ export function supervisorAgent(
...(deps.extraTools ? { extraTools: deps.extraTools } : {}),
...(deps.executeExtraTool ? { executeExtraTool: deps.executeExtraTool } : {}),
...(deps.maxTurns !== undefined ? { maxTurns: deps.maxTurns } : {}),
...(deps.compaction ? { compaction: deps.compaction } : {}),
})
}
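
The guard added in this file rejects `compaction` for harness-backed supervisors up front instead of silently ignoring it. Below is a reduced sketch of that fail-fast check. The local `ValidationError` class, the `assertCompactionSupported` and `tryGuard` helpers, and the `string | null` harness type are hypothetical stand-ins for the library's own code.

```typescript
class ValidationError extends Error {
  constructor(message: string) {
    super(message)
    this.name = 'ValidationError'
  }
}

type CompactionOpts = { thresholdTokens: number }

// harness === null selects the in-process router arm, the only arm with a tool-loop to compact.
function assertCompactionSupported(harness: string | null, compaction?: CompactionOpts): void {
  if (harness !== null && compaction) {
    throw new ValidationError(
      'supervisorAgent: compaction is only supported for router-brained supervisors (profile.harness null)',
    )
  }
}

// Small helper so the three cases can be compared as plain strings.
function tryGuard(harness: string | null, compaction?: CompactionOpts): string {
  try {
    assertCompactionSupported(harness, compaction)
    return 'ok'
  } catch (e) {
    return e instanceof Error && e.name === 'ValidationError' ? 'rejected' : 'unexpected'
  }
}

console.log(tryGuard(null, { thresholdTokens: 8000 })) // ok
console.log(tryGuard('some-harness')) // ok
console.log(tryGuard('some-harness', { thresholdTokens: 8000 })) // rejected
```

Throwing here rather than dropping the option keeps a misconfigured run from proceeding under the belief that its context window is bounded.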
