Skip to content

feat: add session-scoped plan/build mode#3140

Open
trungutt wants to merge 1 commit into
docker:mainfrom
trungutt:plan-mode
Open

feat: add session-scoped plan/build mode#3140
trungutt wants to merge 1 commit into
docker:mainfrom
trungutt:plan-mode

Conversation

@trungutt

@trungutt trungutt commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Why

A planner sub-agent (e.g. the bundled coder agent's planner, pkg/config/builtin-agents/coder.yaml) puts planning under the agent's control. Plan mode puts it under the user's control: a session attribute the host (CLI, TUI, embedder) flips like a switch, and the model can't talk its way out of it.

That distinction matters because:

  • Works on any agent, third-party or hand-rolled — no YAML edit needed to add a planner sub-agent.
  • Hard tool removal, not prompt drift. coder's planner today still ships full filesystem (incl. write_file / edit_file); its read-only-ness is prompt-only. Plan mode strips every tool without Annotations.ReadOnlyHint from what the model sees.
  • Covers runtime-added MCP tools via the MCP-spec ReadOnlyHint annotation, with no YAML to update.
  • Mid-session toggle costs zero turns — one PATCH and the next turn observes it (vs. a transfer + synthesised user prompt for sub-agent handoff).
  • One primitive across hosts, instead of each host parsing agent topology or rolling its own deny-list.

Additive: planner sub-agents stay the right answer when planning belongs in the agent's design; plan mode is the user-driven counterpart.

What

New per-session Mode (build default, plan). In plan mode the runtime:

  1. Filters tools without Annotations.ReadOnlyHint from each turn's toolset (hard guarantee).
  2. Splices a per-turn <system-reminder> telling the model to plan, not act (so it doesn't waste turns trying tools that vanished).

API:

POST   /api/sessions               { ..., "mode": "plan" }
GET    /api/sessions/:id                                      # + snapshot
PATCH  /api/sessions/:id/mode      { "mode": "plan" }

PATCH /mode updates the live in-memory session if a runtime is attached and persists to the store, so the next turn picks it up directly. Persistence is a new mode TEXT column (migration 22); pre-existing rows normalize to build on read.

Filter piggybacks on the existing ExcludedTools filter site in runtime/loop.go; reminder rides the existing turn-start system-message splice.

Scope

In: runtime filter + reminder (incl. harness), API + DTOs, persistence + migration, unit + HTTP tests.

Out (left to hosts): TUI/leantui UX, CLI flags, "exit plan mode" tool — hosts can PATCH the endpoint directly.

Plan mode lets the runtime filter tools to read-only ones and inject a
per-turn system reminder, so the agent drafts a plan instead of taking
actions. Build mode is the default and preserves today's behaviour.

The mode is a per-session property, persisted alongside the rest of the
session, and exposed via:

  - POST   /api/sessions               { ..., mode }
  - GET    /api/sessions/:id           -> { ..., mode }
  - PATCH  /api/sessions/:id/mode      { mode }

Tool filtering is driven by the MCP-spec ReadOnlyHint annotation, so it
extends to user-added MCP tools without any per-tool config.
docker-agent

This comment was marked as low quality.

@aheritier aheritier added area/agent For work that has to do with the general agent loop/agentic features of the app area/api For features/issues/fixes related to the usage of the cagent API area/sessions For features/issues/fixes related to session lifecycle (resume, persistence, export) kind/feat PR adds a new feature (maps to feat: commit prefix) labels Jun 16, 2026
@rumpl

rumpl commented Jun 16, 2026

Copy link
Copy Markdown
Member

Question, why can't "plan mode" be implemented with a second agent that the user can select?

@trungutt

Copy link
Copy Markdown
Contributor Author

Question, why can't "plan mode" be implemented with a second agent that the user can select?

I updated the PR description

@rumpl

rumpl commented Jun 16, 2026

Copy link
Copy Markdown
Member

A planner sub-agent (e.g. the bundled coder agent's planner, pkg/config/builtin-agents/coder.yaml) puts planning under the agent's control.

This is not true, you can always ctrl+number to switch to an agent

@trungutt

Copy link
Copy Markdown
Contributor Author

@rumpl It could, but it scales poorly: every agent (built-in, community, user-authored) would need a hand-maintained "planner" twin kept in lockstep with its prompt, toolsets, hooks and sub-agents. Plan mode is effectively "every agent gets a free read-only twin", derived from each tool's MCP ReadOnlyHint — so user-added MCP servers and third-party agents work without anyone authoring a paired variant.

It also keeps you on the same agent under a different constraint, vs a sibling-agent switch (which has handoff cost: transfer event, possibly synthesised user prompt, separate conversation thread).

@trungutt trungutt requested a review from docker-agent June 16, 2026 13:29
@rumpl

rumpl commented Jun 16, 2026

Copy link
Copy Markdown
Member

Okay but... Not any agent can become a planner agent magically just by removing write tools. Its system prompt needs to be changed I would guess, also you are talking about ReadOnlyHint, but our builtin tools don't have it, do they?

@docker-agent

Copy link
Copy Markdown

PR Review Failed — The review agent encountered an error and could not complete the review. View logs.

@trungutt

trungutt commented Jun 16, 2026

Copy link
Copy Markdown
Contributor Author

Two points:

System prompt. It is changed per turn — planModeReminderMessages in pkg/runtime/plan_mode.go splices a <system-reminder> telling the model it is in plan mode, must not edit/shell/execute, and should draft a plan. Rides the same turn-start splice as turn_start hook output, so it picks up the cache_control marker too. That's the soft layer; the tool filter is the hard layer.

ReadOnlyHint on built-ins. They do have it — 33 occurrences across every read-side tool:

  • pkg/tools/builtin/filesystem/filesystem.go:425,449,462,475,487 — read_file, read_multiple_files, list_directory, directory_tree, search_files_content
  • pkg/tools/builtin/shell/shell.go:592,602 — list_background_jobs, view_background_job
  • pkg/tools/builtin/todo/todo.go:306–341 — all four todo tools
  • pkg/tools/builtin/memory/memory.go:152,175 — get_memories, search_memories
  • pkg/tools/builtin/tasks/tasks.go:575,607,617 — get_task, list_tasks, next_task
  • plus think.go:54, api.go:187, handoff.go:31, transfertask.go:34, mcpcatalog.go:423,435, modelpicker.go:74,84, rag.go:164, userprompt.go:65, skills.go:366,382, deferred.go:151,163, agent.go:480,490

Write-side tools (write_file, edit_file, shell, run_background_job, add_memory/update_memory/delete_memory, run_skill, change_model, lsp_rename/_code_actions/_format, run_background_agent) deliberately don't carry it — exactly what plan mode wants to strip. MCP tools get ReadOnlyHint from the spec itself, so user-added MCP servers and third-party tools are covered with no per-tool config.

On the sibling-agent alternative more broadly — beyond the points already in the description, it's painful operationally: every agent (built-in, community, user-authored) would need a hand-maintained planner twin kept in lockstep with its prompt, toolsets, hooks, and sub-agents. Two YAML surfaces per agent to keep in sync, every parent-agent change silently rots its planner sibling, and every third-party agent author has to do the same work. Plan mode collapses that to one knob across the fleet — including agents we don't author.

@trungutt trungutt marked this pull request as ready for review June 16, 2026 16:05
@trungutt trungutt requested a review from a team as a code owner June 16, 2026 16:05
@docker-agent

Copy link
Copy Markdown

PR Review Failed — The review agent encountered an error and could not complete the review. View logs.

@aheritier

Copy link
Copy Markdown
Contributor

It should cover most of #2788

@dgageot

dgageot commented Jun 19, 2026

Copy link
Copy Markdown
Member

Review — feat: add session-scoped plan/build mode

Overall this is clean and consistent with existing patterns — it builds, has good baseline test coverage, and the migration is backwards compatible. One blocking issue and a couple of concerns below.

🔴 Blocking: plan mode is bypassable via delegated sub-agents

Delegation tools (transfer_task, handoff) are marked read-only, so they survive plan-mode filtering. But delegated child sessions don't appear to inherit the parent's mode — they default back to build and get the full mutating toolset. So a plan-mode agent can't call write_file/shell directly, but it can delegate to a sub-agent that can, which undermines the core safety guarantee of plan mode.

Suggested fix: propagate the parent mode when creating sub-sessions (e.g. session.WithMode(parent.Mode)) in pkg/runtime/agent_delegation.go / pkg/runtime/loop.go, and add a regression test asserting that a plan-mode parent produces plan-mode children with mutating tools filtered out.

If sub-agents are intended to run in build mode, please document that explicitly and soften the reminder text so it doesn't claim state-changing tools are unavailable.

🟡 Harness agents get the reminder but not the hard filtering

In pkg/runtime/harness.go, plan mode injects the reminder but doesn't seem to apply the same tool filtering as the streaming loop. If the harness manages its own toolset, plan mode becomes prompt-only there, making "Only read-only tools have been made available" misleading. Either apply the filtering or adjust the text and document the limitation.

🟡 Possible data race on sess.Mode

UpdateSessionMode writes the mode under the manager mutex, but runtime reads for filtering/reminders don't appear to use the same lock. Consider guarding Mode (or exposing locked accessors) and running the relevant tests with -race.

⚪ Minor

  • Inconsistent invalid-mode handling: session creation normalizes invalid/empty mode → build, while the update endpoint rejects with 400. Prefer making these consistent (rejecting both).
  • Mid-turn mode changes apply on the next turn — matches the comments, not a bug, but worth documenting for API/UI callers.

Test coverage

Good for core plumbing (filtering, reminder, persistence, migration, HTTP, in-memory). Would like to see before merge: a delegation-inheritance test, and a harness-mode behavior test (or explicit documentation that plan mode is advisory there). An optional concurrency/race test around mode updates would be nice.

Recommendation

Request changes — the implementation is largely solid, but the delegated sub-agent bypass breaks the central guarantee of plan mode and should be fixed (or the guarantee redefined/documented) before merging. Looks close after that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/agent For work that has to do with the general agent loop/agentic features of the app area/api For features/issues/fixes related to the usage of the cagent API area/sessions For features/issues/fixes related to session lifecycle (resume, persistence, export) kind/feat PR adds a new feature (maps to feat: commit prefix)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants