Related but distinct issues:
Ultra Mode differs from these by providing a single-agent state machine with hardcoded phase transitions and tool-level enforcement, rather than multi-agent orchestration or plugin systems.
Problem
Currently, going from a requirement to a verified implementation requires manual coordination: switch to plan agent, write a plan, switch to build agent, implement, run tests, fix failures, repeat. This is tedious when you just want to "fire and forget" a well-defined task.
Proposal
Add a new primary agent called ultra that autonomously executes the full plan→build→verify→iterate loop with a hardcoded state machine that enforces correct phase transitions.
State Machine
    planning → building → verifying → complete
                              ↓ (test fail, retries < 10)
                          iterating → verifying
                              ↓ (retries ≥ 10)
                          stop, ask user
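The transitions in the diagram can be sketched as a lookup table plus one decision function. This is only an illustration: the names `LEGAL`, `canTransition`, `afterVerify`, and the `stopped` phase (standing in for "stop, ask user") are assumptions and are not taken from `ultra-state.ts`.

```typescript
// Hypothetical sketch of the phase graph; identifiers are illustrative only.
type Phase =
  | "planning"
  | "building"
  | "verifying"
  | "iterating"
  | "complete"
  | "stopped"; // assumed name for the "stop, ask user" outcome

const MAX_RETRIES = 10;

// Legal transitions, read off the diagram.
const LEGAL: Record<Phase, readonly Phase[]> = {
  planning: ["building"],
  building: ["verifying"],
  verifying: ["complete", "iterating", "stopped"],
  iterating: ["verifying"],
  complete: [],
  stopped: [],
};

// True only if the diagram has an edge from `from` to `to`.
function canTransition(from: Phase, to: Phase): boolean {
  return LEGAL[from].includes(to);
}

// Where the machine goes after a verification run.
function afterVerify(passed: boolean, retries: number): Phase {
  if (passed) return "complete";
  return retries < MAX_RETRIES ? "iterating" : "stopped";
}
```

With this shape, skipping a phase (for example planning → verifying) is rejected because no such edge exists in the table.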
Each phase restricts which tools are available:
| Phase | Allowed | Blocked |
| --- | --- | --- |
| planning | read, glob, grep, explore, write(plan file only) | edit, write, bash(modify) |
| building | all tools | none |
| verifying | read, glob, grep, ultra_verify | edit, write, bash |
| iterating | all tools | ultra_verify |
| complete | read only | edit, write, bash |
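A minimal sketch of how the Blocked column could drive tool filtering. The real `resolveTools()` signature is not shown in this description, so `filterTools` and `BLOCKED` are hypothetical names. The sketch also simplifies the table: `bash` and `write` are blocked wholesale during planning, and the plan-file exception is left out.

```typescript
// Hypothetical sketch; not the actual resolveTools() implementation.
type Phase = "planning" | "building" | "verifying" | "iterating" | "complete";

// Blocked tool names per phase, following the table's Blocked column.
// Simplification: planning blocks write and bash entirely here.
const BLOCKED: Record<Phase, ReadonlySet<string>> = {
  planning: new Set(["edit", "write", "bash"]),
  building: new Set<string>(),
  verifying: new Set(["edit", "write", "bash"]),
  iterating: new Set(["ultra_verify"]),
  complete: new Set(["edit", "write", "bash"]),
};

// Return a copy of the tool map without the tools blocked in this phase,
// so the model is never offered them in the first place.
function filterTools<T>(phase: Phase, tools: Record<string, T>): Record<string, T> {
  const out: Record<string, T> = {};
  for (const name of Object.keys(tools)) {
    if (!BLOCKED[phase].has(name)) out[name] = tools[name];
  }
  return out;
}
```

Filtering the map before it reaches the model is what separates this layer from a prompt instruction: a removed tool cannot be called at all.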
Three enforcement layers
- Tool filtering: resolveTools() physically removes blocked tools so the LLM cannot call them
- Execution guard: ultra_verify rejects calls outside verifying phase; ultra_phase rejects invalid transitions
- Prompt injection: insertReminders() injects phase-specific constraints every step
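The execution guard is the second line of defence if a tool call gets past filtering. A rough sketch, assuming the tool returns a structured error rather than throwing; `guardVerify` and `GuardResult` are hypothetical names, not the PR's API.

```typescript
// Hypothetical sketch of the ultra_verify execution guard.
type Phase = "planning" | "building" | "verifying" | "iterating" | "complete";

type GuardResult = { ok: true } | { ok: false; error: string };

// Allow verification only while the session is in the verifying phase.
function guardVerify(phase: Phase): GuardResult {
  if (phase === "verifying") return { ok: true };
  return {
    ok: false,
    error: `ultra_verify is only callable in the verifying phase (current: ${phase})`,
  };
}
```

Returning the current phase in the error message gives the model enough context to correct itself on the next step.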
New tools
ultra_verify — auto-detects test command (package.json / Makefile / Cargo.toml / pyproject.toml / go.mod), runs tests, returns structured result. Auto-transitions to complete on pass or iterating on fail.
ultra_phase — transitions between phases, validates transitions are legal.
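Test-command detection could look roughly like the following. The marker files come from the description above; the commands mapped to them and the priority order are assumptions for illustration, and `detectTestCommand` is a hypothetical name. The function takes a list of file names instead of reading the filesystem, which keeps the sketch self-contained.

```typescript
// Hypothetical mapping from marker file to test command.
// The commands and their priority order are assumptions, not the PR's behavior.
const DETECTORS: ReadonlyArray<{ file: string; command: string }> = [
  { file: "package.json", command: "npm test" },
  { file: "Makefile", command: "make test" },
  { file: "Cargo.toml", command: "cargo test" },
  { file: "pyproject.toml", command: "pytest" },
  { file: "go.mod", command: "go test ./..." },
];

// Return the command for the first marker file present, or undefined if none match.
function detectTestCommand(files: ReadonlyArray<string>): string | undefined {
  const present = new Set(files);
  for (const d of DETECTORS) {
    if (present.has(d.file)) return d.command;
  }
  return undefined;
}
```

A project with no recognised marker file yields `undefined`, which the caller would need to treat as "cannot verify" rather than as a pass.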
Multi-session support
Phase state is stored per sessionID (not per project directory), so multiple ultra sessions can run in parallel without interference.
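Per-session storage can be sketched with a plain `Map` keyed by session ID. The `UltraState` shape and the `getState` helper are assumptions; only the keying by sessionID is taken from the description.

```typescript
// Hypothetical sketch of per-session phase state.
type Phase = "planning" | "building" | "verifying" | "iterating" | "complete";

interface UltraState {
  phase: Phase;
  retries: number;
}

// Keyed by sessionID, so two sessions in the same project directory
// never read or write each other's phase.
const states = new Map<string, UltraState>();

// Fetch the state for a session, creating a fresh planning state on first use.
function getState(sessionID: string): UltraState {
  let state = states.get(sessionID);
  if (!state) {
    state = { phase: "planning", retries: 0 };
    states.set(sessionID, state);
  }
  return state;
}
```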
Usage
@ultra implement user login with JWT auth and tests
Or set as default: { "default_agent": "ultra" }
Implementation
- New files: src/session/ultra-state.ts, src/tool/ultra-verify.ts, src/tool/ultra-phase.ts, src/agent/prompt/ultra.txt
- Modified: src/agent/agent.ts (register agent), src/tool/registry.ts (register tools), src/session/prompt.ts (tool filtering + reminders)
- Tests: 66 passing (22 state machine + 7 agent config + 37 existing agent tests unchanged)
Why a hardcoded state machine?
A prompt-only approach (just telling the LLM what to do) is unreliable — the LLM can skip phases, forget to verify, or claim completion without testing. The state machine makes the workflow deterministic: the LLM physically cannot call edit during planning, cannot skip verification, and must retry up to 10 times before giving up.
Verification
- Built and tested with bun test: 66/66 tests pass
- Manually tested: created a personal website from scratch using ultra mode, full planning→building→verifying→complete cycle completed autonomously
Open to feedback on the design. Happy to adjust the approach (e.g., add a feature flag, use InstanceState instead of a plain Map, split into smaller PRs) based on maintainer preferences.