feat(extension): context usage donut and compaction #4283
- kilo-api-fixture: add contextLength to MockGatewayModel; include context_length in the mocked model item when set; guard tool_choice/system_environment assertions for summarization calls (tools: [])
- context-usage.test.ts: donut shows "850 / 1,000 tokens (85%)" after a reply with a usage chunk; auto-compaction fires at the ≥85% threshold; manual "Compact now" compacts a seeded conversation
- conversation-drafts.test.ts: draft A / draft B persist across tab switches
- conversation-tabs.test.ts: remove the dialog expectation on tab close; assert no dialog fires; remove dialog handlers from the history-delete and history-reuse tests that used the old close-confirm flow
- firefox-selenium-e2e.ts: close-without-confirm, no alert expected
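The "reply with usage chunk" that the donut test depends on can be illustrated with a small sketch. It assumes the gateway streams OpenAI-style server-sent events, where requesting `stream_options: { include_usage: true }` yields a trailing chunk carrying a `usage` object. `extractUsageFromSse` and the `Usage` interface are hypothetical names for illustration, not the extension's actual parser.

```typescript
// Assumed shape of the usage object on an OpenAI-compatible streaming response.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Hypothetical helper: scan raw SSE text and return the last usage object seen.
function extractUsageFromSse(sseText: string): Usage | undefined {
  let usage: Usage | undefined;
  for (const rawLine of sseText.split("\n")) {
    const line = rawLine.trim();
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    // Skip keep-alive blanks and the stream terminator.
    if (payload === "" || payload === "[DONE]") continue;
    let chunk: { usage?: Usage | null };
    try {
      chunk = JSON.parse(payload) as { usage?: Usage | null };
    } catch {
      continue; // Ignore malformed lines rather than failing the whole turn.
    }
    // Content chunks carry no usage; only the trailing chunk does.
    if (chunk.usage) usage = chunk.usage;
  }
  return usage;
}
```

Keeping the last usage object seen, rather than the first, means the helper does not depend on the usage chunk arriving at a fixed position in the stream.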
…paction invariant
Code Review Summary
Status: 2 Issues Found | Recommendation: Address before merge
Issue Details: WARNING
Files Reviewed (4 files)
Previous Review Summaries (5 snapshots, latest commit 68a0c95). The current summary above is authoritative; previous snapshots are kept for context only.
- Previous review (commit 68a0c95): Status: 1 Issue Found | Recommendation: Address before merge | Issue Details: WARNING | Files Reviewed (3 files)
- Previous review (commit f1ad780): Status: 1 Issue Found | Recommendation: Address before merge | Issue Details: WARNING | Files Reviewed (4 files)
- Previous review (commit 6233aac): Status: 1 Issue Found | Recommendation: Address before merge | Issue Details: WARNING | Files Reviewed (1 files)
- Previous review (commit 15b017c): Status: No Issues Found | Recommendation: Merge | Files Reviewed (3 files)
- Previous review (commit da2a010): Status: 2 Issues Found | Recommendation: Address before merge | Issue Details: WARNING | Files Reviewed (21 files)
Reviewed by gpt-5.4-20260305 · Input: 86K · Output: 12.7K · Cached: 660.2K · Review guidance: REVIEW.md from base branch
The eval/debugger message path gated senders on tab === undefined to tell the side panel apart from content scripts. When the side panel runs as a tab (as the e2e harness loads it) sender.tab is defined, so the background never answered and tab listing/tools failed. Gate on the sender's extension origin instead: a content script shares the extension id but reports the host page's origin, while any extension page reports chrome-extension://<id>. This is a stronger trust signal and lets the e2e suite drive the real message path.
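A minimal sketch of the origin-based gate described above. It assumes the sender object exposes `id` and `origin` the way Chrome's `runtime.MessageSender` does. `SenderLike` is a hypothetical stand-in type, and passing the extension id as a parameter is an illustration choice so the example runs outside a browser. Firefox uses a different URL scheme for extension pages, which this Chrome-only sketch does not cover.

```typescript
// Minimal structural stand-in for chrome.runtime.MessageSender (only the fields used here).
interface SenderLike {
  id?: string;
  origin?: string;
  tab?: unknown;
}

// Trust a sender only when it is this extension AND it runs on an extension page.
// sender.tab is deliberately ignored: the side panel may be loaded as a tab.
function isTrustedExtensionSender(sender: SenderLike, extensionId: string): boolean {
  if (sender.id !== extensionId) return false;
  // A content script shares the extension id but reports the host page's origin.
  return sender.origin === `chrome-extension://${extensionId}`;
}
```

With this gate, a side panel opened as a tab passes because its origin is the extension's own, while a content script injected into a web page fails even though its `id` matches.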
Assert auto-compaction against persisted storage instead of the virtualized conversation list (rows unmount while events are rewritten), and give the manual-compaction turn a sub-threshold usage chunk so the donut enables Compact now.
Fixes a pre-existing capitalized-comments lint error on the wrapped comment.
Adds a folder-gated workflow (apps/extension/**) covering the gaps the global CI misses for the extension: the extension's own type-aware lint, the Firefox MV3 build, and the Chrome Playwright e2e suite (which also builds Chrome). Typecheck/format/unit also run here via the package's verify script for a self-contained signal. Firefox Selenium e2e is excluded (unreliable headless).
The "Compact now" button was enabled whenever usage was non-zero, but manual compaction kept the last KEEP_RECENT_EXCHANGES (2) exchanges like auto-compaction, so a conversation with <=2 user messages had nothing to summarize and the click was a silent no-op. With a 1M-context frontier model auto-compaction never fires, so this was the normal manual path. Manual compaction now keeps only the latest exchange (KEEP_RECENT_EXCHANGES_MANUAL=1), and the button gates on whether there is summarizable history rather than on measured usage — so it is never enabled-but-inert and also works after a reload, when in-memory usage has reset. Auto-compaction is unchanged. Verified with a live local-backend e2e (fl@fl.fl, kilo-auto/frontier): a two-exchange conversation now compacts on click.
Drop the manual keep-one-exchange threshold and its disabled single-exchange edge case. Manual "Compact now" now keeps no recent exchange (keep 0): it summarizes the entire conversation and is enabled whenever there is anything to summarize (at least one user message). Auto-compaction still keeps KEEP_RECENT_EXCHANGES for continuity near the context limit.
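The difference between the auto and manual paths can be sketched as follows. This is an illustration under assumptions, not the extension's code: the `ConversationEvent` shape, the `MANUAL_KEEP` constant, and `canCompactManually` are hypothetical, and the body of `splitEventsForCompaction` is a guess at behaviour consistent with the commit messages (cut only on a user-message boundary; keep the last N exchanges verbatim).

```typescript
// Hypothetical event shape; the real extension's event types are not shown in this PR.
type ConversationEvent =
  | { type: "user"; text: string }
  | { type: "assistant"; text: string }
  | { type: "tool_call"; name: string }
  | { type: "tool_result"; value?: unknown };

const KEEP_RECENT_EXCHANGES = 2; // Auto-compaction, per the commit messages.
const MANUAL_KEEP = 0; // Manual "Compact now" summarizes everything (constant name assumed).

function splitEventsForCompaction(
  events: readonly ConversationEvent[],
  keepRecentExchanges: number,
): { toSummarize: ConversationEvent[]; toKeep: ConversationEvent[] } {
  const userIndexes: number[] = [];
  events.forEach((event, index) => {
    if (event.type === "user") userIndexes.push(index);
  });

  let cut: number;
  if (keepRecentExchanges <= 0) {
    cut = events.length; // Keep nothing verbatim.
  } else if (userIndexes.length <= keepRecentExchanges) {
    cut = 0; // Everything falls inside the kept window: nothing to summarize.
  } else {
    // Cut at the user message that opens the oldest kept exchange, so no
    // tool-call/tool-result pair is split across the boundary.
    cut = userIndexes[userIndexes.length - keepRecentExchanges] ?? 0;
  }
  return { toSummarize: events.slice(0, cut), toKeep: events.slice(cut) };
}

// The button gates on summarizable history, not on measured usage.
function canCompactManually(events: readonly ConversationEvent[]): boolean {
  return events.some((event) => event.type === "user");
}
```

With `KEEP_RECENT_EXCHANGES`, a two-exchange conversation yields an empty `toSummarize`, which is the silent no-op the earlier commit describes; with `MANUAL_KEEP` the same conversation is summarized in full.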
- Compaction transcript now includes tool inputs (eval code, query/element) and result payloads (snapshot text, eval return, errors), each truncated to 2000 chars, so the summary keeps facts that previously lived only in tool results.
- extension-ci.yml also runs on .nvmrc and pnpm-workspace.yaml changes, which alter the runtime and workspace/catalog graph the jobs build against.
- Clarify the manual-compaction e2e comment: the button is enabled by summarizable history, not by the donut's usage value.
// The result payload (snapshot text, eval return, element details) is often the only record.
return event.value === undefined
  ? 'Tool result (ok)'
  : `Tool result (ok): ${truncateToolText(stringifyToolValue(event.value))}`;
WARNING: Screenshot payloads now consume the compaction transcript
get_viewport_screenshot stores a raw dataUrl in event.value, but this path serializes every successful tool result directly instead of collapsing screenshots to the same note/mediaType placeholder used by buildGatewayMessagesFromEvents(). After any screenshot turn, compaction will spend about 2000 characters of the summarization prompt on PNG base64 rather than the page facts the next turn needs.
Reply with @kilocode-bot fix it to have Kilo Code address this issue.
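One possible shape for the fix suggested in the review comment, as a hedged sketch. The bodies of `truncateToolText` and `stringifyToolValue` are assumptions (only their names appear in the diff), `screenshotMediaType` and `renderToolResult` are hypothetical helpers, and the placeholder text is invented for illustration; the actual placeholder used by `buildGatewayMessagesFromEvents()` is not shown in this PR.

```typescript
const MAX_TOOL_TEXT = 2000; // Per the commit message: payloads truncated to 2000 chars.

// Assumed implementation; only the function name appears in the diff.
function truncateToolText(text: string, max: number = MAX_TOOL_TEXT): string {
  return text.length <= max ? text : `${text.slice(0, max)}… [truncated]`;
}

// Assumed implementation; only the function name appears in the diff.
function stringifyToolValue(value: unknown): string {
  if (typeof value === "string") return value;
  try {
    const json: string | undefined = JSON.stringify(value);
    return json ?? String(value);
  } catch {
    return String(value); // E.g. circular structures.
  }
}

// Hypothetical: detect an image data URL, either as the value itself or in a dataUrl field.
function screenshotMediaType(value: unknown): string | undefined {
  let candidate: string | undefined;
  if (typeof value === "string") {
    candidate = value;
  } else if (typeof value === "object" && value !== null) {
    const dataUrl = (value as { dataUrl?: unknown }).dataUrl;
    if (typeof dataUrl === "string") candidate = dataUrl;
  }
  if (candidate === undefined) return undefined;
  const match = /^data:(image\/[a-z0-9.+-]+)[;,]/i.exec(candidate);
  return match ? match[1] : undefined;
}

// Collapse screenshots to a short placeholder before the generic truncation path.
function renderToolResult(value: unknown): string {
  if (value === undefined) return "Tool result (ok)";
  const mediaType = screenshotMediaType(value);
  if (mediaType !== undefined) {
    return `Tool result (ok): [screenshot omitted, ${mediaType}]`;
  }
  return `Tool result (ok): ${truncateToolText(stringifyToolValue(value))}`;
}
```

Checking for the data URL before truncation keeps the 2000-character budget for text payloads such as snapshots and eval returns.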
What
Adds context-usage visibility and context compaction to the browser-agent side panel, plus two small conversation-UX fixes.
Features
- usage.prompt_tokens / model context_length. Hover/click opens a popover with the exact token counts and a Compact now button. Green < 70%, amber 70–90%, red ≥ 90%; grey when the model reports no context length.
- 🗜️ Compacted earlier context assistant message, keeping the last 2 exchanges verbatim.
Small fixes
How it works
- stream_options: { include_usage: true } and parses the trailing usage chunk into KiloGatewayChatCompletion.usage; context_length is parsed from /models.
- onUsage callback (runLlmTurn → turn runners → chat panel).
- isCurrentRun() token, so parallel conversations don't interfere.
- tools: []; splitEventsForCompaction always cuts on a user-message boundary so no tool-call/tool-result pair is orphaned.
Incidental fix: side-panel message-path trust check
While adding e2e coverage I found that the background's isTrustedExtensionSender gated senders on sender.tab === undefined. That holds for the real side panel, but no message-flow e2e test could ever pass: the harness loads the side panel as a tab (so sender.tab is defined), and the background silently never responded, so every message-sending test failed at "No tab selected". Changed the gate to trust the sender's extension origin (chrome-extension://<id>) instead. A content script shares the extension id but reports the host page's origin, so this is a stronger boundary and also lets the e2e suite drive the real message path. This unblocked the entire pre-existing message-flow e2e suite (safe mode, eval, run abort, etc.).
Testing
- context_length parsing, usage forwarding, context-usage helpers, splitEventsForCompaction/transcript rendering.
- e2e:chrome/e2e:firefox; the extension's verify is typecheck+lint+format+unit). These were run locally.
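For readers unfamiliar with the context-usage helpers listed under Testing, here is a minimal sketch of what such helpers could look like. The thresholds (green < 70%, amber 70–90%, red ≥ 90%, grey without a context length, auto-compaction at ≥85%) and the label format come from the PR text; all function and constant names are hypothetical, and the label shown when no context length is known is an assumption.

```typescript
type DonutColor = "green" | "amber" | "red" | "grey";

const AUTO_COMPACT_THRESHOLD = 0.85; // Auto-compaction fires at >= 85% (constant name assumed).

// Ratio of prompt tokens to the model's context length, or undefined when unknown.
function usageRatio(promptTokens: number, contextLength?: number): number | undefined {
  if (contextLength === undefined || contextLength <= 0) return undefined;
  return promptTokens / contextLength;
}

function donutColor(ratio: number | undefined): DonutColor {
  if (ratio === undefined) return "grey"; // Model reports no context length.
  if (ratio >= 0.9) return "red";
  if (ratio >= 0.7) return "amber";
  return "green";
}

function usageLabel(promptTokens: number, contextLength?: number): string {
  const fmt = new Intl.NumberFormat("en-US");
  if (contextLength === undefined || contextLength <= 0) {
    return `${fmt.format(promptTokens)} tokens`; // Fallback label is an assumption.
  }
  const percent = Math.round((promptTokens / contextLength) * 100);
  return `${fmt.format(promptTokens)} / ${fmt.format(contextLength)} tokens (${percent}%)`;
}

function shouldAutoCompact(ratio: number | undefined): boolean {
  return ratio !== undefined && ratio >= AUTO_COMPACT_THRESHOLD;
}
```

For the e2e fixture values, `usageLabel(850, 1000)` produces the "850 / 1,000 tokens (85%)" string asserted in context-usage.test.ts, and the same ratio lands in the amber band while also meeting the auto-compaction threshold.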