Bhavya U
bhavyau@microsoft.com
90d · built 2026-05-28
90-day totals
- Commits: 123
- Growth: 10.6
- Maintenance: 6.3
- Fixes: 4.7
- Total ETV: 21.6
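As a rough cross-check of the totals above, the sketch below recomputes Total ETV and each category's share from the Growth, Maintenance and Fixes figures. It assumes that Total ETV is the plain sum of the three categories and that a share is the category divided by that sum; the dashboard does not state either formula. `EtvTotals`, `totalEtv` and `workMix` are hypothetical names for illustration, not part of the Navigara API.

```typescript
// Hypothetical shape for the three ETV categories shown in the totals list.
interface EtvTotals {
  growth: number;
  maintenance: number;
  fixes: number;
}

// Assumption: Total ETV is the plain sum of the three categories.
function totalEtv(t: EtvTotals): number {
  return t.growth + t.maintenance + t.fixes;
}

// Assumption: a category's share is its ETV divided by Total ETV.
function workMix(t: EtvTotals): EtvTotals {
  const total = totalEtv(t);
  if (total === 0) {
    // No recorded work: report an all-zero mix instead of dividing by zero.
    return { growth: 0, maintenance: 0, fixes: 0 };
  }
  return {
    growth: t.growth / total,
    maintenance: t.maintenance / total,
    fixes: t.fixes / total,
  };
}

// Figures copied from the 90-day totals.
const totals: EtvTotals = { growth: 10.6, maintenance: 6.3, fixes: 4.7 };
console.log(totalEtv(totals).toFixed(1)); // "21.6"
console.log(workMix(totals));
```

Under these assumptions Growth accounts for a little under half of the 90-day ETV.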
Where this dev ranks
Percentile against the global top-100 leaderboard (all-time totals).
- By commits: Top 48%
- By Growth share: Top 10%
30-day trajectory
Last 30 days vs. the 30 days before. Up arrows on Growth and ETV mean improvement; up arrow on Fixes share means more time on fixes (worse).
Daily performance
Daily ETV, stacked by Growth, Maintenance and Fixes.
Work-mix over time
Share of Growth / Maintenance / Fixes over a rolling 7-day window. Reads as 'where is effort flowing right now'.
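One way to read the rolling window is sketched below. It assumes a trailing window (each day looks back over itself and the six days before it) and that shares come from summing ETV inside the window before dividing, rather than averaging daily percentages. The dashboard does not say which variant it uses, and `DailyEtv` and `rollingShares` are hypothetical names.

```typescript
// Hypothetical per-day ETV record, one entry per calendar day in order.
interface DailyEtv {
  growth: number;
  maintenance: number;
  fixes: number;
}

// Assumption: trailing window; ETV is summed over the window, then divided.
function rollingShares(days: DailyEtv[], window = 7): DailyEtv[] {
  return days.map((_, i) => {
    const start = Math.max(0, i - window + 1);
    let growth = 0;
    let maintenance = 0;
    let fixes = 0;
    for (const day of days.slice(start, i + 1)) {
      growth += day.growth;
      maintenance += day.maintenance;
      fixes += day.fixes;
    }
    const total = growth + maintenance + fixes;
    if (total === 0) {
      // No work in the window: report an all-zero mix.
      return { growth: 0, maintenance: 0, fixes: 0 };
    }
    return {
      growth: growth / total,
      maintenance: maintenance / total,
      fixes: fixes / total,
    };
  });
}
```

Summing before dividing keeps a single low-volume day from swinging the mix, which suits a chart meant to show where effort is flowing rather than day-to-day noise.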
Bug flow over time
Monthly bug flow attributed to this developer. The left bar (red) is bug impact this dev authored that was addressed in the given month — combining bugs others fixed for them and bugs they fixed themselves. The right bar is fixes they personally shipped that month, split between self-fixes (overlap with the red bar) and fixes done for someone else. X-axis is fix-time, not introduction-time — the Navigara API attributes bugs backward to the author at the moment the fix lands.
- Self-fix share: 48%
- Bugs you introduced: 6.0
- Bugs you fixed: 8.2
Repository spread
Where this developer's commits land. Here top1 is the share that lands in the single most active repository: concentrated work (top1 > 80%) vs. polymath spread (top1 < 30%).
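The two thresholds can be sketched as a small classifier. This assumes top1 is the fraction of commits in the most active repository; the "mixed" label for the 30% to 80% band is an assumption, since the dashboard names only the two extremes. `classifySpread` is a hypothetical helper.

```typescript
type Spread = "concentrated" | "mixed" | "polymath";

// Assumption: top1 is the fraction of all commits in the most active repository.
function classifySpread(commitsByRepo: Record<string, number>): Spread | undefined {
  const counts = Object.values(commitsByRepo);
  const total = counts.reduce((sum, n) => sum + n, 0);
  if (total === 0) {
    return undefined; // no commits, nothing to classify
  }
  const top1 = Math.max(...counts) / total;
  if (top1 > 0.8) {
    return "concentrated";
  }
  if (top1 < 0.3) {
    return "polymath";
  }
  return "mixed"; // hypothetical label for the unnamed middle band
}
```

For example, a developer whose commits all land in one repository classifies as concentrated, while an even split across four repositories (top1 = 25%) classifies as polymath.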
Most impactful commits
Top 20 by ETV in the 90-day window.
- 2.8 ETV · Add Cache Explorer view to chat debug panel (#313620) * Add Cache Explorer view to chat debug panel Add a new "Cache Explorer" entry under "Explore Trace Data" in the chat debug overview. The view helps diagnose prompt-cache misses by diffing two model-turn requests side by side. The pure diff engine (chatDebugCacheDiff.ts) parses the input messages JSON exposed via IChatDebugEventModelTurnContent.sections, normalizes each message to {role, name, text, byteLength}, and produces a per-position signature of the prompt prefix. The first position whose role, length, or content diverges is reported as the cache break; anything after that point cannot be served from the prompt cache. The view (chatDebugCacheExplorerView.ts) lays out a left rail of model turns annotated with cache hit %, A/B summary cards, the prompt signature with the break marker, and a Components accordion that diffs the system prompt and any divergent messages. Sequential pairing is the default (B = current selection, A = previous turn); click in the rail to set B and shift-click to set A. The diff engine ships with 10 unit tests in chatDebugCacheDiff.test.ts. * Cache Explorer iteration: rail groups, OTel-backed metrics, prompt signature bars Iterate on the Cache Explorer view added earlier on this branch: - Left rail groups model turns by parent request and shows the user prompt as the group header. Group rows are collapsible and the full request id is shown in the header. - Each rail row reports agent source, cache hit %, duration, and time for the turn; rows with hit < 90% render the chip in red. - Single-selection model: clicking a row sets it as the current request and the row above is implicitly the previous one to diff against. - Producer plumbing: the file logger now persists copilot_chat.debug_name and gen_ai.response.id alongside the model-turn entry, and the modelTurn content carries a requestId. The summary card surfaces the full network requestId so it can be copied. 
- Replaced the chip-style prompt signature with a horizontal role-colored bar visualization showing both requests on a shared scale, with a vertical break marker at the divergence index. - Cache performance card replaces the pill row with a structured layout: cache hit headline + token reuse, where the cache broke + estimated lost tokens, and a one-line diff summary. - Component diff and signature lanes use Previous/Current labels instead of A/B. Refs https://github.com/microsoft/vscode/pull/313608 * Cache Explorer: char-level inline diff in Components accordion Replace the plain-text body of each Components row with a side-by-side line + character diff rendered directly into HTML. Uses the existing linesDiffComputers.getDefault().computeDiff() that Monaco's diff editor also uses internally; ignoreTrimWhitespace stays off so cache-relevant whitespace is visible. - Each line is emitted as a div with one of three classes `context`, `add`, `remove` for full-line styling. - Inner range mappings produce char-level <span> highlights inside added or removed lines. - Multi-line inner range mappings are skipped for v1; the surrounding add/remove styling already conveys the change. - Bounded by maxComputationTimeMs=200 so a stray giant tool-result diff cannot stall the renderer. No widget, no editor instance, no layout calls; replaces the existing two raw <div> bodies with a directly-styled HTML diff. Refs https://github.com/microsoft/vscode/pull/313620 * Cache Explorer: extract text from tool_call_response and tool_call parts The OTel input messages format wraps tool I/O as part-level objects, not as top-level text: - A user/tool message that returns tool output uses { type: 'tool_call_response', id, response: '...' 
} - An assistant message that invokes a tool uses { type: 'tool_call', id, name, arguments: {...} } Until now parseInputMessages only counted parts with type === 'text', so these messages showed up as zero-byte slots in the diff with both sides labeled '(not present)', which is confusing because tool I/O is the single most cache-relevant content in an agentic loop. This change pulls the response payload out of tool_call_response (and the tool name + arguments out of tool_call) and includes them in the normalized text we diff against. We also reclassify the row's display role to 'tool' when the message is dominated by a tool result so the rail / signature / accordion label it consistently. Two new unit tests pin the extraction behaviour. Refs https://github.com/microsoft/vscode/pull/313620 * Cache Explorer: track request options + likely cache expiration Prompt caches invalidate on more than just message-array changes: flipping tool_choice, raising reasoning_effort, switching to Claude extended thinking, or changing the response_format all bust the cache even when the prompt prefix is byte-identical. Surface those changes. Producer: - New OTel attribute copilot_chat.request.options carrying a curated subset of the request body. Captures tool_choice, reasoning, reasoning_effort, thinking, thinking_budget, output_config, response_format, text, truncation, context_management, the various penalties, store, stream, stream_options, prediction, seed, parallel_tool_calls, service_tier, metadata, verbosity, snippy, state, intent, intent_threshold, include, plus an 'extra' catch-all for any unrecognised top-level fields. - Persisted onto llm_request entries in the file logger so the data survives session reloads. Consumer plumbing: - New requestOptions?: string on IChatDebugEventModelTurnContent and the matching DTO + ext-host class + proposed API. Read on both the live OTel span path and the on-disk entry path. 
View: - New 'Request Options' table renders every captured option with Previous and Current columns; rows whose values differ are highlighted with the diff-removed background. The model id is layered on top of the request_options blob so model swaps show up in the same table. - An inline 'Options changed: ...' banner sits below the summary cards so the user spots option drift without scrolling. - Cache performance card now detects the 'likely cache expiration' case: when the model reports 0% hit, the structural diff finds no prefix break, AND the option table is identical, the headline switches to '\u2014 likely cache expiration' with an explanation. When options are the only thing that changed, the break line says so explicitly. Refs https://github.com/microsoft/vscode/pull/313620 * Cache Explorer: address Copilot review nits from #313608 + #313602 Six small follow-ups: - Switch truthy checks to '!== undefined' for token fields in chatDebugFlowGraph.ts (model-turn tooltip) so a turn with 0 input or output tokens still gets a tooltip line. - Same fix for the modelTurn aria label in chatDebugLogsView.ts: a 0-token turn now still announces 'Model turn: <model> 0 tokens' instead of dropping the count. - Add the cached-tokens row to the modelTurn branch of formatEventDetail in chatDebugEventDetailRenderer.ts (regressed during a recent merge) and add the cachedTokens field to the existing 'modelTurn - with all fields' unit test. - chatDebugFlowGraph tooltip also gains a 'Cached tokens: N' line when present. - Restore the requestName deserialize in ExtHostChatDebug._deserializeEvent; the serializer sends it but the round trip was dropping it. Add the corresponding requestName field to the ChatDebugModelTurnEvent ext-host class so the assignment compiles. 
* Cache Explorer: address Copilot review on #313620 Five fixes from Copilot's review: - Rename INormalizedMessage.byteLength to charLength (text.length is UTF-16 code units, not bytes), and update all UI labels from 'B' to 'chars' so the displayed unit matches what we actually measure. Touches the diff engine, the explorer view, and the unit tests. - setSession now clears collapsedGroups and resets openComponents to the default expanded set, mirroring how Flow Chart resets its collapse state on session change. Prevents unbounded growth and cross-session collapse-state leaks. - Rail rows are now keyboard accessible: each row is focusable (tabIndex=0), exposes role='button', aria-selected, and aria-label, and responds to Enter/Space. Adds a focus-visible outline. - render() now uses a monotonically-increasing renderToken captured at the start of each call and re-checked after each await; an older render whose model-turn resolves come back late will no longer write into a DOM the newer render has already rebuilt. - _reviveResolvedContent in mainThreadChatDebug now passes through maxInputTokens and maxOutputTokens, which were silently dropped. Refs https://github.com/microsoft/vscode/pull/313620 * Cache Explorer: address Councillor-Opus follow-up nits Five fixes prompted by the council review: - breakBytePos used to fall through to 'cumulative' (the right edge of the bar) when the diff's break index was outside the side's segment list; it now returns undefined, which the renderer already handles as 'no break marker for this side'. Prevents a logic mismatch between the diff and the segment list from being silently masked as a misleading 'cache broke at the end' marker. - pickCacheRelevantRequestOptions drops the 'extra' catch-all. We now only forward an explicit allowlist of cache-keying body fields to OTel and the on-disk debug log. 
Keeps any future provider-specific body fields (auth tokens, API keys, personalization) from leaking through; new cache knobs must be added explicitly. - Replace the local JSON-stringify based deepEqual helper in the view with the equals function from vs/base/common/objects, which is already used elsewhere in the workbench for value comparisons. - Add a fast-fail comment to messagesEqual explaining why charLength stays even though it is implied by text equality. - Document the trailing-context loop in renderInlineDiff and the silent selectedIndex clamp on session change so future readers don't think they're bugs. Expand the isLikelyCacheExpiration JSDoc to enumerate other invalidation causes the heuristic cannot distinguish. * Cache Explorer: clarify stableStringify fallback intent Document why stableStringify falls back to String(value) (circular refs / BigInt) and why the diff engine deliberately does not take an ILogService dependency to log such cases. The fallback produces a stable but lossy representation that still surfaces as content drift in the UI, so the failure mode is visible rather than silent. * Cache Explorer: address Copilot review nits round 2 Four small fixes from the latest Copilot review pass: - Update parseInputMessages JSDoc: was still describing charLength as 'byte length' even though the field was renamed. - Update diffPromptSignature comment: it claimed every position from the divergence onward is reported as non-identical, but the algorithm classifies each position independently. The first divergence is what breaks the cache; later identical positions are reported truthfully and the UI keys off the first break index. - Rename the 'bytes' field on the local renderSignature segment type to 'chars' (and the breakBytePos helper to breakCharPos) so the source code matches what the user-visible labels already say. 
- Drop the dead 'tools' entry from the openComponents Set seed in setSession and the field initializer; the diff pipeline only emits 'system' and 'messages[N]' component names, so the 'tools' entry never matched and had no visible effect. · github.com-microsoft-vscode · 8c4048e0 · 2026-05-01
- 1.1 ETV · Cache Explorer: handle Responses API continuations and tool definitions (#314654) Surfaces tool definitions and Responses API continuation deltas in the Agent Debug Logs Cache Explorer so users can reason about cache hit/miss behavior on requests that use previous_response_id. OTel pipeline: - Capture tool definitions per chat span as gen_ai.tool.definitions. - Add copilot_chat.request.shape attribute carrying sanitized request shape metadata (api type, hasPreviousResponseId, input item types). No IDs or content are captured. - Persist both attributes through the file logger and surface them in resolved model-turn debug content for live and replayed sessions. Provider message normalization: - Treat tool_search_output as a distinct tool_search role with a tool_search_output part instead of dropping it through the generic fallback (which produced {role: undefined, parts: []}). - Add normalization for function_call, function_call_output, and tool_search_output Responses API item types. - Preserve absent-vs-empty tools distinction on tool_search_output so cache-key-relevant byte-level differences are not flattened. Cache Explorer UI: - Add tools (catalog) component diff alongside system. - Add a tool search color/legend role distinct from tool results, with a hyphenated CSS class for consistency. - Re-frame the prompt signature in prefix order (system, tools, messages) so a tools-catalog change is no longer misreported as break at messages[0]. - For Responses API continuation requests, suppress positional message diffing against the previous request: the wire delta is asymmetric with the previous full input, so positional diffs are misleading. Render the current continuation delta as its own component, label the comparison as Visible Request Signature, and skip the cache-expiration heuristic since the full provider-reconstructed prompt cannot be inferred from the wire delta. 
- Strip multiple leading system messages on dedup, harden prefix component insertion against double-insertion, and guard against malformed inputItemTypes metadata. Tests: - Added Vitest coverage for new Responses API normalizers and the absent-vs-empty tools distinction. - Added cache diff parser tests for tool_search_output messages. · github.com-microsoft-vscode · a1c2e116 · 2026-05-06
- 0.9 ETV · Add support for TST for openai models @bhavyaus (#311841) * Add support for TST for openai models * Refactor tool search handling in rawMessagesToResponseAPI and add test for tool_search history conversion Co-authored-by: Copilot <copilot@github.com> * Refactor tool search logic for OpenAI models and enhance related tests Co-authored-by: Copilot <copilot@github.com> * Add optimized tool search instructions and integrate into Gpt54Prompt * Update extensions/copilot/src/platform/endpoint/node/responsesApi.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Address Copilot review: fix circular dep, model/family mismatch, and any cast * Fix package.json tags for toolSearchTool setting --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> · github.com-microsoft-vscode · a14d0306 · 2026-04-22
- 0.7 ETV · Inline summarization: summarize within the agent loop for maximum prompt cache hits (#4956) * Add inline summarization feature for agent conversation history - Introduced configuration option for inline summarization in package.json and configurationService.ts. - Updated agentIntent.ts to handle inline summarization logic during conversation. - Modified summarizedConversationHistory.tsx to support inline summarization instructions. - Enhanced tests to cover inline summarization scenarios and extraction of inline summaries. * Remove cache-friendly summarization prompt and related configurations * Refactor inline summarization handling in ToolCallingLoop and add summary application method * Add failure telemetry, deferred cleanup, and debugName tracking for inline summarization * Address PR review: fix empty string check, telemetry counts, cache token reporting, and test naming · github.com-microsoft-vscode · c7c0fac6 · 2026-04-03
- 0.6 ETV · Fix stale background compaction across model switches and /compact (#317163) The `_backgroundSummarizers` map on `AgentIntent` is keyed only by sessionId, so a summary kicked off against one model's prefix could be applied unconditionally on the next render, even after the user switched to a model with a larger context window or ran `/compact`. The user saw a 'Compacted conversation' notice on a turn with plenty of headroom, with content summarized against the old model's history. * `BackgroundSummarizer` now records the `endpointModel` it was built for. * `AgentIntent.getOrCreateBackgroundSummarizer` cancels and recreates the summarizer if the endpoint identity changed since last call. * `handleSummarizeCommand` (`/compact`) cancels any pending background summarizer once we commit to foreground compaction. * Pre-render apply now gates on `contextRatio >= applyMinRatio` (0.65) as defense-in-depth; this covers context-size overrides and any path that slips past the endpoint check. Stale completed results are consumed-and-discarded so a fresh kick-off can replace them. · github.com-microsoft-vscode · 15bd7994 · 2026-05-23
- 0.6 ETV · Remove server-side tool search and consolidate on client-side tool_search (#310343) - Remove all server-side tool_search_tool_regex types, handlers, and stream processing from messagesApi.ts - Remove isAnthropicToolSearchEnabled/isAnthropicCustomToolSearchEnabled functions - Remove AnthropicToolSearchEnabled and AnthropicToolSearchMode settings - Remove TOOL_SEARCH_TOOL_NAME, TOOL_SEARCH_TOOL_TYPE, TOOL_SEARCH_SUPPORTED_MODELS constants - Refactor modelSupportsToolSearch to use version parsing instead of prefix list - Add models filter to ToolSearchTool registration for Claude Sonnet/Opus 4.5+ - Fix MockEndpoint to derive supportsToolSearch from model family - Gate advanced-tool-use beta header directly on endpoint.supportsToolSearch - Update prompts to use client-side tool_search name and semantic search instructions - Update 24 snapshot files to reflect new tool name and instructions · github.com-microsoft-vscode · d75a4078 · 2026-04-16
- 0.6 ETV · Reject path traversal in Create Workspace file tree (#318057) - fileTreeParser: reject node names that are empty, '.', '..', or contain '/' or '\\'; throw on unsafe project root names. Filters unsafe child node names from the parsed tree. - newWorkspaceFollowup: replace the platform-aware path.relative destination computation (which resolved a relative projectRoot against process.cwd() on Windows) with a posix prefix-strip helper, resolveProjectFileUri. Add a runtime isUriContained guard before writeFile so any traversal that slips past the parser cannot escape the generated workspace folder. - Tests: cover unsafe node names, the PoC tree, isUriContained edge cases (prefix collision, scheme/authority, trailing slash), and resolveProjectFileUri for both copilot and GitHub repo-template path shapes. · github.com-microsoft-vscode · 3027c82e · 2026-05-22
- 0.5 ETV · Add PDF document support for Anthropic Claude models (#4473) * Add PDF document support for Anthropic Claude models * Update @vscode/prompt-tsx to version 0.4.0-alpha.8 in package-lock.json * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * pr comments * Address PR feedback: improve test type-safety and PDF document assertions - Replace 'as any' casts with proper types (TokenizerType.O200K, Event.None, ITokenizer) - MockEndpointProvider now implements IEndpointProvider - Add hasDocumentNode() helper to verify Document nodes in JSON tree - Use PromptRenderer.create() for PDF test to fix countMessageTokens error - Assert Document node presence/absence instead of empty text checks * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Use Raw.ChatMessage output for PDF test assertions instead of JSONTree internals - Replace hasDocumentNode (magic ctor ID) with hasDocumentContentPart (Raw.ChatCompletionContentPartKind.Document) - All PDF tests now use PromptRenderer.create().render() for stable assertions - Remove PieceCtorKind and PromptNodeType.Opaque constants (no longer needed) --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> · github.com-microsoft-vscode · 6a2633ac · 2026-03-18
- 0.5 ETV · Keep historical user messages cache-safe (#315194) When a historical turn was missing RenderedUserMessageMetadata (older sessions, freeze-not-fired paths), AgentUserMessage fell through to the current-turn render path and embedded live workspace state (<editorContext>, terminal, todos, reminderInstructions) into a past user message, breaking the prompt cache prefix on every request. - AgentUserMessage now delegates to AgentUserMessageInHistory in this fallback, matching what the other history renderers already do. - Thread userQueryTagName through AgentUserMessageInHistory and AgentConversationHistory so historical turns honor model-specific tag names (e.g. <user_query> for Grok). · github.com-microsoft-vscode · 80b2b692 · 2026-05-08
- 0.5 ETV · implement custom tool search functionality with embeddings-based query (#4214) * implement custom tool search functionality with embeddings-based query support * Add tool search functionality with embeddings-based query support · github.com-microsoft-vscode · b4a27c81 · 2026-03-04
- 0.5 ETV · Add Claude 4.6 prompt optimization A/B test configurations (#4316) * Add Claude 4.6 prompt optimization A/B test configurations Implement three-way prompt optimization experiment for Claude 4.6 models: - Control: existing Claude46DefaultPrompt (no change) - Combined: single optimized prompt for both Opus and Sonnet with moderate exploration guidance - Split: separate Opus-specific (bounded exploration) and Sonnet-specific (full persistence) prompts * Optimize Claude 4.6 prompt configurations with type adjustments and conditional rendering for tool instructions · github.com-microsoft-vscode · 53a865cd · 2026-03-10
- 0.5 ETV · feat: add per-model capability overrides for advanced configuration (#317237) · github.com-microsoft-vscode · d8e88906 · 2026-05-19
- 0.5 ETV · feat: formalize deferred vs non-deferred tool status as opt-in on ICopilotTools (#4750) Move the hardcoded nonDeferredToolNames set from the Anthropic networking layer to an opt-in `nonDeferred` static property on each tool class. - Add `nonDeferred?: boolean` to ICopilotToolCtor interface - Add IToolDeferralService with isNonDeferredTool() for DI-based access - Mark all 18 frequently-used tools as nonDeferred = true at declaration - Register IToolDeferralService in prod, unit test, and vscode-node test - Remove centralized nonDeferredToolNames set from anthropic.ts - Fix latent bug: old 'ask_questions' entry never matched the real 'vscode_askQuestions' tool name; now correctly included via ToolName.CoreAskQuestions Fixes microsoft/vscode#303545 · github.com-microsoft-vscode · 46ab74fb · 2026-03-27
- 0.4 ETV · Optimize prompt cache hit rate by freezing deferred tool list in initial context (#312577) Move deferred tool list out of system prompt for cache hit rate · github.com-microsoft-vscode · 602d64e0 · 2026-04-26
- 0.4 ETV · Filter non-deferred tools from Responses tool search output (#313169) · github.com-microsoft-vscode · 22402015 · 2026-04-29
- 0.4 ETV · Refactor thinking and effort control: per-request opt-in (#4515) * Refactor thinking and effort control: make per-request opt-in via enableThinking and reasoningEffort - Add reasoning_effort to IChatModelCapabilities from CAPI model list - Add supportsReasoningEffort on ChatEndpoint/IChatEndpoint - Add enableThinking and reasoningEffort to IMakeChatRequestOptions - Build configurationSchema on VS Code LM API models for model picker effort dropdown - Remove disableThinking, AnthropicThinkingEffort, ResponsesApiReasoningEffort configs - Thinking is off by default; callers opt in with enableThinking: true - Agent mode (toolCallingLoop): enables thinking, passes reasoningEffort from modelConfiguration - ResponsesProxy / MessagesProxy: enables thinking - Inline chat, utility requests, LM wrapper: thinking off (default) - Effort level driven by configurationSchema in model picker (no default, user must choose) - BYOK Anthropic provider reads effort from options.modelConfiguration * refactor: Improve reasoningEffort handling across multiple components * Fix tests: add enableThinking: true to Agent location tests, restore maxThinkingBudget cap * Add defaultReasoningEffort, thread enableThinking/reasoningEffort to subagent loops and proxy endpoints - Add defaultReasoningEffort to IChatEndpoint (computed per model family: high for Anthropic/Gemini, medium for OpenAI) - Use defaultReasoningEffort as fallback in responsesApi, messagesApi, and configurationSchema - Delegate supportsReasoningEffort/defaultReasoningEffort in pass-through endpoints - Thread enableThinking/reasoningEffort through execution and search subagent loops - Add enableThinking: true to oaiLanguageModelServer and claudeLanguageModelServer - Restore maxThinkingBudget cap in customizeCapiBody * refactor: Adjust thinking budget calculation to use endpoint's maxThinkingBudget * Address PR feedback: fix comment, validate effort, remove defaultReasoningEffort - Fix misleading comment in messagesApi 
(thinking gated by enableThinking, not reasoningEffort) - Validate reasoningEffort against known values before sending to Messages API - Remove defaultReasoningEffort from IChatEndpoint and ChatEndpoint - Compute picker default locally in buildConfigurationSchema (UI concern only) - Remove effort fallbacks from messagesApi and responsesApi (pure caller control) * Address PR feedback round 2: validate effort, conditional schema default, location-gated thinking in fetch - Validate reasoningEffort against known values in messagesApi before sending - Fix comment to reflect enableThinking gating (not reasoningEffort) - Remove defaultReasoningEffort from endpoint (picker default is UI-only concern) - Compute picker default locally in buildConfigurationSchema - Gate thinking by location in DefaultToolCallingLoop.fetch() (Agent/MessagesProxy only) - Remove enableThinking from IToolCallingLoopOptions (decision made at fetch level) - Validate effort in BYOK anthropicProvider * refactor: Enable effort picker only for Claude and GPT models in configuration schema · github.com-microsoft-vscode · adeddfb1 · 2026-03-20
- 0.4 ETV · Track compaction summaries as an array with detailed metrics metadata (#4413) * Add Anthropic compaction metadata to IResultMetadata for evals Captures context editing compaction data from Anthropic Messages API and surfaces it through IResultMetadata so the evaluation system can access it. Changes: - Add anthropicCompaction field to IToolCallRound (parallels OpenAI compaction) - Capture Anthropic ContextManagementResponse deltas in tool calling loop - Aggregate compaction metrics (cleared tokens, tool uses, thinking turns) across rounds - Surface compactionMetrics on IResultMetadata via AnthropicCompactionMetadata - Merge into result in defaultIntentRequestHandler's resultWithMetadatas() * Add compaction metrics metadata for evals Surfaces background and foreground compaction (conversation summarization) metrics through IResultMetadata.compactionMetrics so the eval system can track when compaction is triggered and its cost. - Add compactionMetrics to IResultMetadata with type (foreground/background) and token usage - Create CompactionMetadata class on Turn - Set CompactionMetadata in agentIntent for both foreground and background paths - Merge CompactionMetadata into result in defaultIntentRequestHandler * Add durationMs to summarization metadata and update related logic * Refactor compaction metadata handling to use SummarizedConversationHistoryMetadata and remove CompactionMetadata references * Add usage metadata to AgentIntentInvocation and remove compaction metrics documentation * Enhance summary handling by replacing single summary with an array of summaries and updating related logic in conversation normalization * Refactor _persistSummaryOnTurn to use IBackgroundSummarizationResult for improved type safety * Simplify render result handling in AgentIntentInvocation by directly returning the await result * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Refactor 
SummarizedConversationHistoryMetadata to use options object for improved readability and maintainability --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> · github.com-microsoft-vscode · f487a678 · 2026-03-15
- 0.4 ETV · Fix: strip tool_search messages from summarization to prevent tool_reference errors (#4993) The summarization call uses ChatLocation.Other but createMessagesRequestBody still converts custom tool_search results to tool_reference blocks because customToolSearchEnabled isn't gated by isAllowedConversationAgent. Without tool search enabled in the summarization request, Anthropic rejects tool_reference content blocks. Strip tool_search tool_use/tool_result pairs from messages before sending the summarization request. · github.com-microsoft-vscode · 68f93e1d · 2026-04-06
- 0.3 ETV · Bake transcript pointer into conversation summaries at creation time (#311192) Append a stable transcript-file hint (path + line-count snapshot) to every conversation summary, so after compaction the model can read the uncompacted transcript on disk. The hint is appended exactly once at summary-creation time and stored on round.summary / turn metadata. Subsequent renders replay that string byte-identically, preserving Anthropic prompt cache hits even as the transcript keeps growing. Covers all three summarization paths: - Full / Simple via ConversationHistorySummarizer.summarizeHistory() - Inline background via agentIntent.ts _startBackgroundSummarization (flushes the transcript before snapshotting the line count so the baked count matches the on-disk file) Shared via new exported helper appendTranscriptHintToSummary. · github.com-microsoft-vscode · b88bc8ee · 2026-04-19
- 0.3 ETV · fix: preserve full MCP tool schemas for Anthropic API (#4773) Strip $schema but preserve $defs, additionalProperties, and other JSON Schema keywords when building Anthropic tool input_schema. Previously the CAPI and BYOK paths only extracted properties/required, which broke tools using $ref/$defs. Extract shared buildToolInputSchema() helper used by both messagesApi.ts (CAPI) and anthropicProvider.ts (BYOK). Fixes microsoft/vscode#303990 · github.com-microsoft-vscode · c0238e82 · 2026-03-27
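Several commits in this list revolve around one idea from the Cache Explorer entry: the first position whose role, length, or content diverges between two requests is where the prompt cache breaks. The sketch below illustrates that rule only. It is not the code in chatDebugCacheDiff.ts; `NormalizedMessage` and `firstCacheBreak` are hypothetical names, and it assumes each message has already been reduced to a role and a text string.

```typescript
// Hypothetical normalized message: role plus flattened text content.
interface NormalizedMessage {
  role: string;
  text: string;
}

// Returns the index of the first position where the two prompts diverge,
// or undefined when no shared position differs (for example, when the
// current request only appends messages to the previous one).
function firstCacheBreak(
  previous: NormalizedMessage[],
  current: NormalizedMessage[],
): number | undefined {
  const index = previous.findIndex((prev, i) => {
    const curr = current[i];
    if (curr === undefined) {
      return false; // current request is shorter: nothing to compare here
    }
    return (
      prev.role !== curr.role ||
      prev.text.length !== curr.text.length || // cheap check before full compare
      prev.text !== curr.text
    );
  });
  return index === -1 ? undefined : index;
}
```

In this simplified model, everything before the returned index can still be served from the cache and everything from it onward cannot. Request options, tool definitions and cache expiry, which the commits above also treat as causes of cache misses, are deliberately left out.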