Tobias Koppers
tobias.koppers@googlemail.com
90d · built 2026-05-28
90-day totals
- Commits: 125
- Grow: 13.4
- Maintenance: 17.8
- Fixes: 5.8
- Total ETV: 37.1
Where this dev ranks
Percentile against the global top-100 leaderboard (all-time totals).
- By commits: Top 34 %
- By Growth share: Top 59 %
30-day trajectory
Last 30 days vs. the 30 days before. Up arrows on Growth and ETV mean improvement; up arrow on Fixes share means more time on fixes (worse).
Daily performance
Daily ETV, stacked by Growth, Maintenance and Fixes.
Work-mix over time
Share of Growth / Maintenance / Fixes over a rolling 7-day window. Reads as 'where is effort flowing right now'.
Bug flow over time
Monthly bug flow attributed to this developer. The left bar (red) is bug impact this dev authored that was addressed in the given month — combining bugs others fixed for them and bugs they fixed themselves. The right bar is fixes they personally shipped that month, split between self-fixes (overlap with the red bar) and fixes done for someone else. X-axis is fix-time, not introduction-time — the Navigara API attributes bugs backward to the author at the moment the fix lands.
- Self-fix share: 76%
- Bugs you introduced: 32.5
- Bugs you fixed: 19.5
Repository spread
Where this developer's commits land. Concentrated work (top1 > 80%) vs polymath spread (top1 < 30%).
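The two thresholds above can be read as a simple classification on the share of commits that land in the busiest repository. The sketch below is only an illustration: the function name, the input shape (one commit count per repository) and the `Mixed` label for the range between 30% and 80% are assumptions, not something the dashboard defines.

```rust
/// Labels for how a developer's commits are spread across repositories.
#[derive(Debug, PartialEq)]
enum Spread {
    Concentrated, // busiest repository holds more than 80% of commits
    Polymath,     // busiest repository holds less than 30% of commits
    Mixed,        // hypothetical label for everything in between
}

/// Classify by the share of commits in the single busiest repository.
/// `commits_per_repo` is an assumed input: one commit count per repository.
fn classify_spread(commits_per_repo: &[u32]) -> Option<Spread> {
    let total: u32 = commits_per_repo.iter().sum();
    if total == 0 {
        return None; // no commits, so no meaningful share
    }
    let top1 = *commits_per_repo.iter().max()? as f64 / total as f64;
    Some(if top1 > 0.80 {
        Spread::Concentrated
    } else if top1 < 0.30 {
        Spread::Polymath
    } else {
        Spread::Mixed
    })
}
```

With counts of 120 and 5 the top share is 96%, which falls in the concentrated range; four repositories with 10 commits each give a top share of 25%, which falls in the polymath range.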
Most impactful commits
Top 20 by ETV in the 90-day window.
- 3.6 ETV · [Turbopack] Add graph-based CSS chunking algorithm behind experimental.cssChunking: "graph" (#93606)

  ### What?

  Adds an alternative CSS chunking algorithm to Turbopack, opted into via:

  ```js
  // next.config.js
  module.exports = {
    experimental: {
      cssChunking: 'graph',
      // or, with explicit cost overrides:
      // cssChunking: { type: 'graph', requestCost: 20_000, moduleFactorCost: 1 },
    },
  }
  ```

  The new algorithm is **off by default**: Turbopack still uses the existing "loose"/dependencies algorithm unless this flag is set, so this PR is a pure addition for users that don't opt in.

  While we were here, the `experimental.cssChunking` shape was also generalized so every existing string accepts an object form too:

  | Value | Bundler | Notes |
  |---|---|---|
  | `true` / `'loose'` / `{ type: 'loose' }` | both | default heuristic-based chunking |
  | `'strict'` / `{ type: 'strict' }` | webpack | unchanged |
  | `false` | webpack | unchanged (one chunk per CSS module) |
  | `'graph'` / `{ type: 'graph', requestCost?, moduleFactorCost? }` | Turbopack | new |

  Cross-bundler combinations are rejected at config-validation time:

  - `'graph'` with webpack throws.
  - `'strict'` and `false` with Turbopack throw.

  ### Why?

  The existing Turbopack CSS chunker (loose / dependencies) is good at preserving CSS ordering but doesn't share chunks across pages well: every page tends to load its own chunk per CSS module, which scales poorly for apps with many pages and shared component libraries.

  The new "graph" algorithm models the per-chunk-group CSS ordering as a weighted DAG over modules, then greedily merges adjacent runs in the global topological order whenever the merge reduces total cost. The cost model charges every CSS request and overshipped byte, with two tunable knobs (`requestCost` and `moduleFactorCost`).

  **Trade-off vs. the loose default.** With the default cost parameters (`requestCost: 20_000`, `moduleFactorCost: 1`) the graph algorithm typically ships **less CSS per chunk group at the cost of more requests** than the loose algorithm. The cost model is tuned to avoid overshipping unrelated CSS into pages that don't need it; on apps where the loose algorithm was already collapsing a lot into one big chunk that some pages didn't actually use, the graph algorithm will split it. Apps that prefer fewer requests can raise `requestCost`; apps that prefer less overshipping can raise `moduleFactorCost`.

  This is opt-in and Turbopack-only because:

  - The cost model is sensitive to per-app properties (number of pages, size distribution of CSS modules, …), so keeping it experimental gives us room to tune defaults from real usage.
  - Webpack already has its own `CssChunkingPlugin` and `'strict'` mode that cover the equivalent design space; we don't want to fork that.

  ### Performance

  Measured on `vercel.com` (the full graph algorithm spans `create_graph → make_acyclic → linearize → split_into_chunks → assemble`):

  - **~3s** end-to-end for the synchronous chunking pipeline on a realistic production input.

  Implementation choices that matter for that throughput:

  - Tarjan SCC uses `Vec<u32>` / `Vec<bool>` scratch arrays indexed by `NodeIndex`, with no hashing on `indices` / `lowlinks` / `on_stack`.
  - `make_acyclic` batches multiple cuts per SCC pass by seeding successive short-cycle searches at the previous cut's target, only re-running Tarjan when no further cycle is reachable from the seed.
  - `find_short_cycle` is a bidirectional Dijkstra over a `BinaryHeap` with predecessor pointers (no path cloning) and skips its refinement loop for trivial 2-cycles.
  - `split_into_chunks` picks the next merge from a `BinaryHeap` keyed on the cost delta instead of an O(N) linear scan per merge.
  - `chunk_cost` reads a once-built `module_to_groups` inverse index instead of scanning every chunk group on every call; the GlobalStyle leakage check uses binary search on the inverse index rather than scanning each group's module list.

  ### How?

  #### Module layout (`turbopack/crates/turbopack-core/src/module_graph/`)

  The two algorithms are deliberately split so neither imports from the other:

  - `style_groups/`: algorithm-neutral output types (`StyleGroups`, `StyleItemInfo`, `make_style_groups`). Both algorithms produce these.
  - `style_groups_loose/`: the existing ("loose") algorithm plus the shared config types (`StyleGroupsAlgorithm`, `StyleGroupsConfig`, `F32TaskInput`).
  - `style_groups_graph/`: the new algorithm. Pure Rust, no `Vc`, with `petgraph::DiGraph` plus a thin `SubgraphView` wrapper and a small `ReadonlyGraph` trait that lets the same pipeline run against either a `&DiGraph` or a filtered view of one SCC.

  #### Algorithm

  ```text
  create_graph → make_acyclic → linearize → split_into_chunks → assemble batches
  ```

  1. **`create_graph`**: for each chunk group, every `(later, earlier)` pair inside the group's CSS-module list becomes an edge `later → earlier` (weight 1, accumulated). Heavy edges = strong co-occurrence.
  2. **`make_acyclic`**: co-occurrence almost always introduces cycles; each multi-node SCC has its lowest-weight cycle edge cut until the graph is a DAG.
  3. **`linearize`**: Kahn-style topological sort with a tie-break on edge weight, so strongly co-occurring modules end up adjacent in the global order.
  4. **`split_into_chunks`**: greedy bottom-up merger over the global order. At every active split point we score the merge as `cost(merged) - cost(left) - cost(right)`, take the most-negative score from a min-heap, and repeat until no merge would reduce cost. `max_chunk_size` and "global CSS must not leak into unrelated chunk groups" are enforced as `+infinity` cost.

  The cost model is:

  ```text
  cost_per_group(chunk, group) = chunk_size + (chunk_size / group_total_size) * module_factor_cost + request_cost
  ```

  summed over the chunk groups that load the chunk.

  #### Wiring

  - `StyleGroups::shared_chunk_items` is a `FxIndexMap<ChunkItemWithAsyncModuleInfo, StyleItemInfo>` where `StyleItemInfo { order: Option<u32>, batch: Option<…> }`. The graph algorithm fills `order` so `style_production.rs` can stable-sort chunks globally; the legacy algorithm leaves `order = None`, which makes the sort a no-op for it. `flatten_and_sort` returns the `StyleItemInfo` references alongside each chunk item so the per-item loop doesn't re-query the map.
  - A new `StyleGroupsAlgorithm` enum on `ChunkingConfig` selects the algorithm at chunking time; `ModuleGraph::style_groups` dispatches to either `compute_style_groups` (existing) or `compute_style_groups_graph` (new).
  - `next-core` exposes `NextConfig::css_chunking() -> Vc<CssChunkingAlgorithm>` resolving the JS `experimental.cssChunking` to the Rust enum, with cost defaults applied (`requestCost: 20_000`, `moduleFactorCost: 1`). All three chunking-context constructors (`next_client`, `next_edge`, `next_server`) thread it through.

  #### Configuration

  - `experimental.cssChunking` zod schema accepts the new shapes; cost params are `z.number().nonnegative().finite().optional()`.
  - `config-shared.ts` exports a `CssChunkingConfig` type alias and a `resolveCssChunkingMode(value)` helper that normalizes any input to one of `'off' | 'loose' | 'strict' | 'graph'`. Both `webpack-config.ts` (plugin wiring) and `config.ts` (bundler-compat validation) use the helper.
  - New `errors.json` entries for the three bundler-compatibility validation errors (E1193 graph-on-webpack, E1194 strict-on-Turbopack, E1195 false-on-Turbopack).

  #### Tests

  - 53 Rust unit tests in `style_groups_graph/tests.rs` cover `create_graph`, Tarjan SCC, `find_short_cycle` (bidirectional Dijkstra), `make_acyclic`, `linearize`, `split_into_chunks`, and end-to-end pipeline scenarios.
  - `test/e2e/app-dir/css-order/css-order.test.ts` is parametrised over `[label, value]` pairs. The Turbopack matrix now includes `'graph'` and an object-form `{ type: 'graph', requestCost: 1, moduleFactorCost: 1 }` in addition to the existing default. Per-page expectations grew a `requests` object encoding distinct request counts for `loose` and `graph` where they differ.
  - A new `sandwich` e2e fixture (`/sandwich/a`, `/sandwich/b`) exercises the case where two pages share a leading and trailing chunk around a unique middle stylesheet, including a global stylesheet that the algorithm must not leak into unrelated chunk groups. The graph algorithm hits the optimal 3 chunks per page on this fixture; loose mode falls short.

  #### Documentation

  - `ExperimentalConfig.cssChunking` JSDoc describes every accepted shape and what each cost knob does.
  - The `style_groups_graph` module-level docs describe the pipeline, cost model and constraints with diagrams.

  Closes NEXT-

  <!-- NEXT_JS_LLM_PR -->

  ---------

  Co-authored-by: v-work-app[bot] <262237222+v-work-app[bot]@users.noreply.github.com>
  Co-authored-by: Claude <noreply@anthropic.com>
  Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>

  github.com-vercel-next.js · e2fca6c6 · 2026-05-18
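To make the cost model from the CSS chunking commit above concrete, here is a minimal sketch that transcribes the `cost_per_group` formula and the merge score `cost(merged) - cost(left) - cost(right)`. It is not the code from `style_groups_graph/`: the struct, the function signatures and the choice of `f64` are assumptions made for illustration, and the `max_chunk_size` and global-CSS constraints are left out.

```rust
/// Tunable knobs, named after the `requestCost` / `moduleFactorCost` options.
struct CostParams {
    request_cost: f64,
    module_factor_cost: f64,
}

/// Cost of one chunk as seen from one chunk group that loads it
/// (direct transcription of the formula in the commit message).
fn cost_per_group(chunk_size: f64, group_total_size: f64, p: &CostParams) -> f64 {
    chunk_size + (chunk_size / group_total_size) * p.module_factor_cost + p.request_cost
}

/// Total cost of a chunk: the per-group cost summed over every chunk group
/// that loads it. `group_total_sizes` is an assumed input shape holding the
/// total CSS size of each such group.
fn chunk_cost(chunk_size: f64, group_total_sizes: &[f64], p: &CostParams) -> f64 {
    group_total_sizes
        .iter()
        .map(|&g| cost_per_group(chunk_size, g, p))
        .sum()
}

/// Score of merging two adjacent chunks; negative means the merge reduces cost.
fn merge_score(merged: f64, left: f64, right: f64) -> f64 {
    merged - left - right
}
```

As a worked case with the default parameters (`request_cost = 20_000`, `module_factor_cost = 1`): two chunks of 1000 and 500 bytes that are both loaded by the same single group of total size 10_000 cost 21000.1 and 20500.05, and the merged 1500-byte chunk costs 21500.15. The merge score is about -20000, so the merge saves exactly one request and overships nothing.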
- 1.9 ETV · turbo-persistence: stop background persisting after unrecoverable failure (#92106)

  ### What?

  When a persist or compaction operation fails in `turbo-persistence`, the database now:

  - Rolls back cleanly (deletes orphan files, restores CURRENT)
  - Stops the background persisting process for the session
  - Keeps in-memory state consistent with on-disk state at all times
  - Deletes superseded files safely (with Windows fallback for open memory maps)

  ### Why?

  Previously, a failed write operation (e.g. disk full, I/O error) would leave the database in a broken state:

  1. **Misleading error loop**: The `active_write_operation` `AtomicBool` was left set to `true` after failure, so every subsequent snapshot cycle printed _"another write operation is already in progress"_ forever, hiding the real error.
  2. **In-memory corruption**: `commit()` mutated `inner.meta_files` and `inner.current_sequence_number` *before* writing the CURRENT file to disk. If a disk error occurred between those two steps, the in-memory state was inconsistent with disk and the rollback had no way to fix it.
  3. **Rollback could corrupt committed data**: If `commit()` failed *after* writing CURRENT (e.g. during old-file deletion or LOG writing), the rollback would delete the *newly committed* files, corrupting the database.
  4. **Task graph corruption**: `save_snapshot` consumes task cache log entries. If it failed, those entries were lost, but the background loop would continue trying to persist, silently skipping those tasks and corrupting the task graph in storage.
  5. **Partially written CURRENT**: If the failure happened mid-write to the CURRENT file, it could be left with partial/corrupt content, but nothing restored it.

  ### How?

  **`WriteOperationGuard` RAII (db.rs)**

  A new `WriteOperationGuard<'a>` replaces the `AtomicBool` + manual `try_recover_after_failed_write()` pattern. The guard holds:

  - `&'a Mutex<Option<ActiveWriteState>>`: the write slot (`None` = idle, `Some(Active("write batch"))` = in progress, `Some(Error)` = permanently disabled)
  - `path: &'a Path`: database directory for rollback
  - `seq_before: u32`: sequence number at operation start
  - `succeeded: bool`: set by `guard.success()`

  On `drop`, if not succeeded:

  1. Writes `seq_before` back to CURRENT (repairs a partially-written CURRENT)
  2. Deletes all files with `seq > seq_before` (orphans from the failed operation)
  3. Sets the slot to `None` (success) or `Some(Error)` (if cleanup itself failed)

  The `Active` variant carries a `&'static str` name (e.g. `"write batch"`, `"compaction"`) used in error messages.

  **Three-phase `commit()` (db.rs)**

  `commit()` is restructured so `inner` is completely unmodified before the point of no return:

  | Phase | What happens | `inner` state | On failure |
  |-------|-------------|---------------|------------|
  | **A** | Compute `meta_seq_numbers_to_delete` via `sst_filter`. Uses `apply_filter_collect` (read-only) to update filter state and collect per-meta-file removal sets without modifying any MetaFile. Only a read lock on `inner` is needed. | Unchanged | Guard deletes orphan files + restores CURRENT; `inner` is intact |
  | **B** | Write `.del` file and CURRENT to disk. | Unchanged | Same as above |
  | **C** | Apply deferred `retain_entries` (from A's removal sets), append new metas, remove obsolete metas, bump `current_sequence_number`. Try to delete superseded files; defer failures. | Updated | CURRENT is already durable; commit is irreversible |

  After CURRENT is written (point of no return), LOG writing errors are caught and reported via `eprintln!`. They must not propagate because the `WriteOperationGuard` would then run its rollback and delete the *newly committed* files.

  **`SstFilter::apply_filter_collect` (sst_filter.rs)**

  A new read-only variant of `apply_filter` that updates the filter state and returns a `FxHashSet<u32>` of SST entry sequence numbers to remove from each meta file, without calling `retain_entries`. The original `apply_filter` (which mutates the MetaFile) is still used by `load_directory` and during new-meta-file construction where immediate mutation is appropriate.

  **Deferred file deletion (db.rs)**

  Superseded `.sst`/`.meta`/`.blob` files are deleted immediately after Phase C (once `inner` is updated). On Linux/macOS this always succeeds, even if concurrent readers have the files memory-mapped. On Windows, open memory maps prevent deletion: any file that fails is stored as a `DeferredDeletion` enum (`Sst(u32)` / `Meta(u32)` / `Blob(u32)`) and retried on the next commit or at shutdown. The `.del` file written during Phase B ensures crash recovery via `load_directory` regardless.

  **Background loop error handling (backend/mod.rs)**

  - `snapshot_and_persist()` returns `Result<(Instant, bool), anyhow::Error>` instead of `Option`. When `save_snapshot` fails, the error propagates with `?`.
  - The background loop matches on the `Result`: on `Err`, it logs the error and a message that persisting is disabled for this session, then returns (permanently stopping the background job).
  - `has_unrecoverable_write_error()` checks the `ActiveWriteState::Error` variant to detect permanent failure after compaction errors.

  <!-- NEXT_JS_LLM_PR -->

  ---------

  Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>
  Co-authored-by: Claude <noreply@anthropic.com>

  github.com-vercel-next.js · fff4a4d9 · 2026-04-14
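The rollback-on-drop idea behind `WriteOperationGuard` in the turbo-persistence commit above can be illustrated with a much smaller, in-memory sketch. The assumptions are significant: the on-disk CURRENT file is replaced by a `Mutex<u32>`, orphan-file deletion is omitted, the `start` constructor is a hypothetical name, and the slot is always reset to `None` on drop, whereas the commit describes setting `Some(Error)` when cleanup itself fails.

```rust
use std::sync::Mutex;

/// State of the single write slot; `None` in the surrounding `Option` means idle.
#[derive(Debug, PartialEq)]
enum ActiveWriteState {
    Active(&'static str), // name of the operation in progress
    Error,                // permanently disabled (never set in this sketch)
}

/// Simplified guard: restores the sequence number on drop unless `success()` ran.
struct WriteOperationGuard<'a> {
    slot: &'a Mutex<Option<ActiveWriteState>>,
    current: &'a Mutex<u32>, // stand-in for the on-disk CURRENT file (assumption)
    seq_before: u32,
    succeeded: bool,
}

impl<'a> WriteOperationGuard<'a> {
    /// Claim the slot; returns `None` if it is busy or permanently disabled.
    fn start(
        slot: &'a Mutex<Option<ActiveWriteState>>,
        current: &'a Mutex<u32>,
        name: &'static str,
    ) -> Option<Self> {
        let mut state = slot.lock().unwrap();
        if state.is_some() {
            return None;
        }
        *state = Some(ActiveWriteState::Active(name));
        let seq_before = *current.lock().unwrap();
        Some(Self { slot, current, seq_before, succeeded: false })
    }

    /// Mark the operation as committed; the drop that follows skips the rollback.
    fn success(mut self) {
        self.succeeded = true;
    }
}

impl Drop for WriteOperationGuard<'_> {
    fn drop(&mut self) {
        if !self.succeeded {
            // Roll back to the sequence number recorded at operation start.
            *self.current.lock().unwrap() = self.seq_before;
        }
        // Release the slot so the next operation can start.
        *self.slot.lock().unwrap() = None;
    }
}
```

The point of the pattern is that every early return or `?` inside the guarded operation triggers the same cleanup, so the slot cannot be left claimed the way the old `AtomicBool` was.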
- 1.8 ETV · Add `next internal static-routes-info` CLI command (#93399)

  ### What?

  Adds a new internal CLI subcommand:

  ```
  next internal static-routes-info [directory] [options]
  ```

  It runs against an already-built Next.js app (`.next/` from `next build`), reads the manifests under `distDir`, and reports per-route bundle sizes split into six file categories. Supports markdown (default) or JSON output, sorting, limiting, and per-category file listings.

  ### Why?

  We want a **static, build-output-based** way to compare different chunking strategies for JS and CSS without running the user's app. Existing tooling either requires running the app (`@next/bundle-analyzer` is a webpack plugin), is bundler-specific, or aggregates per-bundle/per-asset rather than per-route.

  This tool answers the concrete question "how much JS / CSS does each route ship today?" purely by reading static manifests, so it can be diffed across builds, branches, or bundlers (Turbopack vs webpack) and used to evaluate chunking changes.

  It is namespaced under `next internal` because the output format and category boundaries are tied to internal manifest shapes; we don't intend to make it a stable public surface yet.

  ### How?

  #### Command surface

  ```
  Usage: next internal static-routes-info [directory] [options]

  Options:
    --json          Output as JSON instead of markdown.
    --limit <n>     Only show the first N routes after sorting (totals always reflect all routes).
    --sort <key>    Sort routes by: name (default, ascending), or one of client, client-js, client-css, client-map, server, server-bundled-js, server-unbundled, server-map, total (descending).
    --files         Include the list of files (relative to the output directory) per category in the JSON output. Requires --json.
    -h, --help      Displays this message.
  ```

  `--limit` and `--sort` always consider every route for the totals; only the displayed table is trimmed/reordered. `--files` is only meaningful in JSON output and errors otherwise. Invalid `--sort` keys error with the valid set listed.

  #### Six categories per route

  Each file the tool sees is placed into **exactly one** of these six buckets, so totals are not double-counted:

  | Category | Description |
  | --- | --- |
  | Client JS | `.js` chunks loaded by the browser |
  | Client CSS | `.css` files loaded by the browser |
  | Client Source Maps | `.map` files for client JS / CSS |
  | Server Bundled JS | `.js` chunks executed on the server (App Router, Pages SSR, route handlers, middleware) |
  | Server Unbundled | Files traced via `*.nft.json` outside `distDir` (typically `node_modules` deps for `serverExternalPackages` / Pages SSR) |
  | Server Source Maps | `.map` files for server JS, including `.map`s referenced from nft.json |

  Source maps are discovered three ways: `.map` extension matches, `//# sourceMappingURL=...` trailers in JS, and `/*# sourceMappingURL=...*/` trailers in CSS. Maps always go into the `Maps` category even when the manifest puts them next to their bundle, so they never inflate Bundled or CSS counts. `sourceMappingURL` reads are memoized per chunk so a chunk shared by N routes is opened once.

  #### Two-step measurement

  1. **Capture per-route file sets** by reading manifests:
     - `pages-manifest.json` and `app-path-routes-manifest.json` for the route list.
     - `<entry>.nft.json` for server-bundled chunks plus traced node_modules deps.
     - `<entry>_client-reference-manifest.js` for App Router client JS/CSS (both `entryJSFiles` / `entryCSSFiles` on Turbopack and `clientModules.chunks` on webpack; the parser handles both layouts).
     - `build-manifest.json` for shared App Router root chunks (`rootMainFiles`) and Pages Router client chunks.
     - `middleware-manifest.json` for middleware and edge route handlers.
  2. **Deduplicate inside each per-route category, then measure.** A global `lstat` cache stats every unique path once across the whole run; per-category sets dedupe via string equality. Files are routed by extension (`.map` → Maps, `.css` → CSS, `.js` → bundled / unbundled depending on whether the path stays inside `distDir`) at the point they enter a set, so e.g. an `.nft.json` referencing both bundle and `.map` paths places each in the right bucket.

  #### Shared metric

  For each route, every category also carries a `sharedAvg`: the average size of the *intersection* between this route and each peer route of the same type. Computed as

  ```
  sharedAvg = (Σ over peers p: |files(this) ∩ files(p)|) / number_of_peers
  ```

  with both file count and bytes reported. The metric is also expressed as a percentage of the route's own count and bytes (`percentCount`, `percentBytes`) to make sharing easy to interpret at a glance, e.g. `5.3 files (88%) / 424.12 KB (100%)` means "88% of this route's files, and effectively all of its bytes, are also shipped by an average peer". Routes with no peers (only one route of their type) get `null`. Note that percentages are NOT commutative across peers (they're divided by each route's own count/bytes) while raw intersection numbers are.

  #### Output

  Markdown (default), with three sections (`## Routes`, `## Shared (avg per other route of same type)`, `## Totals`), each rendered as a fixed-width aligned table. Empty cells render as `-` (and routes with no peers in the Shared section as `n/a`) so meaningful values stand out:

  ```
  ## Routes

  | Route        | Type       | Client JS           | Client CSS      | Client Source Maps | Server Bundled JS    | Server Unbundled    | Server Source Maps  |
  | ------------ | ---------- | ------------------- | --------------- | ------------------ | -------------------- | ------------------- | ------------------- |
  | /            | app-page   | 6 files / 424.40 KB | 2 files / 153 B | -                  | 16 files / 384.47 KB | 140 files / 1.37 MB | 16 files / 2.31 MB  |
  | /api/edge    | app-route  | -                   | -               | -                  | 9 files / 296.82 KB  | -                   | 4 files / 1.53 MB   |
  …

  ## Shared (avg per other route of same type)

  | Route        | Type     | Client JS                          | Client CSS                   | Client Source Maps | Server Bundled JS                | Server Unbundled                  | Server Source Maps             |
  | ------------ | -------- | ---------------------------------- | ---------------------------- | ------------------ | -------------------------------- | --------------------------------- | ------------------------------ |
  | /            | app-page | 5.3 files (88%) / 424.12 KB (100%) | 1.3 files (63%) / 52 B (34%) | -                  | 11 files (69%) / 357.52 KB (93%) | 140 files (100%) / 1.37 MB (100%) | 11 files (69%) / 2.19 MB (95%) |
  …
  ```

  JSON has the same per-category structure (count + bytes + sharedAvg + optional files list when `--files` is used) with identical category ordering: `clientJs`, `clientCss`, `clientMaps`, `serverBundled`, `serverUnbundled`, `serverMaps`. JSON values are exact (e.g. `0/0` is preserved as `{count:0, bytes:0}` rather than `-`) so machine consumers aren't affected by the markdown placeholder. Totals also expose dedup'd `files` arrays under `--files`.

  #### Route types

  Reported types: `app-page`, `app-route`, `pages`, `pages-static`, `pages-api`, `middleware`. App Router route handlers with `runtime: 'edge'` report as `app-route` (not a separate `edge-function`) so they're directly comparable with their Node-runtime peers. Middleware is a first-class type rather than being lumped under edge-function.

  #### Robust manifest parsing

  `_client-reference-manifest.js` is a JS module, not JSON. Both bundlers emit it but with different layouts:

  - Turbopack: multi-line, with a `for (const key in MANIFEST[entry].clientModules) MANIFEST[entry].clientModules[k] = val` suffix when a deployment ID is set.
  - Webpack: single-line, no whitespace around `=`.

  We extract the JSON body without evaluating the file. The implementation locates the `globalThis.__RSC_MANIFEST[` anchor, walks the JS string literal that holds the entry name (honoring `\\` escapes), then balance-walks the `{...}` body. This handles entry names that contain `]` characters, e.g.

  ```js
  globalThis.__RSC_MANIFEST["/(dashboard)/[teamSlug]/(team)/~/stores/(store-details)/blob/[storeId]/page"] = {...}
  ```

  Any structural surprise (anchor missing, unterminated string/object, JSON parse failure) throws with the file path and offset, so we never silently undercount client JS/CSS for a route. Only file-not-found stays as a `null` return; that's a normal case for server entries with no client-reference manifest (middleware, route handlers, etc.).

  ### Tests

  `test/production/static-routes-info/` is a real fixture covering every route type:

  - App Router: `/`, `/about`, `/no-client`, `/items/[itemId]` (a dynamic segment inside a `(group)` route group, which forces `]` to appear unescaped in the manifest entry name and exercises the parser), plus the auto-generated `/_not-found`.
  - App Router route handlers: `/api/node` (default Node runtime), `/api/edge` (`runtime: 'edge'`).
  - Pages Router: `/pages-ssr`, `/pages-ssr-2` (siblings sharing chunks), `/pages-static`, `/api/hello`.
  - Middleware: `middleware.ts`.
  - A shared lib (`lib/shared.ts`) imported by both pages-router siblings and `/`-`/about` to give the shared-avg metric something non-trivial to measure.
  - A `'use client'` `Counter` component imported by `/` and `/about` (but not `/no-client`), which itself imports `counter.module.css`. Routes that import Counter must ship strictly more client JS (and on Turbopack, more client CSS) than `/no-client`. This is asserted, and it's the cross-bundler regression check for the App Router client-JS collection on webpack via `clientModules.chunks` (without it, every webpack app-page reports `clientJs.count = 0`).

  The test file (`static-routes-info.test.ts`) covers all the above plus output formats, sort options, limit semantics, file-list integrity, totals dedup, shared-avg correctness against a hand-computed reference, markdown/JSON consistency, and the empty-cell `-` placeholder. The shared-avg metric is verified three independent ways: against a from-scratch reimplementation that walks the `--files` lists and recomputes every (route, category) cell; against a "sharedAvg.count == own.count IFF every peer is a strict superset" invariant that makes 100% values load-bearing; and against a hand-known case where one route ships a chunk no peer does, forcing strictly-below-100% sharing.

  31 tests, passing on both Turbopack and webpack. The tool was also exercised against `bench/basic-app`, `bench/heavy-npm-deps`, `bench/nested-deps`, `bench/app-router-server`, and `bench/nested-deps-app-router` while developing.

  ### Notes for reviewers

  - New error codes added to `errors.json` for the manifest-parser throws and other invariant violations.
  - The command is registered under `next internal`; not advertised in user-facing docs by design.
  - Webpack quirk documented in the test: `flight-manifest-plugin.ts`'s `mergeManifest` merges every app-page's `entryCSSFiles` into every other route's CRM, so per-route CSS attribution on webpack is inherently fuzzy. The test asserts CSS attribution on Turbopack only, and the comment explains why.

  <!-- NEXT_JS_LLM_PR -->

  ---------

  Co-authored-by: v-work-app[bot] <262237222+v-work-app[bot]@users.noreply.github.com>
  Co-authored-by: Claude <noreply@anthropic.com>
  Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>

  github.com-vercel-next.js · eab3ab87 · 2026-05-12
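The `sharedAvg` formula from the `static-routes-info` commit above is small enough to sketch directly. The tool itself lives in the Next.js CLI; the Rust version below only illustrates the file-count half of the metric, and the function names and the use of `HashSet<&str>` for per-route file sets are assumptions.

```rust
use std::collections::HashSet;

/// Average number of files this route shares with each peer of the same type.
/// Returns `None` when there are no peers, mirroring the `null` in the description.
fn shared_avg_count<'a>(this: &HashSet<&'a str>, peers: &[HashSet<&'a str>]) -> Option<f64> {
    if peers.is_empty() {
        return None;
    }
    // Sum of |files(this) ∩ files(p)| over all peers p.
    let total: usize = peers.iter().map(|p| this.intersection(p).count()).sum();
    Some(total as f64 / peers.len() as f64)
}

/// The shared average expressed as a percentage of the route's own file count.
fn percent_of_own(shared_avg: f64, own_count: usize) -> f64 {
    100.0 * shared_avg / own_count as f64
}
```

For a route with files `{a, b, c}` and peers `{a, b}` and `{a, b, c, d}`, the intersections have sizes 2 and 3, so the shared average is 2.5 files. The sample output above is consistent with the same arithmetic: 5.3 shared files out of the route's 6 client JS files is roughly 88%. Because each percentage divides by the route's own count, it differs between two routes even though their raw intersection is the same, which is the non-commutativity the commit message points out.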
- 1.5 ETV · turbopack: cache TransformPlugin in narrow-scoped turbo-tasks functions (#92842)

  ### What?

  Refactor all usages of `EcmascriptInputTransform::Plugin(ResolvedVc::cell(Box::new(...) as _))` across `next-core` and `turbopack-tests` so that `TransformPlugin` cells are created inside dedicated `#[turbo_tasks::function]` functions rather than inline at call sites.

  Additionally:

  - Introduce a `JsonValue` newtype wrapping `serde_json::Value` that implements `TaskInput`, enabling the SWC wasm plugin list to be passed through a turbo-tasks function boundary and properly cached.
  - Replace the `bool` roundtrip in `next_strip_page_exports` with an `ExportFilterInput` enum that derives `TaskInput`, mirroring `ExportFilter` exhaustively (so a new upstream variant is a compile error, not a silent fallback).
  - Derive `TaskInput` on `ActionsTransform` and pass it directly to the cached function instead of converting to `is_server: bool` at the call site.
  - Replace all `.expect("... config must exist")` panics in option-gated plugin functions (`emotion`, `styled_components`, `react_remove_properties`, `remove_console`, `relay`) with `.context(...)?` for proper error propagation.
  - Add `// TODO: use get_ecma_transform_rule instead` comments to the ~10 transform functions that manually inline the same `ModuleRule::new` + `ExtendEcmascriptTransforms` pattern that `get_ecma_transform_rule` abstracts.

  ### Why?

  `TransformPlugin` is not serializable and not comparable. When a `TransformPlugin` is `cell`ed inline (i.e. `ResolvedVc::cell(Box::new(...) as _)`) inside a turbo-tasks function, a new cell is created on every invocation of the enclosing function, because the framework has no way to detect that the value is the same as before. This causes every task that depends on the `TransformPlugin` cell to be invalidated unnecessarily.

  By moving the `Vc::cell(...)` call into its own narrow-scoped `#[turbo_tasks::function]`, turbo-tasks can cache the cell by the function's inputs. If the inputs haven't changed, the function won't re-run, the existing cell is reused, and downstream tasks are not invalidated.

  ### How?

  Each inline `ResolvedVc::cell(Box::new(SomeTransformer { ... }) as _)` is replaced with:

  ```rust
  some_transform_plugin(args).to_resolved().await?

  #[turbo_tasks::function]
  fn some_transform_plugin(args: ...) -> Vc<TransformPlugin> {
      Vc::cell(Box::new(SomeTransformer { ... }) as Box<dyn CustomTransformer + Send + Sync>)
  }
  ```

  Where a cached function needs to store a `ResolvedVc` in the resulting transformer struct, the parameter is declared as `ResolvedVc<T>` directly in the `#[turbo_tasks::function]` signature. The turbo_tasks macro rewrites `ResolvedVc<T>` → `Vc<T>` in the external call-site signature, and the call site passes a dereferenced `*resolved_vc`. This avoids a redundant `.to_resolved().await?` inside the function body.

  For the SWC wasm plugin case, `serde_json::Value` (used for per-plugin config) doesn't implement `Hash` or `TaskInput`. A `JsonValue` newtype is introduced with:

  - `#[bincode(with = "turbo_bincode::serde_self_describing")]` for serialization
  - Manual `Hash` impl that hashes the JSON string representation
  - Manual `TaskInput` impl (`is_transient = false` since the type contains no `Vc`s)

  <!-- NEXT_JS_LLM_PR -->

  ---------

  Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>
  Co-authored-by: Claude <noreply@anthropic.com>
  Co-authored-by: Niklas Mischkulnig <mischnic@users.noreply.github.com>

  github.com-vercel-next.js · f4972228 · 2026-04-16
- 1.2 ETV · Turbopack: implement module.hot.accept(deps, cb) and module.hot.decline(deps) (#90443)

  ### What?

  Implements dependency-level HMR accept and decline for Turbopack, covering both ESM (`import.meta.turbopackHot`) and CJS (`module.hot`) modules.

  Previously Turbopack only supported self-accept (`module.hot.accept()` with no arguments) and self-decline. This PR adds the full dependency-targeted API:

  - `module.hot.accept(dep, cb)` / `import.meta.turbopackHot.accept(dep, cb)`: single dep
  - `module.hot.accept([depA, depB], cb)` / `import.meta.turbopackHot.accept([depA, depB], cb)`: array of deps
  - `module.hot.decline(dep)` / `import.meta.turbopackHot.decline(dep)`: single dep
  - `module.hot.decline([depA, depB])` / `import.meta.turbopackHot.decline([depA, depB])`: array of deps

  ### Why?

  Libraries like `react-refresh` and user code often need to accept updates for specific dependencies rather than self-accepting the entire module. Without this, those HMR patterns fall back to full page reloads in Turbopack.

  ### How?

  **Runtime** (`turbopack-ecmascript-runtime`):

  - Extended `HotState` with `acceptedDependencies`, `acceptedErrorHandlers`, and `declinedDependencies` maps (keyed by `ModuleId`)
  - Updated `hot.accept()` and `hot.decline()` to handle string and array signatures
  - Updated `getAffectedModuleEffects` to check accepted/declined dependencies when propagating updates through the module graph
  - Added `'declined'` effect type that throws an `UpdateApplyError`
  - Track `outdatedDependencies` (map of parent → set of updated deps) alongside `outdatedModules`
  - In the apply phase, invoke per-dependency accept callbacks with the correct outdated deps
  - Use `Set` for `outdatedDependencies` dedup (O(1) vs O(n) lookups)

  **Compiler** (`turbopack-ecmascript`):

  - Added `ModuleHotReferenceAssetReference`, a single asset reference type for both accept and decline deps, with shared resolve logic for ESM (`esm_resolve`) and CJS (`cjs_resolve`)
  - Added `ModuleHotReferenceCodeGen`, which generates code that replaces dep string literals with resolved module IDs at compile time
  - ESM binding auto-update: when an ESM module accepts a dependency that it also `import`s, the compiler wraps the accept callback to re-import the namespace variable (`__TURBOPACK__imported__module__<id> = __turbopack_import__(<id>)`) before the user callback runs, so ESM bindings reflect updated values without needing `require()`
  - Added `import.meta.turbopackHot` as the ESM equivalent of `module.hot`, with TypeScript type declarations in `packages/next/types/global.d.ts`
  - Static analysis extracts dep strings from `module.hot.accept`/`decline` calls; non-analyzable deps emit a warning with distinct error codes (`TP1204` for accept, `TP1205` for decline)

  **HMR gating** (`CompileTimeInfo`):

  - Added `hot_module_replacement_enabled` flag to `CompileTimeInfo` to gate recognition of `module.hot` and `import.meta.turbopackHot` as well-known objects
  - Without this flag, production builds would recognize `module.hot.accept(...)` and generate HMR-specific code, leading to runtime errors
  - Flag set to `true` for dev servers and `false` for production builds across Next.js and turbopack-cli entry points
  - `import.meta.turbopackHot` getter is only emitted when HMR is enabled

  **Server HMR** (`next-api`, `next-core`):

  - Server-side `compile_time_info` now also sets `hot_module_replacement_enabled` so that `module.hot` / `import.meta.turbopackHot` are recognized during analysis of server modules
  - Server HMR requires the `--experimental-server-fast-refresh` CLI flag; the flag is passed through `ProjectOptions` to the Rust side so `server_compile_time_info` only enables HMR when appropriate

  **Tests** (`test/development/app-dir/hmr-dep-accept/`):

  - ESM single dep accept: verifies parent module is not re-evaluated, accept callback fires, ESM bindings auto-update
  - ESM array dep accept: same as above with `accept(['./dep-a', './dep-b'], cb)`
  - CJS `module.hot.accept`: pure CJS dep observer pattern with `.cjs` files
  - Single dep decline: verifies full page reload occurs
  - Array dep decline: verifies full page reload with `decline(['./dep-a', './dep-b'])`

  github.com-vercel-next.js · 014b9987 · 2026-03-09
- 1.0 ETV · Turbopack: Add import.meta.glob support (Vite compat) (#92640)

  ## What?

  Adds support for [Vite's `import.meta.glob`](https://vite.dev/guide/features.html#glob-import) in Turbopack. This is a compile-time transform that resolves glob patterns into a map of module paths to lazy/eager imports.

  ## Why?

  This is a commonly used Vite feature for dynamically importing groups of files (e.g., all markdown files in a directory, all route modules matching a pattern).

  ## How?

  ### Core implementation (`import_meta_glob.rs`)

  1. **Analysis phase**: `import.meta.glob` is recognized as a well-known function via `WellKnownFunctionKind::ImportMetaGlob`. When called, arguments are statically analyzed to extract patterns and options.
  2. **File discovery**: Uses Turbopack's `read_glob` (with `Glob::can_match_in_directory` for efficient directory pruning) to find matching files. Negative patterns (prefixed with `!`) are applied via a separate `Glob` matcher.
  3. **Virtual module**: Each unique `import.meta.glob()` call generates a virtual `ImportMetaGlobAsset` module that exports an object mapping file paths to either:
     - **Lazy mode** (default): `() => import('./path')` thunks, resolved with `EcmaScriptModulesReferenceSubType::DynamicImport`
     - **Eager mode**: Direct `require('./path')` results, resolved with `EcmaScriptModulesReferenceSubType::Import`
  4. **Code generation**: The `import.meta.glob(...)` call site is replaced with `__turbopack_require__(virtual_module_id)`.

  ### Architecture

  - **`ImportMetaGlobAsset`** is the virtual module. It stores only the origin, patterns, and options; it holds no `source` reference.
  - **`ImportMetaGlobAsset::map()`** is a `#[turbo_tasks::function]` that builds glob matchers, scans the filesystem, and resolves all matched files as ESM imports. Being a turbo-tasks function, the result is memoised. Both `references()` and `chunk_item_content()` call this single cached function.
  - **`ident()`** is derived from `AssetIdent::from_path(origin_path)` with a modifier encoding all glob options, so two `import.meta.glob()` calls with different options produce different module idents.
  - **Side effects**: Lazy mode → `SideEffectFree` (only exports thunks, nothing evaluated). Eager mode → `ModuleEvaluationIsSideEffectFree` (the virtual module itself has no side effects, but its synchronous requires trigger real module evaluations).

  ### Supported options

  | Option | Description |
  |--------|-------------|
  | `eager` | `boolean`: load modules synchronously (default: `false`) |
  | `import` | `string`: select a named export (e.g., `'default'`) |
  | `query` | `string`: append query to imports (e.g., `'?raw'`) |
  | `base` | `string`: base directory for glob scanning |

  ### Error handling

  Unsupported or invalid usage produces clear compile-time diagnostics (error code `TP1008`):
  - `as` option → "not supported, use `query` instead"
  - Unknown option keys → lists supported options
  - Non-constant `eager` → "must be a constant boolean"
  - Non-constant patterns → "must be string literals"

  ### Not supported (intentionally)

  - `import.meta.globEager()` (removed in Vite 3): users should use `{ eager: true }`
  - `as` option (deprecated in Vite 5): users should use `query`

  ## Test Plan

  - [x] **Turbopack execution test** (`turbopack/crates/turbopack-tests/tests/execution/turbopack/resolving/import-meta-glob/`)
    - Lazy mode, eager mode, named import (`import: 'default'`), negative patterns
  - [x] **Turbopack snapshot tests** (`turbopack/crates/turbopack-tests/tests/snapshot/import-meta/glob/`)
    - Lazy glob, eager glob, named import
    - Negative patterns (`!**/bar.js`)
    - Multiple patterns (`['./dir/*.js', './other/*.js']`)
  - [x] **Turbopack snapshot error tests** (`turbopack/crates/turbopack-tests/tests/snapshot/import-meta/glob-error/`)
    - `as` option error, non-constant `eager` error, unknown option error
  - [x] **Next.js e2e tests** (`test/e2e/import-meta-glob/`)
    - Lazy/eager/named import modules, negative patterns, multi-pattern
    - Passes in both dev (`next dev`) and production (`next build && next start`) modes
    - Skipped under webpack (`IS_WEBPACK_TEST=1`) since this is a Turbopack-only feature

  Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>
  Co-authored-by: Claude <noreply@anthropic.com>
  Co-authored-by: Niklas Mischkulnig <mischnic@users.noreply.github.com>

  github.com-vercel-next.js · 38fdf0c0 · 2026-04-13
- 1.0 ETV · Show generated code from loaders in parse error messages (#89898)

  ## What?

  When a webpack/turbopack loader produces broken code, error messages now display **both** the original source and the generated code with source map information, making it much easier to debug loader issues.

  ## Why?

  Previously, when loaders returned invalid code, error messages only showed the original source file (after source-map remapping). Users had no way to see what the loader actually generated, making it hard to diagnose why the code failed to parse. Showing both sides gives full context about what went wrong.

  ## How?

  ### Turbopack Core (`turbopack-core`)

  - **`Source::description()`**: New method on the `Source` trait providing human-readable descriptions of where code comes from. Implemented across all source types (`FileSource`, `VirtualSource`, `WebpackLoadersProcessedAsset`, `PostCssTransformedAsset`, etc.), producing chains like `"loaders [sass-loader] transform of file content of ./styles.scss"`.
  - **`AdditionalIssueSource`**: New struct to hold a labeled source location. The `Issue` trait gains an `additional_sources()` method so issues can expose supplementary code frames.
  - **`GeneratedCodeSource`**: A wrapper that strips `GenerateSourceMap` support from a source, ensuring the *generated* code is displayed as-is rather than being remapped back to the original.
  - **`IssueSource::to_generated_code_source()`**: Helper that detects sources implementing `GenerateSourceMap` and wraps them in `GeneratedCodeSource` for display. Used by `AnalyzeIssue` and `ParsingIssue` to automatically attach generated code frames.

  ### Error Formatting

  - **`turbopack-cli-utils`**: Renders additional sources in CLI issue output.
  - **`format-issue.ts`**: Renders additional sources in the browser error overlay. Extracted `formatSourceCodeFrame()` helper to deduplicate code-frame rendering between primary and additional sources.
  - Long-line truncation (e.g. minified CSS from SCSS) is handled natively by the Rust-based `codeFrameColumns` implementation.

  ### Type Definitions

  - Added `SourcePosition`, `IssueSource`, and `AdditionalIssueSource` interfaces to TypeScript types.
  - Updated `PlainSource` (added `file_path`), `PlainIssue` (added `additional_sources`), and NAPI bindings to pass the data through.

  ### Test Coverage

  - **E2e tests** (`test/e2e/webpack-loader-parse-error/`) with custom broken JS and CSS loaders, covering all 4 modes:
    - **Development (Turbopack)**: Verifies parse errors show both original and generated code via browser error overlay
    - **Development (Webpack)**: Verifies error overlay shows the parse error (webpack doesn't support additional sources)
    - **Production (Turbopack)**: Verifies build failure output with full error extraction and inline snapshots
    - **Production (Webpack)**: Verifies build failure output with inline snapshots
  - Updated `test/development/sass-error/` snapshot to include the new generated code frame for minified SCSS output.

  ### Example Output

  When a loader produces broken code, users now see:

  ```
  ⨯ ./app/data.broken.js:3:1
  Parsing ecmascript source code failed
    1 | // This file will be processed by broken-js-loader
    2 | // The loader will return invalid JavaScript with a source map
  > 3 | export default function Data() {
      | ^
    4 |   return <div>original source content</div>
    5 | }
    6 |

  Expected '</', got '{'

  Generated code of loaders [./broken-js-loader.js] transform of file content of app/data.broken.js:
  ./app/data.broken.js:3:46
    1 | // Generated by broken-js-loader
    2 | export default function Page() {
  > 3 |   return <div>this is intentionally broken {{{ invalid jsx
      |                                              ^
    4 | }
    5 |

  Import trace:
    Server Component:
      ./app/data.broken.js
      ./app/page.js
  ```

  Co-authored-by: Claude <noreply@anthropic.com>
  Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>
  Co-authored-by: Luke Sandberg <lukeisandberg@gmail.com>

  github.com-vercel-next.js · 261922df · 2026-03-16
- 1.0 ETV · feat: add NEXT_HASH_SALT env var for content-hash filename salting (#91871)

  ### What?

  Adds a `NEXT_HASH_SALT` environment variable **and** an `experimental.outputHashSalt` config option that mix a user-supplied string into every content-addressed hash used to generate chunk filenames and static asset filenames. This works for both Webpack and Turbopack.

  When both are set, the values are concatenated (`outputHashSalt + NEXT_HASH_SALT`), so a per-project salt can be baked into `next.config.js` while a per-deployment salt is injected at build time via the environment variable.

  ### Why?

  Content-addressed filenames (e.g. `chunk.abc123.js`) are derived from file content, so they only change when the content changes. There are deployment scenarios where you need to force all filenames to rotate (for example, after a CDN misconfiguration has poisoned caches for a particular hash space) without actually changing source code. A stable, opt-in salt lets operators do this without touching application code.

  Some customers prefer the config-file approach (`turbopack.outputHashSalt`) over environment variables, so both are supported.

  ### How?

  **Webpack** already has `output.hashSalt` in its config. We simply forward `NEXT_HASH_SALT` to that option.

  **Turbopack** required threading the value through several layers:

  1. The effective hash salt is computed once in `assignDefaultsAndValidate` as `config.turbopackHashSalt = (turbopack.outputHashSalt ?? '') + (NEXT_HASH_SALT ?? '')` and stored on `NextConfigComplete`. Both `turbopackBuild` (production) and `createHotReloaderTurbopack` (dev) read from this single field.
  2. `ProjectOptions.hash_salt` receives the pre-computed salt.
  3. `Project` stores the salt and passes it into the three chunking context option structs (`ClientChunkingContextOptions`, `ServerChunkingContextOptions`, `EdgeChunkingContextOptions`).
  4. Both `BrowserChunkingContext` and `NodeJsChunkingContext` gain a `hash_salt: RcStr` field.
  5. A new `deterministic_hash_with_salt(salt, input, algorithm)` function in `turbo-tasks-hash` writes the salt bytes first, then the content bytes, into a single hasher: one pass, no hash-of-hash composition.
  6. A matching `content_hash_with_salt` method is added to `FileContent` and `AssetContent`.
  7. `ChunkingContext::asset_path` is changed to accept `Vc<AssetContent>` (instead of a pre-computed `Vc<RcStr>`) so the chunking context can choose the correct hash path itself. `StaticOutputAsset::path` simplifies accordingly.

  Without `NEXT_HASH_SALT` and without `turbopack.outputHashSalt` set, behaviour is identical to before: no hash change, no performance impact.

  **e2e test** (`test/production/app-dir/hash-salt/`) verifies:
  - Two builds with the same salt produce identical chunk and static asset filenames.
  - A build with a different salt produces different filenames.
  - `turbopack.outputHashSalt` (config) changes filenames vs no salt.
  - Combined config + env salt differs from either alone.
  - Runs for both Turbopack and Webpack.

  Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>
  Co-authored-by: Claude <noreply@anthropic.com>
  Co-authored-by: Luke Sandberg <lukesandberg@users.noreply.github.com>

  github.com-vercel-next.js · 3e015884 · 2026-04-01
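The salting scheme in the `NEXT_HASH_SALT` commit (#91871) can be sketched roughly as follows. This is an illustration only, not the `turbo-tasks-hash` code: `Fnv1a`, `effective_salt`, and `hash_with_salt` are hypothetical names, and FNV-1a is swapped in for xxh3 purely to keep the example free of dependencies. It shows the two ideas the description states: the config salt and the env salt are concatenated in that order, and salt bytes are fed into the hasher before the content bytes in a single pass.

```rust
/// FNV-1a (64-bit): a small stand-in hasher for this sketch only.
/// The PR itself uses xxh3-based hashing.
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV offset basis
    }

    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hypothetical helper: config salt first, then env salt; a missing
/// value counts as the empty string.
fn effective_salt(config_salt: Option<&str>, env_salt: Option<&str>) -> String {
    format!("{}{}", config_salt.unwrap_or(""), env_salt.unwrap_or(""))
}

/// Hypothetical helper: write the salt bytes, then the content bytes,
/// into one hasher (no hash-of-hash composition).
fn hash_with_salt(salt: &str, content: &[u8]) -> u64 {
    let mut hasher = Fnv1a::new();
    hasher.write(salt.as_bytes());
    hasher.write(content);
    hasher.finish()
}
```

With this stand-in hasher, an empty salt writes no bytes, so the unsalted hash is unchanged, which mirrors the "identical to before" guarantee in the description.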
- 1.0 ETV · turbo-tasks: fix hashed cell mode crash on task error (re-land #91576) (#92108)

  ### What?

  Re-lands #91576 ("turbo-tasks: add hashed cell mode for hash-based change detection without cell data"), which was reverted in #92103 due to a `FATAL` crash in the `filesystem-cache` test suite.

  Includes a bug fix on top: in `task_execution_completed_prepare`, skip updating `cell_type_max_index` when the task completed with an error. Also adds a `CellHash = [u8; 16]` type alias (requested in review) used throughout the hash pipeline.

  ### Why?

  **The original feature** (`serialization = "hash"` on `FileContent` and `Code`) stores a hash of the cell data instead of the full serialized value. On session restore, the hash is used to detect whether cell content has changed without needing the full data in memory. This avoids a large persistent cache size increase.

  **The bug** that caused the revert: When a task fails partway through re-execution (before recreating all the cells from its previous run), `cell_counters` only reflects the partially-executed state. The old code used those partial counters to update `cell_type_max_index`, removing entries for cell types that were not yet created at the point of failure. This caused downstream tasks that still held cell dependencies from the previous successful run to hit a hard "Cell no longer exists" error.

  **Concrete failure path** in `filesystem-cache rename app page` test:

  1. `get_app_page_entry` runs for `/remove-me/page`, creating two `FileContent` cells (indices 0 and 1). `cell_type_max_index[FileContent] = 2` is persisted.
  2. The folder is renamed (`app/remove-me` → `app/add-me`), dirtying the task.
  3. On re-execution, `get_app_page_entry` fails at `config.await?` (the loader tree errors because the directory is gone), before any `FileContent::cell()` calls.
  4. `cell_counters` has no `FileContent` entry → old code removed `cell_type_max_index[FileContent]`.
  5. The `parse` task tries to read `FileContent` cell 1 from `get_app_page_entry` → `cell_type_max_index` is `None` → **"Cell no longer exists" panic → FATAL error**.

  **Why it didn't crash before** `serialization = "hash"`: `FileContent` was previously serializable, so `parse` read stale cell data directly from `persistent_cell_data`, which `task_execution_completed_cleanup` already preserves on error. With `serialization = "hash"`, data is transient, so readers fall back to `cell_type_max_index` for range validation, where a stale `None` caused the crash.

  ### How?

  #### Core feature: `serialization = "hash"` cell mode

  - New `SerializationMode::Hash` variant in `turbo-tasks-macros`: marks a value type as non-serializable but stores a `DeterministicHash` of the cell data for change detection.
  - `VcCellHashedCompareMode<T>` cell mode: compares values via `PartialEq` when available, falls back to hash comparison when transient data has been evicted.
  - `hashed_compare_and_update` / `hashed_compare_and_update_with_shared_reference` on `CurrentCellRef` compute and pass content hashes through the update pipeline.
  - Backend `update_cell` uses hash-based comparison to skip invalidation when the old cell data is unavailable but the hash matches.
  - `cell_data_hash: AutoMap<CellId, CellHash>` field in task storage persists hashes across sessions.
  - Stale `cell_data_hash` entries are cleaned up in `task_execution_completed_cleanup` alongside cell data removal.
  - `CellHash = [u8; 16]` type alias keeps alignment at 1 byte to avoid padding growth in `AutoMap`/`LazyField` enum variants.
  - Hash bytes use little-endian encoding (`to_le_bytes`) for cross-platform cache portability.

  #### Bug fix: preserve `cell_type_max_index` on task error

  In `task_execution_completed_prepare`, guard the `cell_type_max_index` update block with `if result.is_ok()`. This mirrors the existing `task_execution_completed_cleanup` behavior that already skips cell data removal when `is_error` is true, keeping `cell_type_max_index` consistent with the preserved transient cell data.

  #### Applied to `FileContent` and `Code`

  - `FileContent` uses `serialization = "hash"`: full content is persisted via a separate `PersistedFileContent` type when needed (e.g., in `DiskFileSystem::write`).
  - `Code` uses `serialization = "hash"` with `Arc<Vec<Mapping>>` for cheap cloning. `Code::cell_persisted()` creates a `PersistedCode` cell directly and returns `Vc<Code>` via `PersistedCode::to_code()`, avoiding an intermediate hash-mode cell.

  #### Other improvements

  - `DeterministicHash` impls for `SmallVec` and `()`.
  - `Xxh3Hash128Hasher::finish_bytes()` method returning `[u8; 16]`.
  - `hash = "manual"` option on `#[turbo_tasks::value]` to opt out of auto-deriving `DeterministicHash`.

  **Note:** The shutdown hang and cache poisoning fixes that were previously on this branch have been merged separately via #92254.

  ### Test plan

  - [x] `test/e2e/filesystem-cache/filesystem-cache.test.ts` passes (all 17 tests)
  - [x] New `turbopack/crates/turbo-tasks-backend/tests/hashed_cell_mode.rs` integration test verifies hash-based change detection: value changes trigger invalidation, equal values (same hash) do not
  - [x] `cargo check` passes for `turbo-tasks`, `turbo-tasks-backend`, `turbo-tasks-fs`, `turbopack-core`, `turbopack-ecmascript`
  - [x] CI green (attempt 2)

  Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>
  Co-authored-by: Claude <noreply@anthropic.com>

  github.com-vercel-next.js · 8db9a752 · 2026-04-07
- 0.9 ETV · turbo-tasks: add hashed cell mode for hash-based change detection without cell data (#91576)

  ## Summary

  ### Core: hash-based cell change detection

  - Adds `cell_data_hash: AutoMap<CellId, [u8; 16]>` to the storage schema (`category = "data"`) so the backend can persist a hash of transient cell data across evictions
    - Stored as `[u8; 16]` (little-endian bytes of a u128) rather than `u128` to keep the 16-byte alignment of `u128` out of `AutoMap` and therefore out of the `LazyField` enum: a bare `u128` would grow the enum from 56 to 64 bytes due to its 16-byte alignment requirement, whereas `[u8; 16]` has 1-byte alignment
  - Adds `content_hash: Option<u128>` to `UpdateCellOperation::run` and threads it through the full call chain: `CurrentCellRef` → `TurboTasksCallApi::update_own_task_cell` → `Backend::update_task_cell` → operation
  - **New invalidation logic** in `UpdateCellOperation::run` for the `assume_unchanged = false` path:
    - Old content available → real equality compare (unchanged)
    - Old content evicted, hashes match → write new content but **skip dependent invalidation**
    - Old content evicted, hashes differ (or missing) → write + invalidate as before
  - `cell_data_hash` is updated whenever content is written (skipped if hash is unchanged; always updated for non-serializable cells regardless of `assume_unchanged`)
  - Adds `hashed_compare_and_update` / `hashed_compare_and_update_with_shared_reference` methods to `CurrentCellRef` (require `T: PartialEq + DeterministicHash`); hash is computed lazily (only after equality check fails when old value is available)
  - Adds `VcCellHashedCompareMode<T>` in `cell_mode.rs` which implements `VcCellMode<T>` for `T: VcValueType + PartialEq + DeterministicHash`

  ### Macro: `serialization = "hash"` and `hash = "manual"`

  - **`serialization = "hash"`**: behaves like `serialization = "none"` (no disk serialization, transient) but uses `VcCellHashedCompareMode` so the stored hash prevents spurious downstream invalidation when transient data is evicted and re-executed
    - Valid only with `cell = "compare"` (the default); combining with `cell = "new"` or `cell = "keyed"` is a compile error
    - Automatically derives `DeterministicHash` on the annotated type
  - **`hash = "manual"`**: opt-out of the auto-derive when a custom `DeterministicHash` impl is needed (analogous to `eq = "manual"`); using `hash = "manual"` without `serialization = "hash"` is a compile error

  ### turbo-tasks-fs: `PersistedFileContent`

  - Adds `PersistedFileContent`, a mirror of `FileContent` that is returned by `Vc<FileContent>::persist()`, storing the same data but obtained through the persistent cache path
  - `FileContent` is switched to `serialization = "hash"` so the macro auto-derives `DeterministicHash`
  - `DiskFileSystem::write()` now calls `.persist().await?` before emitting the write effect, using `PersistedFileContent` for the file comparison and write. This ensures full content is in the persistent cache on restore, avoiding spurious downstream invalidation
  - `WriteContent::File` in `invalidator_map.rs` is updated to hold `ReadRef<PersistedFileContent>`

  ### turbopack-core: `Code` with hash-based serialization

  - Adds `DeterministicHash` impls for `SmallVec<[T; N]>` and `()` in `turbo-tasks-hash`
  - Switches `Code` to `serialization = "hash"` so module factory code cells use hash-based change detection
  - Calls `.persist()` on module factory code cells in `turbopack-ecmascript`

  ### next-core: duplicate asset detection and parallel emit

  - `emit_assets` now emits node and client assets concurrently via `try_join!`
  - Detects duplicate assets (same output path, different content) and returns a hard error with a diff summary; previously the last writer silently won

  ## Macro syntax

  ```rust
  #[turbo_tasks::value(serialization = "hash")]
  struct MyTransientType { ... }

  // opt out of auto-derived DeterministicHash when you need a custom impl:
  #[turbo_tasks::value(serialization = "hash", hash = "manual")]
  struct MyTransientType { ... }
  ```

  ## Test plan

  - [x] Integration tests in `turbopack/crates/turbo-tasks-backend/tests/hashed_cell_mode.rs`:
    - `test_hashed_cell_mode_change_triggers_invalidation`: value change triggers consumer re-execution
    - `test_hashed_cell_mode_equal_value_no_invalidation`: same hash prevents consumer re-execution
  - [x] `cargo test -p turbo-tasks-macros-tests` passes
  - [x] `cargo test -p turbo-tasks-backend --test hashed_cell_mode` passes
  - [x] `cargo check --workspace` passes

  Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>
  Co-authored-by: Claude <noreply@anthropic.com>
  Co-authored-by: Vercel <vercel[bot]@users.noreply.github.com>

  github.com-vercel-next.js · 0090db22 · 2026-03-26
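The three-way invalidation rule listed in the hashed cell mode commit (#91576) can be read as a small decision function. The sketch below is a simplified model under stated assumptions, not the `UpdateCellOperation::run` implementation: `UpdateOutcome` and `decide_update` are hypothetical names, cell content is modelled as a byte slice, and the hash is passed in rather than computed.

```rust
/// Hypothetical outcome type for this sketch.
#[derive(Debug, PartialEq)]
enum UpdateOutcome {
    /// Old content is present and equal: nothing to do.
    Unchanged,
    /// Old content was evicted but the stored hash matches:
    /// write the new content, do not invalidate dependents.
    WriteOnly,
    /// Content (or hash) differs, or no hash was stored:
    /// write the new content and invalidate dependents.
    WriteAndInvalidate,
}

/// Decide what to do when a task produces `new_content` for a cell.
/// `old_content` is `None` when the transient data has been evicted;
/// `old_hash` is `None` when no hash was persisted.
fn decide_update(
    old_content: Option<&[u8]>,
    old_hash: Option<[u8; 16]>,
    new_content: &[u8],
    new_hash: [u8; 16],
) -> UpdateOutcome {
    match old_content {
        // Old content available: real equality compare.
        Some(old) if old == new_content => UpdateOutcome::Unchanged,
        Some(_) => UpdateOutcome::WriteAndInvalidate,
        // Old content evicted: fall back to the persisted hash.
        None if old_hash == Some(new_hash) => UpdateOutcome::WriteOnly,
        None => UpdateOutcome::WriteAndInvalidate,
    }
}
```

The middle case is the one the feature adds: after eviction the data must be rewritten, but dependents can be left alone because the hash shows nothing changed.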
- 0.8 ETV · Turbopack: Tree shaking fixes and code organization (#89295)

  ## Summary

  A series of bug fixes, refactors, and code organization improvements to Turbopack's tree shaking and side effect optimization:

  - **Fix `ModulePart` handling for `FreeVarReference::EcmaScriptModule`**: pass `ModulePart` correctly so free variable references resolve to the right module part
  - **Extract `EcmascriptModuleRenameModule`**: split the rename/reexport logic (`RenamedExport`, `RenamedNamespace`) out of `EcmascriptModuleFacadeModule` into a dedicated `rename` module, shared between side effects optimization and tree shaking
  - **Simplify `EcmascriptModuleFacadeModule`**: now only handles the `Facade` case (locals + reexports), no longer overloaded with rename variants
  - **Reorganize tree shaking code**: moved `asset.rs` → `part/module.rs`, `chunk_item.rs` → `part/chunk_item.rs` + `side_effects/chunk_item.rs`, `side_effect_module.rs` → `side_effects/module.rs` for clearer separation between part modules and side effect modules
  - **Use `EsmExport` in `FindExportFromReexportsResult`**: reduces code duplication by reusing the existing `EsmExport` enum
  - **Minor simplifications**: code deduplication, reordered checks, and general cleanup

  github.com-vercel-next.js · ad030d78 · 2026-03-05
- 0.8 ETV · turbo-tasks: replace async resolve fns with custom Future types (ResolveRawVcFuture, ResolveVcFuture, ToResolvedVcFuture) (#91554)

  ### What?

  Replace the `async fn resolve()`, `async fn resolve_strongly_consistent()`, and `async fn to_resolved()` methods on `RawVc`, `Vc<T>`, and `OperationVc<T>` with hand-written custom `Future` implementations, following the existing `ReadRawVcFuture` pattern.

  New types:
  - **`ResolveRawVcFuture`** (`raw_vc.rs`): core implementation, replaces `async fn resolve_inner()`
  - **`ResolveVcFuture<T>`** (`vc/mod.rs`): typed wrapper over `ResolveRawVcFuture`, returned by `Vc::resolve()`
  - **`ResolveOperationVcFuture<T>`** (`vc/operation.rs`): typed wrapper, returned by `OperationVc::resolve()`
  - **`ToResolvedVcFuture<T>`** (`vc/mod.rs`): typed wrapper, returned by `Vc::to_resolved()`

  All new future types expose a `.strongly_consistent()` builder method, enabling `resolve_strongly_consistent()` to be replaced by `.resolve().strongly_consistent()` at call sites.

  `ReadRawVcFuture` is also updated to delegate its phase-1 resolve loop to `ResolveRawVcFuture` instead of duplicating the logic. `std::task::ready!` is used throughout to simplify poll implementations.

  Also adds `#[inline(never)]` to `ReadRawVcFuture::poll` and `ResolveRawVcFuture::poll` to avoid inlining large poll implementations into every await site.

  ### Why?

  Performance, binary size, and improved API ergonomics:

  - The hand-written `Future` pattern (already used by `ReadRawVcFuture`) gives the compiler more predictable, smaller code than the state machines generated for `async fn`. The `#[inline(never)]` attributes on `poll` prevent large poll bodies from being duplicated at every await site, which the async desugaring otherwise allows.
  - The new builder API (`.resolve().strongly_consistent()`) is more composable and removes the need for separate `_strongly_consistent` method variants, reducing the number of methods on `RawVc`/`Vc`/`OperationVc`.
  - Having `ReadRawVcFuture` delegate to `ResolveRawVcFuture` removes the duplicated resolve loop and ensures both paths stay in sync.

  ### How?

  - `ResolveRawVcFuture` stores `current: RawVc`, `read_output_options: ReadOutputOptions`, `strongly_consistent: bool`, and `listener: Option<EventListener>`. Its `poll` replicates the loop from the old `resolve_inner` using `try_read_task_output` / `try_read_local_output`.
  - On `Err(listener)` from a `try_*` call, the listener is stored in `self.listener` and `Poll::Pending` is returned. At the top of the loop, `ready!(poll_listener(...))` re-polls it and short-circuits if still pending.
  - Consistency is downgraded to `Eventual` after the first `TaskOutput` hop, matching the previous behavior.
  - `strongly_consistent: true` keeps the `SUPPRESS_EVENTUAL_CONSISTENCY_TOP_LEVEL_TASK_CHECK` suppression across all polls (same logic as `ReadRawVcFuture`).
  - `ReadRawVcFuture` now holds a `ResolveRawVcFuture` for phase 1 and drives it via `Pin::new(&mut self.resolve).poll(cx)` before proceeding to the cell read in phase 2. This eliminates the duplicated loop that previously existed in both types.
  - Typed wrappers (`ResolveVcFuture<T>`, `ResolveOperationVcFuture<T>`, `ToResolvedVcFuture<T>`) delegate `poll` to the inner `ResolveRawVcFuture` and map the output to the appropriate typed result.
  - `OperationVc::resolve_strongly_consistent()` is removed; 16 call sites updated to `.resolve().strongly_consistent()`.
  - All new types implement `Unpin` and are exported from `lib.rs`.
  - `std::task::ready!` is used in all `poll` implementations to reduce boilerplate.

  No behavioral changes: this is a pure implementation refactor.

  ### Binary size impact

  A release build (`pnpm swc-build-native --release`) was measured before and after the branch changes on the same merge-base commit (`a41bef94`):

  | | Size |
  |---|---|
  | Base (`a41bef94`, before branch) | 199,690,656 bytes (~190.4 MB) |
  | Branch (`6f7846f9`, after changes) | 199,252,384 bytes (~190.0 MB) |
  | **Difference** | **−438,272 bytes (−428 KB, −0.22%)** |

  The branch produces a slightly smaller binary. The reduction comes primarily from the `#[inline(never)]` attributes preventing large `poll` bodies from being duplicated at every await site.

  Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>
  Co-authored-by: Claude <noreply@anthropic.com>

  github.com-vercel-next.js · f65b10a5 · 2026-04-04
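The hand-written future pattern from the custom `Future` commit (#91554) can be illustrated with a toy example. This is a sketch under loud assumptions: `ResolveFuture`, `TypedResolveFuture`, `NoopWake`, and `block_on` are hypothetical stand-ins that only mimic the shape described (a builder-style `.strongly_consistent()`, `std::task::ready!` in `poll`, `#[inline(never)]` on the core `poll`, and a typed wrapper delegating to the inner future). None of the real turbo-tasks types or resolve logic appear here.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{ready, Context, Poll, Wake, Waker};

/// Toy core future: becomes ready after `hops_left` pending polls.
struct ResolveFuture {
    hops_left: u32,
    value: u32,
    strongly_consistent: bool,
}

impl ResolveFuture {
    fn new(value: u32, hops: u32) -> Self {
        ResolveFuture { hops_left: hops, value, strongly_consistent: false }
    }

    /// Builder-style switch, replacing a separate `_strongly_consistent` method.
    fn strongly_consistent(mut self) -> Self {
        self.strongly_consistent = true;
        self
    }

    /// One simulated hop: pending until all hops are consumed.
    fn poll_hop(&mut self, cx: &mut Context<'_>) -> Poll<()> {
        if self.hops_left == 0 {
            return Poll::Ready(());
        }
        self.hops_left -= 1;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

impl Future for ResolveFuture {
    type Output = (u32, bool);

    #[inline(never)] // keep the poll body out of every await site
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        ready!(self.poll_hop(cx)); // early-returns Poll::Pending
        Poll::Ready((self.value, self.strongly_consistent))
    }
}

/// Typed wrapper: delegates to the inner future and maps its output.
struct TypedResolveFuture {
    inner: ResolveFuture,
}

impl Future for TypedResolveFuture {
    type Output = String;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<String> {
        let (value, strong) = ready!(Pin::new(&mut self.inner).poll(cx));
        let mode = if strong { "strong" } else { "eventual" };
        Poll::Ready(format!("{} ({})", value, mode))
    }
}

/// Waker that does nothing; enough for the busy-polling loop below.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor for the sketch: polls in a loop until ready.
fn block_on<F: Future + Unpin>(mut fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return out;
        }
    }
}
```

A call site then reads `block_on(ResolveFuture::new(7, 3).strongly_consistent())`, which is the same shape as the `.resolve().strongly_consistent()` form the commit introduces.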
- 0.8 ETV · Turbopack: switch chunk/asset hashes from hex to base40 encoding (#91137)

  ### What?

  Switch Turbopack's hash encoding for chunk and asset output filenames from hexadecimal (base16) to base40, using the alphabet `0-9 a-z _ - ~ .`. Version hashes (used for HMR update comparison, not filenames) use base64 instead.

  ### Why?

  Base40 encodes the same number of bits in fewer characters than hex, producing shorter output filenames. All 40 characters are RFC 3986 unreserved (URL-safe) and safe on case-insensitive filesystems (macOS HFS+/APFS, Windows NTFS).

  Hash truncation lengths are reduced proportionally to maintain equivalent collision resistance:

  | Context | Before (hex) | After (base40) | Entropy |
  |---|---|---|---|
  | Content hash in chunk filenames | 16 chars | 13 chars | ~69 bits |
  | Content hash in asset filenames | 8 chars | 13 chars | ~69 bits |
  | Ident disambiguator hash | 8 chars | 7 chars | ~37 bits |
  | Long-path prefix hash | 5 chars | 4 chars | ~21 bits |

  ### How?

  **New encoding module** (`turbo-tasks-hash/src/base40.rs`):
  - Defines the base40 alphabet and length constants (`BASE40_LEN_64 = 13`, `BASE40_LEN_128 = 25`)
  - Implements a generic `encode_base40_fixed<N>` helper to avoid duplication
  - Public API: `encode_base40(u64) -> String` and `encode_base40_128(u128) -> String`

  **New base64 encoding** (`turbo-tasks-hash/src/base64.rs`):
  - `encode_base64(u64) -> String`: 11-char base64 (no padding) for version hashes
  - Version hashes don't appear in URLs or filenames, so base64 is safe and shorter

  **New `HashAlgorithm` variants** (`turbo-tasks-hash/src/lib.rs`):
  - `Xxh3Hash64Base40` and `Xxh3Hash128Base40` added alongside existing hex variants
  - Existing hex variants kept for internal manifests and identifiers

  **`ContentHashing` moved to `turbopack-core`**:
  - Moved from `turbopack-browser` to `turbopack-core/src/chunk/mod.rs` so both `BrowserChunkingContext` and `NodeJsChunkingContext` can use it

  **Separate chunk vs asset content hashing**:
  - `BrowserChunkingContext`: `content_hashing` renamed to `chunk_content_hashing` (optional), new `asset_content_hashing: ContentHashing` field (non-optional, defaults to 13 chars)
  - `NodeJsChunkingContext`: new `asset_content_hashing: ContentHashing` field (non-optional, defaults to 13 chars)
  - Builder methods: `use_content_hashing()` renamed to `chunk_content_hashing()`, new `asset_content_hashing()`

  **Version hashes switched to base64**:
  - `turbopack-nodejs/src/ecmascript/node/version.rs`
  - `turbopack-dev-server/src/html.rs`
  - `turbopack-browser/src/ecmascript/version.rs`, `merged/version.rs`, `list/version.rs`

  **Other callers updated** (15 files across turbopack and next-core):
  - All chunk/asset content hashing switched from `Xxh3Hash128Hex` → `Xxh3Hash128Base40`
  - `ContentHashing::Direct { length }` reduced from 16 → 13
  - Asset path truncations use full 13-char base40 hash (matching chunk filenames)

  **Exception: `wasm_edge_var_name`** (`turbopack-wasm/src/lib.rs`):
  - Kept as `Xxh3Hash128Hex` because the hash is used as part of a JavaScript variable name (`wasm_{hash}`), and base40 characters `-`, `~`, `.` are not valid JS identifier characters.

  **Scope: NOT changed:**
  - Webpack configuration (unchanged)
  - Internal manifests (`routes_hashes_manifest`, `project_asset_hashes_manifest`)
  - Internal identifiers (font naming, external module hashing, data URI sources, debug IDs)
  - SRI hashes (SHA-based Base64, different purpose)

  Co-authored-by: Vercel <vercel[bot]@users.noreply.github.com>
  Co-authored-by: Claude <noreply@anthropic.com>

  github.com-vercel-next.js · e22988e5 · 2026-03-13
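For the base40 commit (#91137), a fixed-width encoder along these lines would produce the 13-character strings the table mentions. This is a minimal sketch, not the contents of `base40.rs`: the character order inside `ALPHABET` follows the order the description lists (`0-9 a-z _ - ~ .`), while the most-significant-digit-first layout and the zero padding are assumptions made for illustration. The width follows from the arithmetic: 40^12 is smaller than 2^64 and 40^13 is larger, so 13 digits cover every `u64`, and 13 characters carry about 13 × log2(40) ≈ 69 bits.

```rust
/// Alphabet in the order the commit description lists it.
const ALPHABET: &[u8; 40] = b"0123456789abcdefghijklmnopqrstuvwxyz_-~.";

/// Number of base40 digits needed for any u64 (40^13 > 2^64 > 40^12).
const BASE40_LEN_64: usize = 13;

/// Encode `value` as exactly 13 base40 characters.
/// Assumption for this sketch: most significant digit first,
/// left-padded with the zero digit `'0'`.
fn encode_base40(mut value: u64) -> String {
    let mut out = [b'0'; BASE40_LEN_64];
    // Fill from the last position backwards with the low digits.
    for slot in out.iter_mut().rev() {
        *slot = ALPHABET[(value % 40) as usize];
        value /= 40;
    }
    String::from_utf8(out.to_vec()).expect("alphabet is ASCII")
}
```

Every output character comes from the 40-symbol alphabet, which is why the result is URL-safe and case-insensitive-filesystem-safe but, because of `-`, `~`, and `.`, not usable inside a JavaScript identifier.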
- 0.7ETVturbopack: gate ValueDebugFormat and ValueDebug behind debug_assertions (#92628) ### What? Gate `ValueDebugFormat` and `ValueDebug` behind `#[cfg(debug_assertions)]` across the turbopack crates, eliminating all debug formatting machinery from release binaries entirely. ### Why? `ValueDebugFormat` and `ValueDebug` contribute to release binary bloat. The field-level formatting logic (iterating struct fields, resolving `Vc`s, formatting collections recursively) and the per-type `ValueDebug` trait registrations are purely debugging aids with no value in production binaries. **Measured impact: -7.3 MiB (-5.5%) reduction in release `libnext_napi_bindings.so` binary size** (from 126.26 MiB to 119.31 MiB). ### How? **`ValueDebug` trait** (`debug/mod.rs`): - In debug builds: full `#[turbo_tasks::value_trait(no_debug)]` with `dbg()` / `dbg_depth()` methods, `ValueDebugFormatString`, all blanket impls for collections/tuples/etc. - In release builds: empty marker trait with blanket `impl<T: ?Sized> ValueDebug for T {}`. This satisfies the supertrait bound on all value traits at zero cost — no per-type impl code is generated. **`ValueDebugFormat` trait** (`debug/mod.rs`): - The `value_debug_format` method only exists under `#[cfg(debug_assertions)]`. In release builds, the trait is still present (for derive macros to reference) but has no methods. - All blanket impls (`String`, `RcStr`, `Option`, `Vec`, `SmallVec`, `AutoSet`, `AutoMap`, `HashMap`, `FxIndexSet`, `FxIndexMap`, tuples) are gated behind `debug_assertions`. - Supporting infrastructure (`ValueDebugFormatString`, `PassthroughDebug`, `vdbg`, `internal` submodule, `value_debug_format_field`) is compiled away. **Proc-macros**: - `#[derive(ValueDebugFormat)]`: emits a full impl with `value_debug_format` method in debug builds, empty impl in release builds. 
- `#[derive(ValueDebug)]` and `value_impl` blocks: emit full debug impl in debug builds only — **no release impl at all** (the blanket marker trait impl covers it). - `#[turbo_tasks::value]`: transparent types get `#[cfg(debug_assertions)]` on the manual `ValueDebug` impl. Non-transparent types use `#[cfg_attr(debug_assertions, derive(turbo_tasks::debug::internal::ValueDebug))]` so the `internal` module is never referenced in release. - `#[turbo_tasks::value_trait]`: the `Dynamic/Upcast/UpcastStrict` impls for `Box<dyn ValueDebug>` are gated behind `#[cfg(debug_assertions)]`. **Callers** (`vc/mod.rs`, `vc/resolved.rs`, `read_ref.rs`, `mapped_read_ref.rs`, `macro_helpers.rs`, `alias_map.rs`): - All `impl ValueDebugFormat` blocks and their imports are gated behind `#[cfg(debug_assertions)]`. ### Verification - `cargo check --release` — clean (no errors, no warnings) - `cargo clippy --all-targets` — clean - CI passing <!-- NEXT_JS_LLM_PR --> --------- Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com>github.com-vercel-next.js · b56f1555 · 2026-04-15
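The gating pattern described in #92628 can be sketched in isolation. The example below is an illustration, not code from the turbopack crates: `DebugDump`, `Shape`, `Rect`, and `describe` are hypothetical names. It shows a trait that has a real method in debug builds and becomes an empty marker trait with a blanket impl in release builds, so a supertrait bound stays satisfied without any per-type impl being generated.

```rust
/// Debug builds: a real trait with a formatting method.
#[cfg(debug_assertions)]
pub trait DebugDump {
    fn dump(&self) -> String;
}

/// Release builds: an empty marker trait...
#[cfg(not(debug_assertions))]
pub trait DebugDump {}

/// ...with a blanket impl, so every type satisfies the bound at zero cost.
#[cfg(not(debug_assertions))]
impl<T: ?Sized> DebugDump for T {}

pub struct Rect {
    pub w: i32,
    pub h: i32,
}

/// The per-type impl only exists in debug builds.
#[cfg(debug_assertions)]
impl DebugDump for Rect {
    fn dump(&self) -> String {
        format!("Rect {{ w: {}, h: {} }}", self.w, self.h)
    }
}

/// The supertrait bound compiles in both profiles.
pub trait Shape: DebugDump {
    fn area(&self) -> i32;
}

impl Shape for Rect {
    fn area(&self) -> i32 {
        self.w * self.h
    }
}

/// Callers of the debug method must be gated as well.
#[cfg(debug_assertions)]
pub fn describe(r: &Rect) -> String {
    r.dump()
}

#[cfg(not(debug_assertions))]
pub fn describe(_r: &Rect) -> String {
    String::from("<debug info stripped>")
}
```

Building this with and without `--release` shows the trade-off: the release build carries no formatting code for `Rect`, at the price of every call site needing its own `cfg` gate.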
- 0.7 ETV · turbopack: reschedule stale tasks with correct invalidation priority (#92897)

  ### What?

  When an in-progress task is invalidated during execution, it transitions to a "stale" state. Previously, on completion it was directly re-executed in the same worker slot, inheriting the original schedule priority rather than the priority from the invalidation that made it stale.

  ### Why?

  A stale task that was invalidated at low priority was being re-executed at whatever high priority the original schedule had. This caused high-priority work to be unfairly blocked or deprioritized in the scheduler.

  ### How?

  **`backend.rs` trait:** `task_execution_completed` return type changed from `bool` (reschedule yes/no) to `Option<TaskPriority>`: `None` means done, `Some(priority)` means the task was stale and must be re-executed at this priority.

  **`backend/mod.rs`:** The three helper functions (`_prepare`, `_connect`, `_finish`) and the main `task_execution_completed` all propagate the invalidation priority on stale returns. In each stale path, the priority is read from `task.is_dirty().unwrap_or(TaskPriority::leaf())` before the task state is mutated.

  **`manager.rs`:** The executor no longer loops to directly re-execute stale tasks. Instead, if `task_execution_completed` returns `Some(stale_priority)`, the task is unconditionally re-scheduled through the priority runner at that priority, so all tasks execute in the correct priority order.

  github.com-vercel-next.js · 191fd742 · 2026-05-08
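The scheduling change in #92897 can be illustrated with a toy priority queue. This is a sketch under simplifying assumptions, not the turbo-tasks scheduler: `Priority`, `Task`, and `run` are hypothetical, a smaller number is assumed to mean "more urgent", and an invalidation during execution is modelled by a pre-set field. Only the `Option<priority>` return shape of `task_execution_completed` follows the commit message.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical priority type: a smaller number means more urgent.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Priority(pub u8);

pub struct Task {
    /// Set when the task is invalidated while it is executing (simplified).
    pub invalidated_at: Option<Priority>,
}

/// `None` means done; `Some(p)` means the task went stale and must rerun at `p`.
pub fn task_execution_completed(task: &mut Task) -> Option<Priority> {
    task.invalidated_at.take()
}

/// Runs tasks in priority order and returns the execution log as (task id, priority).
pub fn run(tasks: &mut [Task], initial: &[(Priority, usize)]) -> Vec<(usize, Priority)> {
    // `Reverse` turns the max-heap into a min-heap on (priority, id).
    let mut queue: BinaryHeap<Reverse<(Priority, usize)>> =
        initial.iter().copied().map(Reverse).collect();
    let mut log = Vec::new();
    while let Some(Reverse((priority, id))) = queue.pop() {
        log.push((id, priority));
        // A stale task goes back through the queue at the invalidation
        // priority instead of being re-executed in place.
        if let Some(stale_priority) = task_execution_completed(&mut tasks[id]) {
            queue.push(Reverse((stale_priority, id)));
        }
    }
    log
}
```

With task 0 scheduled at `Priority(0)` but invalidated at `Priority(9)`, and task 1 scheduled at `Priority(5)`, task 1 runs before the re-execution of task 0. An in-place rerun loop would have let task 0 jump ahead of it.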
- 0.7ETVfeat(trace-server): add query CLI and MCP API to turbopack-trace-server (#92030) ### What? Adds a [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) HTTP server and a companion CLI to the Turbopack trace server (`next internal trace`), so AI agents and developers can explore trace data from a build or dev session incrementally through a structured tool API. Two new user-facing features: 1. **`--mcp-port` flag on `next internal trace`** — starts an MCP-over-HTTP endpoint (at `/mcp`) alongside the existing WebSocket trace viewer. 2. **`next internal query-trace`** — a standalone CLI command that queries a running trace server from the terminal, using the same MCP endpoint under the hood. ### Why? Trace files from Turbopack builds can be very large. An agent querying raw trace data would be overwhelmed by tens of thousands of spans. The MCP API exposes the data incrementally: - **Pagination** (20 spans per page) prevents context overflow. - **Aggregation** groups repeated spans by name (same logic as the WebSocket viewer's graph), so agents see one entry per span type with count/total/avg statistics rather than thousands of identical entries. - **Drill-down** lets agents pick a span ID from one response and fetch its children in the next call. - **Search** filters spans by name/args using `SpanRef::search` (full subtree search index, same as the WebSocket viewer). - **Sort** orders by corrected (wall-clock) duration to surface the slowest spans first. - **Output format** — `outputType: "json"` for machine-readable structured data, `"markdown"` (default) for human-readable rendering. ### How? #### Rust (`turbopack/crates/turbopack-trace-server/src/lib.rs`) - `start_turbopack_trace_server_non_blocking(path, port) -> Arc<StoreContainer>` — starts the reader and WebSocket server on background threads and returns the store handle immediately (previously the function blocked forever). 
- `QueryOptions` — struct with `parent`, `aggregated`, `sort`, `search`, `page` fields. - `query_spans(store, opts) -> QueryResult` — queries up to 20 spans per page. In aggregated mode it uses the existing `SpanGraphRef` graph logic (the same grouping the WebSocket viewer uses). Waits up to 10s for the store to finish initial data loading on the first call. - Helper functions extracted: `paginate()`, `format_span_name()`, `build_span_id()` to avoid duplication between aggregated and raw code paths. #### Span ID format IDs encode both the span type and the navigation path: | Span kind | Leaf format | Example full path | |---|---|---| | Raw span | `<index>` (decimal) | `a1-a5-20` | | Aggregated span | `a<first-span-index>` | `a1-a5-a34` | Segments are joined by `-` as the caller drills deeper. `resolve_span_by_id` only needs the last segment to look up the underlying store index. #### NAPI bridge (`crates/next-napi-bindings/src/turbo_trace_server.rs`) - `TraceServerHandle` — opaque NAPI class wrapping `Arc<StoreContainer>`. - `startTurbopackTraceServerHandle(path, port)` — calls the non-blocking Rust function. - `queryTraceSpans(handle, options)` — calls `query_spans` and returns a plain JS object. - Doc comments on `SpanInfo` fields clarify that `cpu_duration`/`corrected_duration` hold the first example span's values for aggregated groups (while `total_*` fields hold group totals). #### TypeScript (`packages/next/src/...`) - `generated-native.d.ts` / `types.ts` / `index.ts` — type declarations, native wrappers, and WASM stubs. - `next.ts` — adds `--mcp-port <port>` to `next internal trace` (defaults to `5748`); registers `next internal query-trace` subcommand with `--json` flag. - `turbo-trace-server.ts` — rewrites the CLI handler to: 1. Start the WebSocket trace viewer server non-blocking (was blocking). 2. Start an HTTP server at `/mcp` running MCP `StreamableHTTPServerTransport` (stateless — one transport per request, `sessionIdGenerator: undefined`). 3. 
`renderSpanMarkdown()` — extracted helper that renders a single span as markdown. 4. Error handling: `server.on('error')` for `EADDRINUSE`, `try/catch` around `loadBindings()` and `startTurbopackTraceServerHandle()`. - `query-trace.ts` — new CLI command that POSTs JSON-RPC to the MCP endpoint and prints the response. On connection failure, shows instructions to start `next internal trace` first. Supports `--json` flag for structured output. #### `next internal query-trace` CLI ``` Usage: next internal query-trace [options] Options: --port <port> MCP port of the running trace server. Defaults to 5748. --parent <parent> Span ID to enumerate children of. Omit for root level. --no-aggregated Disable aggregation of spans by name (aggregated by default). --sort Sort results by corrected duration descending (default: false). --search <search> Substring filter on span name/category. --json Output as JSON instead of markdown. --page <page> Page number (1-based, default 1). -h, --help Displays this message. ``` #### Startup output When `next internal trace` starts with `--mcp-port`: ``` Turbopack trace server started. View trace at https://trace.nextjs.org?port=5747 Query this trace from the command line: next internal query-trace --help Alternatively, connect an MCP client to http://127.0.0.1:5748/mcp ``` #### Markdown output format (per span) ```markdown ### `<name>` (ID: `<id>`) - CPU Duration: … - Corrected Duration: … - Start (relative to parent): … - End (relative to parent): … **Attributes:** - `key`: value ``` For aggregated spans with count > 1, totals and averages are shown first, then one example span's raw data. ### Tests **E2E test (`test/e2e/turbopack-trace-server-query/turbopack-trace-server.test.ts`)** Uses `nextTestSetup` with `env: { NEXT_TURBOPACK_TRACING: '1' }` to produce a real trace file. 
Spawns `next internal trace <file> --mcp-port <port>`, waits for the MCP server to be ready, then runs both MCP HTTP and CLI queries: - Root listing (markdown format) - Aggregation mode - Pagination - Drill-down by span ID (using JSON output to extract IDs) - Search with a real span name match and a non-matching term - Sort by duration - JSON output format (`outputType: 'json'`) - CLI: `--sort`, `--search`, `--no-aggregated`, `--parent`, `--json` - Error path: connection failure message with instructions Turbopack-only (skips via `if (!isTurbopack) return` at describe level). The error-path test runs regardless of bundler. ### Usage ```bash # Build with tracing NEXT_TURBOPACK_TRACING=1 next build # Start the trace viewer + MCP server next internal trace .next-profiles/trace-turbopack --mcp-port 5748 # Query from CLI next internal query-trace --sort next internal query-trace --parent a1 --sort next internal query-trace --search "turbo_tasks::function" --page 2 next internal query-trace --no-aggregated next internal query-trace --json # Or connect an MCP client to http://127.0.0.1:5748/mcp ``` <!-- NEXT_JS_LLM_PR --> --------- Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com>github.com-vercel-next.js · 5ff5ae21 · 2026-04-14
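Two mechanics from #92030 are easy to sketch: 20-per-page pagination and reading only the last segment of a span ID. The code below is an illustration under assumptions. The commit message names a `paginate()` helper and `resolve_span_by_id`, but their signatures are not given, so the signatures here, the `SpanSegment` enum, and `last_segment` are hypothetical.

```rust
/// Page size stated in the commit message.
pub const PAGE_SIZE: usize = 20;

/// Returns the slice for a 1-based page number; out-of-range pages are empty.
/// Assumption: page 0 is treated like page 1.
pub fn paginate<T>(items: &[T], page: usize) -> &[T] {
    let start = page
        .saturating_sub(1)
        .saturating_mul(PAGE_SIZE)
        .min(items.len());
    let end = start.saturating_add(PAGE_SIZE).min(items.len());
    &items[start..end]
}

/// Hypothetical decoded form of the last ID segment.
#[derive(Debug, PartialEq, Eq)]
pub enum SpanSegment {
    /// `<index>` in decimal, e.g. the `20` in `a1-a5-20`.
    Raw(usize),
    /// `a<first-span-index>`, e.g. the `a34` in `a1-a5-a34`.
    Aggregated(usize),
}

/// Parses only the final `-`-separated segment, which is all a lookup needs.
pub fn last_segment(id: &str) -> Option<SpanSegment> {
    let leaf = id.rsplit('-').next()?;
    match leaf.strip_prefix('a') {
        Some(rest) => rest.parse::<usize>().ok().map(SpanSegment::Aggregated),
        None => leaf.parse::<usize>().ok().map(SpanSegment::Raw),
    }
}
```

Keeping the full path in the ID while resolving from the leaf alone means a client can show breadcrumbs without the server having to walk the path.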
- 0.7ETVTurbopack: lazy aggregation optimize via persistent pending flag (#93454) ### What? Make the aggregation `optimize_queue` in the Turbopack persistent backend bounded and lazy. Cap the in-memory queue size, persist an `optimization_pending` flag per task, drain the queue per `process()` call with a per-queue lifetime budget, and recover dropped optimizations opportunistically via the flag instead of via an unbounded scheduler-side queue. ### Why? The previous implementation pushed an `OptimizeJob` for every `push_optimize_task` call into a single in-memory queue with no upper bound. On large workloads (or pathological aggregation churn), this queue could grow very large and cause the thread that scheduled the optimization to do unbounded work, regressing latency for the operations that triggered the schedule. The goals of this change: - Bound the worst-case work any single `process()` call does for optimizations (per-queue budget). - Bound the in-memory queue size so memory use is predictable. - Avoid losing optimizations: anything we drop must be eventually recovered. - Keep the common fast path cheap — no extra `Meta`-category guard acquisitions when the optimization flows through normally. ### How? Persist a new `optimization_pending` flag on `TaskStorage` (`storage_schema.rs`) and use it to drive lazy recovery in `aggregation_update.rs`: - `push_optimize_task` only enqueues an in-memory `OptimizeJob` if the queue is under `MAX_OPTIMIZE_QUEUE_SIZE` (10000) and the per-queue lifetime budget `MAX_OPTIMIZATIONS_PER_QUEUE` (1000) hasn't been exhausted. If we can't enqueue, we set `optimization_pending = true` on the task so a future operation that visits this task will re-discover and re-enqueue the optimization. - Every `AggregationUpdateJob` handler calls `check_optimization_pending` on the primary task(s) it touches, which re-enqueues the optimize job if the flag is set (and the queue/budget allow). 
- `process()` drains the `optimize_queue` one job at a time (preserving the original "root first" ordering), counting against the per-queue budget. Once the budget is exhausted, further `OptimizeJob`s in the queue are dropped and the flag is left set on those tasks (so they recover later). - `optimize_task` clears `optimization_pending` at entry so the recovery loop eventually settles. - The flag is **only** written on the drop path — the common case (enqueue → process normally) does not touch `optimization_pending`, so no `Meta`-category guard contention is added on the hot path. - `OptimizeJob` carries a best-effort `flag_already_set` snapshot so that when a job is dropped at process time and the snapshot says the flag was already set, we skip the redundant write entirely. Most jobs originating from `check_optimization_pending` (the recovery path) and `optimize_task`'s self-re-enqueue carry this hint. - `try_enqueue_optimize_job` is `#[must_use]` so the contract \"if this returns false, set the flag\" is enforced at the type level. `lock_and_mark_optimization_pending` is shared between `push_optimize_task_by_id` and the budget-exhausted drop branch. - `optimizations_executed` is intentionally persisted with the queue so that suspending and resuming the queue cannot reset the per-queue budget. <!-- NEXT_JS_LLM_PR --> --------- Co-authored-by: v-work-app[bot] <262237222+v-work-app[bot]@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>github.com-vercel-next.js · b7cc9969 · 2026-05-06
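The bounded, lazy queue from #93454 can be modelled in a few lines. This sketch is a simplification under stated assumptions: tasks are plain `u32` ids, the persisted `optimization_pending` flag is modelled as a `HashSet`, and the limits are constructor arguments rather than the `MAX_OPTIMIZE_QUEUE_SIZE` / `MAX_OPTIMIZATIONS_PER_QUEUE` constants. Locking, the `flag_already_set` hint, and the actual optimization are omitted.

```rust
use std::collections::{HashSet, VecDeque};

pub struct OptimizeQueue {
    queue: VecDeque<u32>,
    /// Lifetime count of executed optimizations for this queue.
    executed: usize,
    max_queue: usize,
    max_per_queue: usize,
}

impl OptimizeQueue {
    pub fn new(max_queue: usize, max_per_queue: usize) -> Self {
        Self { queue: VecDeque::new(), executed: 0, max_queue, max_per_queue }
    }

    /// Contract: if this returns `false`, the caller must set the pending flag.
    #[must_use]
    pub fn try_enqueue(&mut self, task: u32) -> bool {
        if self.queue.len() < self.max_queue && self.executed < self.max_per_queue {
            self.queue.push_back(task);
            true
        } else {
            false
        }
    }

    /// The flag is written only on the drop path, never on the fast path.
    pub fn push_optimize_task(&mut self, task: u32, pending: &mut HashSet<u32>) {
        if !self.try_enqueue(task) {
            pending.insert(task);
        }
    }

    /// Drains the queue; jobs beyond the budget are dropped with the flag set.
    pub fn process(&mut self, pending: &mut HashSet<u32>) -> Vec<u32> {
        let mut optimized = Vec::new();
        while let Some(task) = self.queue.pop_front() {
            if self.executed >= self.max_per_queue {
                pending.insert(task); // recover later via the flag
                continue;
            }
            self.executed += 1;
            pending.remove(&task); // the flag is cleared when optimization starts
            optimized.push(task);
        }
        optimized
    }
}
```

A later operation that visits a flagged task would call `push_optimize_task` again on a fresh queue, which is how dropped work is eventually recovered.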
- 0.6ETVRefactor effects system: dedup/conflict detection, simplified Effect trait (#92300) ### What? Refactors the turbo-tasks effects system to support duplicate detection, conflict detection, and state tracking across effect applications. Also adds a ptr_eq fast path to `ReadRef::PartialEq`. ### Why? The existing effects system had several problems: - **No duplicate detection**: The same effect (e.g., writing the same content to the same path) would be re-applied every time, wasting I/O - **No conflict detection**: Two effects writing different content to the same path could cause an invalidation loop where the two writes invalidate each other, keeping a CPU busy-looping. On production builds this would crash with `Dependency tracking is disabled so invalidation is not allowed` - **Unnecessarily generic**: The old `DynEffect`→`DynEffectApplyFuture` chain wrapped errors in `Arc<dyn EffectError>`, plus a `DynEffectApplyFuture` chain, all to work around the fact that `anyhow::Error` doesn't implement `std::error::Error` - **Complex dedup state machine**: `EffectInstance` used a `Mutex<EffectState>` with 4 states (`NotStarted`/`Started`/`Finished`/`Invalid`), `Event`/`EventListener`, and `spawn()` to ensure idempotent application — all of which is now unnecessary since dedup is handled at the `Effects::apply` level - **Task-local context**: `ApplyEffectsContext` was a task-local typed map used only for a directory-creation cache, requiring `spawn()` + scope propagation - **Slow `ReadRef` equality**: `ReadRef::PartialEq` always performed deep content comparison even when both refs pointed to the same allocation ### How? 
#### Changed files | File | Change | |------|--------| | `turbo-tasks/src/effect.rs` | Rewrote `Effect` trait, `DynEffect`, `EffectStateStorage`, `Effects::apply()` | | `turbo-tasks/src/lib.rs` | Updated re-exports (`EffectError`, `EffectStateStorage`) | | `turbo-tasks/src/read_ref.rs` | Added `ptr_eq` fast path to `ReadRef::PartialEq` | | `turbo-tasks-fs/src/lib.rs` | Updated `WriteEffect`/`WriteLinkEffect` impls, restored `AnyhowWrapper`, removed `register_write_invalidator` | | `turbo-tasks-fs/src/invalidator_map.rs` | Simplified — removed `WriteContent` enum, invalidators now use `()` values | | `turbo-tasks-fs/src/invalidation.rs` | Deleted — `Write`/`WriteKind` invalidation reason types no longer needed | #### New `Effect` trait (in `turbo-tasks/src/effect.rs`) ```rust pub trait Effect: TraceRawVcs + NonLocalValue + Send + Sync + 'static { type Error: EffectError; type Value: Clone + DynPartialEq + Eq + Send + Sync + 'static; fn key(&self) -> Vec<u8>; fn value(&self) -> &Self::Value; fn state_storage(&self) -> &EffectStateStorage; fn apply(&self) -> impl Future<Output = Result<(), Self::Error>> + Send; } ``` The `Error` associated type uses `dyn std::error::Error` (via `EffectError`) rather than `anyhow::Error` to encourage structured error types that can be downcast into `Issue`s — particularly for filesystem errors from `turbo-tasks-fs`. The current fs implementations use `AnyhowWrapper` as a bridge. 
#### `EffectStateStorage` New struct stored on `DiskFileSystemInner`, using a two-level locking scheme for performance: - `DashMap<Vec<u8>, Arc<EffectStateEntry>>` tracks per-key state - Each `EffectStateEntry` has a sync `parking_lot::Mutex<Option<Box<dyn Any>>>` for the last applied value (fast dedup reads) and a `tokio::sync::Mutex<()>` write lock (serializes concurrent writes to the same key) - The sync fast path avoids `.await` entirely when the stored value matches, critical for high-iteration scenarios (e.g., the `writeToDisk` test that calls `Effects::apply` 10,000+ times per route) #### `Effects::apply()` flow 1. Group effects by `key()`, cache the `(index, Arc<EffectStateEntry>)` pairs in a `OnceLock` — computed once on first call, reused on subsequent calls with **no DashMap lookup** in the hot path 2. Detect duplicates (same key + same value → keep one) and conflicts (same key + different value → error) 3. For each unique effect: - **Sync fast path**: check `last_applied` via `parking_lot::Mutex` — return immediately if value unchanged - **Slow path**: acquire `tokio::sync::Mutex` write lock, re-check, clear `last_applied` to `None` (prevents stale fast-path matches during write), apply the effect, then store the new value #### Equality dispatch `eq_value_dyn` now calls `turbo_dyn_eq_hash::DynPartialEq::dyn_partial_eq` instead of a manual downcast, using the existing `turbo-dyn-eq-hash` crate machinery. #### `ReadRef::PartialEq` ptr_eq fast path ```rust fn eq(&self, other: &Self) -> bool { Self::ptr_eq(self, other) || Self::as_raw_ref(self).eq(Self::as_raw_ref(other)) } ``` Short-circuits deep comparison when both `ReadRef`s share the same `Arc` allocation — the common case in the effects dedup path where the stored and new `ReadRef` originate from the same turbo-tasks cell. #### `Effect::value()` returns `&Self::Value` Avoids cloning `ReadRef` (Arc increment) on every `eq_value_dyn` comparison. The reference goes straight to the stored field. 
Only `value_dyn()` (called when actually storing a new value after a write) clones. #### Simplified internals - `EffectInstance` reduced from `Mutex<EffectState>` state machine to plain `Box<dyn DynEffect>` - `DynEffect` trait gains `key()`, `eq_value_dyn()`, `value_dyn()`, `state_storage()` for object-safe dispatch; `dyn_apply()` converts `Self::Error` to `anyhow::Error` via the `Into` blanket impl - `Effects::apply()` now performs grouping, dedup, conflict detection, and per-key state-aware application #### Removed - `EffectState` enum (4 variants) + associated event machinery - `ApplyEffectsContext` + `APPLY_EFFECTS_CONTEXT` task-local - `DiskFileSystemApplyContext` struct - `register_write_invalidator` (complex write-invalidation tracking) - `WriteContent` enum from `InvalidatorMap` - `Write`/`WriteKind` invalidation reason types (deleted `invalidation.rs`) #### Kept from original design - `EffectError` trait + `AnyhowWrapper` in `turbo-tasks-fs` — preserved for future structured filesystem error types (to produce typed `Issue`s from write failures) - `invalidate_from_write` — simplified to just remove+fire path invalidators for read-task invalidation after writes - Directory creation cache — moved from task-local `ApplyEffectsContext` to `DashMap<PathBuf, ()>` on `DiskFileSystemInner` #### Test changes - `test_symlink_stress`: Updated to deduplicate random updates per symlink index before writing (the test previously relied on silent conflict resolution) - `is_sync_and_send` test: Relaxed to `is_send` since the new `Effects::apply()` future holds a `Pin<Box<dyn Future + Send>>` across await points (not `Sync`, but `Send` is sufficient) ### Testing - [x] `cargo check --workspace` — zero errors, zero warnings - [x] `cargo clippy -p turbo-tasks --all-targets -- -D warnings -A deprecated` — clean - [x] `ast-grep scan` — clean (all `Err(anyhow!())` replaced with `bail!()`) - [x] `cargo test -p turbo-tasks --lib` — 50 passed - [x] `cargo test -p turbo-tasks-fs 
--lib` — 110 passed - [x] `cargo test -p turbo-tasks-backend --lib` — 35 passed - [x] `cargo test -p turbopack-tests` — 89 passed <!-- NEXT_JS_LLM_PR --> --------- Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com>github.com-vercel-next.js · a98213cb · 2026-04-10
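The dedup and conflict rules from the `Effects::apply()` flow in #92300 can be shown with a synchronous toy model. This is a sketch, not the turbo-tasks implementation: `WriteEffect`, `dedup_effects`, and `apply` are hypothetical, values are plain `String`s, and a `HashMap` stands in for `EffectStateStorage` (the real code uses a `DashMap` plus two mutexes per key and applies effects asynchronously).

```rust
use std::collections::HashMap;

/// Hypothetical effect: write `value` to the target identified by `key`.
pub struct WriteEffect {
    pub key: Vec<u8>,
    pub value: String,
}

/// Same key + same value: keep one. Same key + different value: error.
pub fn dedup_effects(effects: Vec<WriteEffect>) -> Result<Vec<WriteEffect>, String> {
    let mut index_by_key: HashMap<Vec<u8>, usize> = HashMap::new();
    let mut unique: Vec<WriteEffect> = Vec::new();
    for effect in effects {
        match index_by_key.get(&effect.key).copied() {
            Some(idx) if unique[idx].value == effect.value => {} // duplicate, drop it
            Some(_) => {
                return Err(format!(
                    "conflicting effects for key {:?}",
                    String::from_utf8_lossy(&effect.key)
                ));
            }
            None => {
                index_by_key.insert(effect.key.clone(), unique.len());
                unique.push(effect);
            }
        }
    }
    Ok(unique)
}

/// Applies effects, skipping keys whose last applied value is unchanged.
/// Returns how many writes were actually performed.
pub fn apply(
    effects: Vec<WriteEffect>,
    last_applied: &mut HashMap<Vec<u8>, String>,
) -> Result<usize, String> {
    let mut writes = 0;
    for effect in dedup_effects(effects)? {
        if last_applied.get(&effect.key) == Some(&effect.value) {
            continue; // fast path: nothing changed since the last apply
        }
        // Real I/O would happen here.
        last_applied.insert(effect.key, effect.value);
        writes += 1;
    }
    Ok(writes)
}
```

Reporting a conflict as an error, rather than letting the last write win, is what turns the invalidation loop described in the commit message into a diagnosable failure.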
- 0.6ETVTurbopack: Add importModule() support to webpack loaders (#89630) ## What? Adds support for `this.importModule()` in Turbopack's webpack loader compatibility layer. This API allows webpack loaders to dynamically import and execute modules (CJS, ESM, TypeScript, JSON, WebAssembly, etc.) during the build process, matching webpack's native `this.importModule()` behavior. ## Why? Webpack loaders like `@vanilla-extract/webpack-plugin`, `val-loader`, and custom loaders use `this.importModule()` to: - Load configuration files that are themselves modules with dependencies - Execute code at build time to generate derived source - Process dependency chains (e.g., a TS config that imports a CJS module that requires JSON) Without this, loaders relying on `importModule()` cannot work with Turbopack. ## How? ### Architecture The implementation follows a request/response pattern between the Node.js loader runner and the Rust-side Turbopack compiler: 1. **Loader calls `this.importModule(request)`** in the Node.js loader runner 2. **IPC message sent to Rust** with the module request and lookup path 3. **Rust resolves the module** using Turbopack's full resolver (with aliases, loader rules, etc.) 4. **Rust builds a Node.js bundle** containing the module and all its dependencies using Turbopack's code generation pipeline 5. **Bundle chunks sent back to Node.js** via IPC 6. **Node.js evaluates the bundle in-memory** using `vm.compileFunction` with a minimal CJS module system This approach leverages Turbopack's full module graph, so imported modules benefit from the same resolution (aliases, custom loaders, TypeScript support, etc.) as regular imports. 
### Key Changes **Rust side (`turbopack-node/src/transforms/webpack.rs`):** - New `ImportModule` IPC response message type - Resolves modules through Turbopack's `esm_resolve` with full `AssetContext` - Generates a complete Node.js bundle (runtime + chunks + entry) for the imported module - Tracks file dependencies for proper cache invalidation **`SourceTransform` trait (`turbopack-core/src/source_transform.rs`):** - Added `asset_context` parameter to `SourceTransform::transform()` so transforms can access the full asset context for module resolution and bundling - Updated all implementors: `JsonSourceTransform`, `TextSourceTransform`, `BytesSourceTransform`, `MdxTransform`, `PostCssTransform`, `WebpackLoaderItems` **Node.js runtime (`turbopack-node/js/src/transforms/`):** - `webpack-loaders.ts`: Implements `this.importModule()` on the loader context, sends IPC request and evaluates the returned bundle - `webpack-loaders-runtime.ts`: In-memory CJS module evaluator that loads bundle chunks using `vm.compileFunction`, with support for: - Relative and absolute path resolution within the bundle - External package delegation to real Node.js `require()` - Patched `fs.createReadStream` for in-memory WebAssembly binary assets **New `ImportModule` reference subtype (`turbopack-core/src/reference_type.rs`):** - Added `EcmaScriptModulesReferenceSubType::ImportModule` to mark references created by importModule for proper handling in the module graph ### Test Coverage Comprehensive e2e test (`test/e2e/app-dir/webpack-loader-import-module/`) covering: - **TypeScript modules** with CJS and ESM dependencies - **CJS dependency chains** (TS → CJS → JSON) - **ESM `.mjs` modules** with shared dependencies - **Resolve aliases** (`alias-data` → `alias-data.mjs` via `resolveAlias` config) - **Custom loader rules** (`.custom-data` files processed by `text-to-export-loader`) - **Transitive dependencies** through aliases and custom loaders - **Turbopack-only features**: `new URL()` asset 
references, WebAssembly imports (`add.wasm`), and dynamic `import()` within importModule targets All tests pass for both webpack and Turbopack modes. --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>github.com-vercel-next.js · bdb2f2ce · 2026-03-19
- 0.6ETVTurbopack: Fix unsound IntoIterator for ReadRef<T> (#94122) ### What? Replaces the unsound by-value `IntoIterator` impl for `ReadRef<T>` in `turbo-tasks` with a sound clone-free variant, and adapts the callers across `turbopack-*` and `next-api`. ### Why? The previous impl used `transmute_copy` to fabricate `&'static`-typed items so it could expose them through the standard `Iterator` trait. Those references were only really valid as long as the `ReadRef` inside the iterator stayed alive — but `Iterator::Item` is a fixed associated type, so once the items were stashed in futures, `Vec`s, or `serde_json` map keys, the lifetime was completely unenforced. This produced a latent use-after-free whenever something else (turbo-tasks cell eviction, an intermediate `Drop`, etc.) released the underlying storage between the iteration site and the next dereference. The observed symptom was a panic in `RcStr::as_str` during JSON serialization of `AssetHashesManifestAsset`'s manifest: ``` thread 'tokio-rt-worker' panicked at turbopack/crates/turbo-rcstr/src/lib.rs:132:52: range end index 13 out of range for slice of length 7 ``` The byte read at the inline-length position was junk left over from freed/reused memory — `len = 13` is unreachable for any legitimately-constructed inline `RcStr` (max inline length is 7 on 64-bit). The bug site was `crates/next-api/src/project_asset_hashes_manifest.rs`, which consumed an `OutputAssetsWithPaths` `ReadRef`, kept `&RcStr` references in `asset_paths` past the `try_join` that dropped the iterator, then serialized them. ### How? 
**`turbopack/crates/turbo-tasks/src/read_ref.rs`** — new by-value impl: ```rust pub struct ReadRefIter<T, I, J> where T: VcValueType, I: Copy + 'static, J: Iterator<Item = &'static I>, { iter: J, _read_ref: ReadRef<T>, } impl<T, I, J> Iterator for ReadRefIter<T, I, J> /* … */ { type Item = I; fn next(&mut self) -> Option<I> { self.iter.next().copied() } } impl<T, I, J> IntoIterator for ReadRef<T> where T: VcValueType, I: Copy + 'static, J: Iterator<Item = &'static I> + 'static, &'static VcReadTarget<T>: IntoIterator<Item = &'static I, IntoIter = J>, { type Item = I; type IntoIter = ReadRefIter<T, I, J>; fn into_iter(self) -> Self::IntoIter { let r: &VcReadTarget<T> = &self; // SAFETY: the fabricated `&'static` reference is only stored inside // `iter`, which lives inside the returned `ReadRefIter` alongside // the `ReadRef` that owns the data. `next()` only ever yields // `Copy`-ed-out values — no reference (with the fake `'static` // lifetime or otherwise) ever leaves the iterator. Struct-field drop // order (`iter` then `_read_ref`) drops the borrow before the // backing storage. let r = unsafe { std::mem::transmute::<&VcReadTarget<T>, &'static VcReadTarget<T>>(r) }; ReadRefIter { iter: r.into_iter(), _read_ref: self } } } ``` Key properties: - **No cloning.** Setup is one borrow + `transmute`; `next()` is `Option::copied()` (bitwise copy via the `Copy` bound), not `Clone::clone`. Nothing in the iterator clones the backing collection or its elements. - **Contained `unsafe`.** The fake `'static` reference never leaves `ReadRefIter`. `Iterator::next` yields `I` by value, so the lifetime never escapes into futures, `Vec`s, or other persistence outside the iterator. - **Drop order safe.** Struct fields drop in declaration order: `iter` (and any borrows it holds) drops before `_read_ref` (the backing `Arc`). - **`Copy` bound.** The impl is restricted to element types that are `Copy` — `ResolvedVc<_>`, integer ids, owned-tuple-of-`Copy`, etc. 
For non-`Copy` element types (`RcStr`, `FileSystemPath`, `PatternMatch`, `(String, _)`, `(ModuleId, ReadRef<_>)`, …) callers iterate by reference via the existing `IntoIterator for &'a ReadRef<T>` impl (`for x in &read_ref` or `read_ref.iter()`). The original buggy site in `project_asset_hashes_manifest.rs` now uses `output_assets.iter()` and keeps `&'a RcStr` references in the manifest struct. The borrow checker now enforces the lifetime that used to be faked via `transmute` — `output_assets` outlives the references because nothing consumes it, and there are no clones at the call site either. **Caller adjustments.** Touching the impl forced a sweep of all call sites that were implicitly leaning on the unsound shape (yielding `&'static`-typed items as a stand-in for owned items). The fixes fall into a small number of categories: - Drop redundant `.copied()` / `.cloned()` / `|&x| f(x)` patterns after `into_iter()` (items are owned `Copy` values now, no need to deref-and-copy). - Switch non-`Copy` element iteration to `&read_ref` / `read_ref.iter()` (e.g. `PatternMatches`, `CodeAndIds`, `UnresolvedUrlReferences`, `GraphEntries`, `Vec<RcStr>`). - Reshape `crates/next-api/src/paths.rs` helpers from `impl IntoIterator<Item = &ResolvedVc<_>>` to `impl IntoIterator<Item = ResolvedVc<_>>` — `ResolvedVc` is `Copy`, so by-value is the natural shape and it composes directly with the new by-value `ReadRef::into_iter`. Callers in `app.rs`, `pages.rs`, `middleware.rs`, `instrumentation.rs`, `font.rs` updated to match (either passing the `ReadRef`/`Vec` directly, or `.iter().copied()` for borrowed sources). - A few small follow-ups: `for (key, EndpointGroup { primary, .. }) in &entrypoint_groups` in `routes_hashes_manifest.rs` (with a borrowed `&'l str` key in the manifest); `compute_async_module_info_single(graph, result)` (no `*graph`, it's already `Copy`); `&(ty, batch)` → `(ty, batch)` destructures in `chunking/mod.rs`. ### Testing - `cac` clean across the workspace. 
- `ca clippy --all-targets` clean. - `ca test -p turbo-tasks-backend`: all unit + integration tests pass. - `ca test -p turbopack-tests --tests`: execution snapshot suite (218 passed, 0 failed, 1 ignored) and snapshot suite (87 passed, 0 failed). github.com-vercel-next.js · 1b77dba6 · 2026-05-26
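The core idea of the fix in #94122 is an owning iterator that yields `Copy` values, so that no reference can outlive the storage. The sketch below shows that idea with a different, fully safe technique: it indexes into an `Arc<[T]>` instead of holding a transmuted `&'static` borrow as `ReadRefIter` does. `SharedIter` and `into_shared_iter` are hypothetical names, and `Arc<[T]>` stands in for `ReadRef<T>`.

```rust
use std::sync::Arc;

/// Owns the shared storage and yields elements by value.
pub struct SharedIter<T: Copy> {
    data: Arc<[T]>,
    pos: usize,
}

impl<T: Copy> Iterator for SharedIter<T> {
    /// Items are owned copies, so nothing borrowed escapes the iterator.
    type Item = T;

    fn next(&mut self) -> Option<T> {
        let item = self.data.get(self.pos).copied()?;
        self.pos += 1;
        Some(item)
    }
}

/// Consumes the handle; the storage stays alive for as long as the iterator does.
pub fn into_shared_iter<T: Copy>(data: Arc<[T]>) -> SharedIter<T> {
    SharedIter { data, pos: 0 }
}
```

The index-based version pays a bounds check per element and only works for slice-like storage, which is presumably why the commit keeps the inner collection's own iterator behind a contained `unsafe` block instead. For non-`Copy` elements, borrowing with `for x in &collection` remains the right shape in both designs.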