github.com-vercel-workflow
all · 14 devs · built 2026-06-13
Repository snapshot
Monthly reports
Highlights
- Hardened *event pagination* in the *core runtime* with robust deduplication and retry logic, as seen in [1ee63b87 · Pranay Prakash].
- Significantly improved the *trace viewer* user experience by adding a *loading skeleton* [0606949e · Mitul Shah], new *tooltip components* and *keyboard shortcut displays* [ab7e5ab7 · Mitul Shah], and enhanced search functionality [b33c5ef1 · Mitul Shah].
- Introduced an *experimental and write-only MVP for Workflow Attributes* across the *core API* and *storage layers*, enabling custom metadata attachment to workflow runs [1e6b1fde · Peter Wielander].
- Enhanced *CI/CD pipeline robustness* through new *e2e testing capabilities* for *event log race conditions* [625fab46 · Peter Wielander], improved backport attribution [250ff57b · Nathan Rajlich], and increased supply chain security by pinning GitHub Actions to commit SHAs [ee618178 · Karthik Kalyan].
- Enabled passing *WritableStream* instances from parent to child workflows, fixing double-framing issues and enhancing security for inter-workflow communication [49da6c50 · Nathan Rajlich].
- Improved *core workflow reliability* by excluding *inline step execution* from replay timeouts and introducing a *ReplayBudget* system [2a446af5 · Nathan Rajlich], and by introducing `WORLD_CONTRACT_ERROR` for fatal backend contract violations [1d3959ea · Pranay Prakash].
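The first highlight mentions deduplication and retry logic for event pagination. As a rough illustration of that general pattern only, here is a minimal sketch. All names (`WorkflowEvent`, `EventPage`, `FetchPage`, `withRetry`, `loadAllEvents`) are hypothetical and do not come from the repository, and the fetch is kept synchronous for brevity where a real runtime would almost certainly be asynchronous.

```typescript
// Hypothetical shapes; the runtime's real event and page types are not shown in this report.
interface WorkflowEvent {
  eventId: string;
  eventType: string;
}

interface EventPage {
  events: WorkflowEvent[];
  nextCursor: string | null; // null means there are no more pages
}

type FetchPage = (cursor: string | null) => EventPage;

// Call `fn` up to `maxAttempts` times, rethrowing the last error if every attempt fails.
function withRetry<T>(fn: () => T, maxAttempts: number): T {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Walk every page, retrying failed fetches and keeping the first copy of each eventId.
function loadAllEvents(fetchPage: FetchPage, maxAttempts = 3): WorkflowEvent[] {
  const seen = new Set<string>();
  const events: WorkflowEvent[] = [];
  let cursor: string | null = null;
  do {
    const current: string | null = cursor;
    const page: EventPage = withRetry(() => fetchPage(current), maxAttempts);
    for (const event of page.events) {
      if (!seen.has(event.eventId)) {
        seen.add(event.eventId);
        events.push(event);
      }
    }
    cursor = page.nextCursor;
  } while (cursor !== null);
  return events;
}

// Demo source: page 2 repeats an event from page 1, and its first fetch fails once.
let failedOnce = false;
const fakeFetch: FetchPage = (cursor) => {
  if (cursor === null) {
    return {
      events: [
        { eventId: "e1", eventType: "run_created" },
        { eventId: "e2", eventType: "step_created" },
      ],
      nextCursor: "p2",
    };
  }
  if (!failedOnce) {
    failedOnce = true;
    throw new Error("transient failure");
  }
  return {
    events: [
      { eventId: "e2", eventType: "step_created" },
      { eventId: "e3", eventType: "step_completed" },
    ],
    nextCursor: null,
  };
};

const loaded = loadAllEvents(fakeFetch);
```

In this sketch the overlapping event `e2` is kept once and the transient failure on the second page is absorbed by the retry wrapper. How the actual commit handles either case is not described in the report.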
Observations
- The project saw a 22% decrease in commit volume (134 commits) compared to the 2-month average of 172 commits.
- The *grow score* increased by 23% (14 this period versus a 2-month average of 11), indicating a shift towards new feature development.
- The *waste score* decreased by 25% (6 this period versus a 2-month average of 9), suggesting improved code quality and reduced rework.
- A significant portion of development activity focused on fixing and stabilizing existing components, with 16 identified bug fixes across various modules, including *swc-plugin-workflow* [5dabbeec · Peter Wielander], *world-local* dependency issues [3128dfce · Allen Zhou], *AbortSignal* cancellation paths in *core* [4b5f0176 · Pranay Prakash], and *Next.js workbench* caching [c597ad99 · Nathan Rajlich].
- Multiple commits were dedicated to improving the *trace viewer* UI/UX and search capabilities, indicating a continuous effort to enhance observability tools.
- There was a consistent focus on *CI/CD pipeline improvements*, including testing, security hardening, and backporting mechanisms, as evidenced by commits like [625fab46 · Peter Wielander], [ee618178 · Karthik Kalyan], and [7d728249 · Nathan Rajlich].
- Several maintenance tasks involved updating documentation, such as the *Cookbook* for *child workflow orchestration* [c58cae66 · Karthik Kalyan] and *AI SDK integration* [ff167d00 · Karthik Kalyan], and documenting *experimental attributes* [d7f7c697 · Peter Wielander].
- The *core* module received substantial attention for reliability, error handling, and performance optimizations, including memoizing `getWritable()` to fix chunk reordering [20506560 · Peter Wielander] and addressing event cursor issues [9454151b · Peter Wielander].
Performance over time
ETV stacked by Growth, Maintenance and Fixes — 90-day moving average, normalized to ETV / month.
Average performance per developer
ETV per active developer per month — 30-day moving average.
Active developers over time
Unique developers committing each day — 90-day moving average.
Knowledge concentration
How dependent is this repo on a small number of contributors? Higher top-1 share = higher key-person risk.
Nathan Rajlich owns 35.8% of commits.
Top contributors
Most impactful commits
Top 20 by ETV in the all-time window.
- 3.0 ETV · Friendlier workflow errors (consolidated) (#1849)
  * Introduce structured context-violation errors + Ansi renderer
    Phase 1: Add Ansi rendering helpers (frame, hint, note, help, code, inline) to @workflow/errors, and a chalk mock for readable snapshot tests.
    Phase 2: Add four context-violation error classes to @workflow/core (NotInWorkflowContextError, NotInStepContextError, NotInWorkflowOrStepContextError, UnavailableInWorkflowContextError) and apply them to all twelve user-facing throw sites so errors now include docs links and a structured "what/why/fix" frame.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
  * Address review: tighten changeset, implement ansifyName, harden Ansi
    - Tighten phase 1 changeset to a single sentence (per pranaygp review) and switch to double-quoted frontmatter (per Copilot + repo convention).
    - Implement `ansifyName` to actually apply dim styling to workflow/ / step/ prefixes; add an `Ansi.dim` helper to `@workflow/errors` so callers don't need to import chalk directly.
    - Remove the `void getWorkflowMetadata;` workaround in context-errors.ts by dropping the unused value import (we only needed the type and symbol).
    - Render the plain-Error throw in `workflow/get-workflow-metadata.ts` with `Ansi.frame` + docs link so the VM path matches the structured-class styling from the sibling step path (still uses a plain Error to avoid the module-init cycle).
    - Guard `buildUnderline` against zero-length markers so a stray empty token can't produce a negative `String.repeat` count.
  * Structured runtime logger metadata + fold in replay-timeout logging
    Adds a `.child()` and `.forRun(runId, workflowName)` child-logger API to the structured logger so runtime/step code doesn't have to repeat `workflowRunId`/`workflowName`/`stepId` on every call. Normalizes error metadata to structured `errorName` / `errorMessage` / `errorStack` fields instead of ad-hoc `error: err.message` strings, and adds comments to silent catches that swallow expected idempotency conflicts.
    Also folds in the pending changes from #1812 so that PR can be closed:
    - Standardize the console prefix to `[workflow-sdk]`.
    - Split the replay-timeout log into a warn-while-retrying vs. error-when-giving-up, and surface the underlying error when we can't mark a timed-out run as failed.
    - Include the error stack in the "Fatal runtime error during workflow setup" log and in the top-level user-code workflow error log so the stack surfaces in flattened log drains.
    - Drop the `[Workflows] "<runId>" - ` prefix from `buildWorkflowSuspensionMessage`; the structured logger now attaches run context.
    Supersedes #1812.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
  * Use double-quoted changeset frontmatter per repo convention
  * Add SerializationError + apply to user-facing serialization sites
    Phase 4 of friendlier errors: introduce a `SerializationError` class with an optional `hint` and a docs link (workflow-sdk.dev/err/serialization-failed), and adopt it at every user-facing serialization boundary in @workflow/core:
    - Locked ReadableStream at a workflow boundary
    - Unregistered class / missing `classId` / missing `WORKFLOW_DESERIALIZE`
    - Attempting to return step functions to clients or call workflow functions directly
    - Webhook `respondWith()` called outside a step
    - `dehydrate*` / `getSerializeStream` failures (workflow args/return, step args/return, stream chunks)
    Internal invariants (format prefix length checks, unknown format bytes, missing `STREAM_NAME_SYMBOL`, encryption key/size guards, etc.) now throw `WorkflowRuntimeError` instead of plain `Error` so the classifier and logger treat them consistently. `formatSerializationError` now returns `{ message, hint }` so the hint fragment can be rendered with the standard SerializationError framing instead of being baked into the message string.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
  * Use double-quoted changeset frontmatter per repo convention
  * Presentation-only user vs SDK error attribution
    Add describeError() that derives attribution and class-aware hints from existing error classes + RUN_ERROR_CODES; no event data changes. Wire into step failures, max-delivery exhaustion, run failures, and fatal setup errors so terminal logs include errorAttribution and a hint for known error types.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
  * Address review: describeError accepts precomputed errorCode + instanceof
    - `describeError(err, errorCode?)` now accepts an optional precomputed `RunErrorCode`. `classifyRunError(err)` only narrows to USER_ERROR / RUNTIME_ERROR, so the REPLAY_TIMEOUT and MAX_DELIVERIES_EXCEEDED branches were previously unreachable from the step / run failure log sites. Callers that know the failure category (runtime.ts for replay timeout and max-deliveries exhaustion) now pass the code in.
    - Context-violation checks use `instanceof` against the actual classes from context-errors.ts instead of a name-string set. Type-safe + survives class renames.
    - Wire the new hints through to the REPLAY_TIMEOUT and MAX_DELIVERIES_EXCEEDED log sites so those branches actually render a hint now.
    - 3 new tests cover the reachable code paths + precomputed-code override.
    - Changeset frontmatter switched to double quotes per repo convention.
  * Cosmetic consistency pass on remaining bare throws
    Internal invariants now use WorkflowRuntimeError so describeError attributes them to the SDK: missing startedAt, VM generateKey, closure-vars outside step context, ENOTSUP. defineHook().resume() formats schema validation failures as a readable list instead of a JSON blob.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
  * Use double-quoted changeset frontmatter per repo convention
  * Data-driven describeRunError + expose via @workflow/core/describe-error
    Observability renderers read persisted run_failed / step_failed event data, not live Error instances. describeRunError takes { errorCode, errorName } and returns the same { attribution, hint } shape as describeError, so the CLI and web UI can derive user-vs-SDK framing from the event log directly.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
  * Friendlier build-time errors: WorkflowBuildError class + applications
    Add `WorkflowBuildError` class in `@workflow/errors` with optional `hint` for an actionable next step, and apply it in `@workflow/builders` at user-facing sites: failed esbuild phases, unresolved built-in steps, and empty esbuild output now throw `WorkflowBuildError` with a hint pointing at the likely fix. Runtime invariants remain plain `Error`.
  * Polish friendlier-errors rendering: drop functionName leak, simplify docs link, redirect stack
    - Drop the readonly `functionName` param-property on context-error classes so util.inspect no longer prints a trailing `{ functionName: 'foo()' }` block.
    - Replace the `DocLink` ("label: https://…") shape with a plain `DocsUrl` template-literal type. Error output now renders a single clean line: `docs: https://…` (new `Ansi.docs` helper) instead of the noisier "note: Read more about foo(): https://…".
    - Add throw helpers (`throwNotInWorkflowContext`, etc.) that call `Error.captureStackTrace(err, stackStartFn)` on V8 engines so the top frame of the thrown error points at the user's call site instead of at the gate function inside the framework. Callers pass themselves as the boundary.
    - Refactor `defineHook()` (both root and `/workflow`) to use named function closures rather than `this.create`/`this.resume`, since the stack redirect relies on a stable function identity that survives destructuring.
    - Update context-errors.test.ts to snapshot the new `docs:` framing and to add a regression test asserting the top stack frame is the user call site.
  * Consolidate friendlier-errors stack: fix ANSI leak + non-retry semantics
    Addresses PR review feedback across the 8-phase friendlier-errors stack and fixes issues surfaced by manual testing (createHook() inside a step):
    - ANSI no longer leaks into .message / .stack. Context-violation errors now store plain text on .message and render the colored framed form lazily via [util.inspect.custom] / toString(). Structured logs, log drains, CBOR-serialized events, and JSON payloads no longer contain raw \x1B[...m bytes.
    - Context violations are now fatal. ContextViolationError sets fatal = true; FatalError.is(err) recognizes any error with a fatal: true own property. Calling createHook() from a step no longer burns three retry attempts on a guaranteed-to-fail context violation.
    - Ansi helpers moved to @workflow/errors/ansi subpath so imports from @workflow/errors no longer pull chalk into consumers that only want error classes (addresses reviewer VaguelySerious).
    - Shared redirectStackToCaller helper in packages/core/src/capture-stack.ts, used by both context-errors.ts and workflow/get-workflow-metadata.ts (addresses Copilot review on #1849).
    - Structured framed content: ContextViolationError now takes a structured FramedContent (title segments + detail branches) and renders plain/pretty from the same source of truth.
    Tightens the eight existing phase changesets to 1-2 sentences each and adds four new scoped changesets (errors-ansi-subpath, context-errors-plain-message, context-errors-fatal, capture-stack-shared) for the followup fixes, so the final changelog history stays readable.
  * test: update step-handler mocks for scoped forRun() logger
    The runtime logger now uses .forRun(runId, name, {stepId, stepName}) to attach scope context, so 409-handling log calls no longer repeat {workflowRunId, stepId} in every metadata bag; those live on the scoped logger instance. Update the mock to return itself from forRun() and tighten assertions to check both the log args (errorName/errorMessage) and the forRun() scope.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
  * Mark SerializationError fatal + route dehydration through step-failure path
    SerializationError now carries readonly fatal = true. Step-return dehydration is wrapped inside the user-code try/catch so that the resulting error flows through userCodeFailed → step_failed → FatalError.is() short-circuit instead of bubbling up as HTTP 500 and triggering a queue retry loop. Retrying a step that returned a non-POJO is guaranteed to fail the same way, so this saves ~20s and 3 near-identical error blocks per serialization failure.
  * Add logging snapshot tests + manual-test artifacts
    Snapshot tests lock in the exact shape of:
    - describeError() payloads (attribution, errorCode, hint) for every classification: plain Error, SerializationError, context-violation, WorkflowRuntimeError, REPLAY_TIMEOUT, MAX_DELIVERIES_EXCEEDED.
    - The scoped-logger call signature for the two canonical runtime failure paths (fatal-bubble and hit-max-retries), so refactors of forRun() / child() metadata merging can't silently change what users see in their log drains.
    SerializationError now also has a direct test for readonly fatal=true + FatalError.is() recognition.
    pr-artifacts/ contains real log-output snapshots from running the nextjs-turbopack workbench against five error scenarios. These are reference material for reviewers and are flagged to be removed before merge.
  * Readable step-fatal logs: inline stack + friendly step/workflow names
    The step-level fatal-error log used to embed the full stack trace inside an `errorStack` string field in the metadata object, so util.inspect rendered it as a quote-escaped, line-continuation blob when the log hit the terminal, unreadable in practice. Move framing + stack into the log *message* (matching the workflow-level log in runtime.ts) and keep the metadata object compact with only the indexable structured fields (`errorAttribution`, `errorName`, `errorMessage`, `hint`, IDs). Log drains still get the same keys; humans now see a readable stack trace.
    Also introduce `formatStepName` / `formatWorkflowName` in `@workflow/utils` that render machine names (`step//./workflows/1_simple//add`) as `add (./workflows/1_simple)` in log framings, using the existing `parseStepName` / `parseWorkflowName` parsers. Applied to step-fatal, hit-max-retries, exceeded-max-retries, and workflow-threw log sites.
    Artifacts in pr-artifacts/ updated to show the new output shape, and renamed .log → .md since they're Markdown and IDE previews are nicer that way.
  * Opinionated pretty formatter for runtime structured-log metadata
    Replace util.inspect's default object dump (which quote-escapes multi-line stacks and paragraph hints into a single-line JSON-y blob) with a workflow-aware formatter that composes the entire log line into a single string passed to console.error / console.warn.
    Highlights of the new output:
    - Per-run / per-step IDs render with their parsed friendly names so users see `wrun_… · simple (./workflows/1_simple)` instead of just the raw `workflowName: 'workflow//./workflows/1_simple//simple'`.
    - Color-coded attribution badge (user error red / sdk error magenta) paired with the error class in bold.
    - Hints render as a paragraph under `hint:` rather than a backslash-`\n`-escaped string.
    - Drops redundant fields (errorStack always; errorMessage when it's already in the parent message) to avoid double-printing.
    - Unknown fields fall through as a sorted `key value` tail so we never silently drop log information.
    @workflow/errors/ansi gains bold/red/magenta helpers used by the formatter. The web / web-shared packages don't consume stderr (they read structured event payloads from the World event log), so this is presentation-only at the runtime layer.
  * ci(benchmarks): disable pnpm cache for getCommunityWorldsMatrix
    The job never runs `pnpm install` (it just calls `node` against a checked-in script), so the pnpm store path never exists. The post-job `actions/setup-node@v4` cache-save then fails with `Path Validation Error: Path(s) specified in the action for caching do(es) not exist` and red-X's the entire job even though the matrix step succeeded. The setup-workflow-dev composite already has a `cache-pnpm` opt-out input for this exact case; wire it through here.
  * Address PR review comments: inspect dedup, cause leak, retry-loop tests
    - ContextViolationError: util.inspect(err) duplicated every framed detail line because the stack-tail strip only sliced the first message line. V8's Error.stack reads `Name: messageLine1\n messageLine2\n at ...`, so for our multi-line `title\n╰▶ docs: …` messages every detail line was getting prepended twice (once in the pretty form, once via the unsliced message tail). Count the actual message lines and slice past all of them. Repro test asserts `╰▶ docs:` appears exactly once.
    - WorkflowError: stop assigning `cause: undefined` as an enumerable own property when no cause is provided. Subclasses (every error in this PR) inherit the parent constructor; the unconditional assignment polluted `util.inspect(err)` output with `{ cause: undefined, … }` on every no-cause instance. The `super(...)` call already conditionally sets `.cause` non-enumerably when `options.cause` is provided.
    - step-handler.test.ts: add a regression-gate suite that exercises the fatal-vs-retryable retry-loop wiring directly. Asserts that an error with `fatal: true` produces exactly one `step_failed` event with no `step_retrying`, and that a non-fatal `Error` retries via `step_retrying` on early attempts and emits `step_failed` once the retry budget is exhausted. Catches the silent-regression case where `fatal = true` is removed from a context-violation error class but the `FatalError.is()` unit tests stay green.
  * Consolidate changesets + remove pr-artifacts
    Address review feedback to drastically shorten the changesets: fold the 15 file-by-file entries into a single user-facing changeset for @workflow/core / errors / builders / utils. Also drop the pr-artifacts/ folder (reviewer-only log captures, no longer needed).
  * Polish runtime error logging: layout, stack trim, hint consolidation
    Five user-driven fixes from manual smoke-testing of #1849:
    1. Logger layout. composeLogLine() now puts the structured-fields block (attribution badge, run/step IDs, error code) **between** the framing line and the stack body, instead of after it where 30+ lines of stack buried the most useful information. The framing stays at the top, stack at the bottom, structured info readable at a glance.
    2. Stack trim. Drops framework-internal frames (`node_modules/.pnpm/`, `node:internal/`, Turbopack-bundled `node_modules__pnpm_*` chunks, `_next_dist_*` chunks) and caps the surviving frame count at 6 so the stack stays compact even on heavy async wrappers. Suppressed runs emit one summary line so users know the trim happened.
    3. Wrapper-route noise. The nextjs-turbopack workbench's start route was catching `WorkflowRunFailedError` rejection on `Promise.race([readLoop(), run.returnValue])` and re-logging it via `console.error('Error in workflow stream:', error)` plus `controller.error(error)`, which then triggered Next.js's `⨯ failed to pipe response` overlay. The SDK already logs the failure cleanly upstream and the runId is on the response header, so the wrapper now closes the SSE stream cleanly on WorkflowRunFailedError.
    4. Consistent framed `╰▶ hint:` / `╰▶ docs:` layout for all errors that carry a hint or docs slug. WorkflowError, SerializationError, and WorkflowBuildError now share one `appendFramedDetails` helper matching the box-drawing structure that ContextViolationError already used. Was: blank-line-separated `Learn more: <url>`. Now: one tree, indistinguishable from context-violation rendering.
    5. Drop the duplicate logger-side `hint` field. Hints now live on the error message only: actionable hints get serialized into the event log, rehydrated on the workflow side, and shown in observability automatically. The previous logger-only hint duplicated stderr but never made it past the step boundary.
    Updated SerializationError hint to point at the foundations doc ("Ensure you're returning workflow serializable types. Check the serialization docs to see what's serializable: https://workflow-sdk.dev/docs/foundations/serialization") instead of the hardcoded `(plain objects, arrays, primitives, …)` list, which drifted out of sync as the supported types grew. Same hint reuses for step args, workflow args/return, stream messages, and any other site that goes through `formatSerializationError`.
    Also retitled the retry summary `3 retries` → `3 max retries` since "3 retries" next to "4 attempts" was ambiguous (already-happened vs. budget).
  * Trim error-card title + drop machine step name from persisted error
    - ErrorStackBlock (web observability): show just the first non-empty trimmed line of the error message in the card title with single-line truncation. Multi-line messages (`Failed to serialize step return value\n╰▶ hint: …`) were rendering the entire framed body in the title, pushing the copy button off-screen and burying the scannability of the headline. Full message stays in the body via the stack (V8 prepends `Name: message` to `Error.stack`), so no information is lost; hover-tooltip exposes the full title text.
    - Persisted error message: drop the `Step "step//./.../foo"` machine name from `Step failed after N retries: …` and `Step exceeded max retries (…)` strings. Observability already attributes the event to a specific step via the UI tree, and the CLI logger emits the friendly `Step foo (./...) hit max retries` framing on its own line. Embedding the raw `step//./...` machine name in the persisted message text was duplicate noise.
  * Update .changeset/friendlier-errors.md
    Co-authored-by: Peter Wielander <mittgfu@gmail.com>
    Signed-off-by: Pranay Prakash <pranay.gp@gmail.com>
  * Update .changeset/pretty-log-format.md
    Co-authored-by: Peter Wielander <mittgfu@gmail.com>
    Signed-off-by: Pranay Prakash <pranay.gp@gmail.com>
  * Update SerializationError snapshot tests for slug-less message
    The class no longer attaches a slug-based `╰▶ docs:` line; the foundations URL is embedded directly in the hint via the `formatSerializationError` helper in @workflow/core. Update the test expectations accordingly:
    - bare-title case is now a single line (no docs link)
    - hint case renders one `╰▶ hint: …` branch (no second branch)
  * Update serialization.test.ts hint assertions for foundations URL
    Four `should throw error for an unsupported type` cases were still asserting on the old hardcoded type list. Update to the new hint phrasing that points at the foundations doc, matching the change in `formatSerializationError` (`packages/core/src/serialization/errors.ts`).
  ---------
  Signed-off-by: Pranay Prakash <pranay.gp@gmail.com>
  Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
  Co-authored-by: Peter Wielander <mittgfu@gmail.com>
  Pranay Prakash · 1203dae7 · 2026-05-04
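Two mechanisms described in commit 1203dae7 above lend themselves to a short illustration: the `fatal: true` short-circuit and the stack redirect via `Error.captureStackTrace`. This is a minimal sketch under stated assumptions, not the SDK's code. The names `ContextViolationSketch`, `isFatal`, `throwContextViolation`, `createHookSketch`, `runWithRetries`, and `userStep` are hypothetical stand-ins, and the retry loop is a simplification of whatever the step handler actually does.

```typescript
// Hypothetical stand-in for a context-violation error class; not the SDK's own.
class ContextViolationSketch extends Error {
  // Own property marking the error as non-retryable.
  readonly fatal = true;

  constructor(message: string) {
    super(message); // .message stays plain text, with no ANSI escapes
    this.name = "ContextViolationSketch";
  }
}

// Assumed shape of a FatalError.is()-style check: any object carrying an
// own `fatal: true` property counts as fatal.
function isFatal(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    Object.prototype.hasOwnProperty.call(err, "fatal") &&
    (err as { fatal?: unknown }).fatal === true
  );
}

// Throw helper: the caller passes itself as the boundary so that, on V8,
// the top stack frame is the user's call site rather than the gate function.
function throwContextViolation(message: string, stackStartFn: Function): never {
  const err = new ContextViolationSketch(message);
  // captureStackTrace is V8-specific, so look it up defensively.
  const capture = (Error as unknown as {
    captureStackTrace?: (target: object, boundary?: Function) => void;
  }).captureStackTrace;
  if (typeof capture === "function") {
    capture(err, stackStartFn);
  }
  throw err;
}

// Hypothetical gate function that is only valid in a workflow context.
function createHookSketch(): void {
  throwContextViolation("createHook() called outside a workflow", createHookSketch);
}

// Simplified retry loop that gives up immediately on fatal errors.
function runWithRetries(
  fn: () => void,
  maxAttempts: number,
): { attempts: number; error?: unknown } {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      fn();
      return { attempts: attempt };
    } catch (err) {
      if (isFatal(err) || attempt === maxAttempts) {
        return { attempts: attempt, error: err };
      }
    }
  }
  return { attempts: 0 };
}

// User code that calls the gate function from the wrong context.
function userStep(): void {
  createHookSketch();
}

const outcome = runWithRetries(userStep, 3);
```

In this sketch `outcome.attempts` is 1 because the fatal flag stops the loop on the first failure, and on V8 the first stack frame of `outcome.error` names `userStep` rather than `createHookSketch` or the throw helper.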
- 3.0 ETV · Workflows graph extractor (#455)
  ---------
  Signed-off-by: Karthik Kalyanaraman <karthik@scale3labs.com>
  Karthik Kalyanaraman · e3f03904 · 2025-12-27
- 2.9 ETV · [builders][web-shared] Improvements to o11y and fixes to graph generation code path (#1031)
  * [workflow o11y] rebase on latest main and keep targeted UI/builders changes
    Rebase the branch intent onto latest main by preserving web-shared UI refactors and builders base-builder updates while taking main for hydration and data-fetching behavior elsewhere.
    Co-authored-by: Cursor <cursoragent@cursor.com>
  * [workflow o11y] align web-shared hydration revivers with main
    Revert the hydration reviver delta for URL, URLSearchParams, and Headers so web-shared matches main behavior while keeping the targeted UI/builders-only scope on this branch.
    Co-authored-by: Cursor <cursoragent@cursor.com>
  * [workflow o11y] restore PR #1017 detail-panel decoupling
    Bring the web-shared trace/detail panel files back in sync with main so PR #1017 behavior is preserved and not regressed on this branch.
    Co-authored-by: Cursor <cursoragent@cursor.com>
  * [workflow o11y] restore PR #1018 react-inspector sidebar updates
    Bring web-shared o11y rendering files back in sync with main so ObjectInspector-based sidebar rendering and related UI behavior from PR #1018 remain intact on this branch.
    Co-authored-by: Cursor <cursoragent@cursor.com>
  * [workflow o11y] keep custom viewers and apply inspector rendering
    Preserve the branch-specific event list and stream viewer UX while applying react-inspector rendering to payload/chunk data so complex hydrated values render correctly without reverting custom UI behavior.
    Co-authored-by: Cursor <cursoragent@cursor.com>
  * [workflow o11y] trace viewer UX improvements and Geist alignment
    - Add live tick animation, context menu, and cancel run support from PR #984
    - Decouple side panel styling to use Geist design tokens (inline styles)
    - Fix sleep span detail panel showing events instead of wait entity attributes
    - Fix stream viewer flickering by removing unstable object deps
    - Remove Chunks/Output toggle from stream viewer, show only chunks
    - Add resolve hook modal, wake-up sleep, and cancel run plumbing
    - Add loading skeleton for stream viewer
    Co-authored-by: Cursor <cursoragent@cursor.com>
  * Bug fixes
  * Bug fixes
  * Bug fixes
  * Bug fixes
  * Bug fixes
  * Bug fixes
  * Bug fixes
  * Bug fixes
  * Bug fixes
  * Bug fixes
  ---------
  Co-authored-by: Cursor <cursoragent@cursor.com>
  Karthik Kalyan · 1c115734 · 2026-02-13
- 2.9 ETV · fix(world-local,world-postgres): make duplicate hook_created idempotent (#2295)
  * fix(world-local): make duplicate hook_created idempotent
    Duplicate processing of the same hook_created (same runId, hookId, and token, e.g. cross-process replay or queue redelivery) was being recorded as a hook_conflict in the event log, which then replayed as a self-conflict HookConflictError.
    The fix mirrors the existing step_created duplicate-correlation path: when the exclusive token claim fails and the existing claim has the same (runId, hookId), throw EntityConflictError so the runtime's existing concurrent-replay catch path swallows it. Different runId or hookId reusing the same token still produces a real hook_conflict.
    The persisted token claim already carried hookId; only the read schema was dropping it. The schema now preserves hookId (marked optional for backward compatibility with older claim files).
    Fixes #2283
  * fix(world-postgres): make duplicate hook_created idempotent
    world-postgres has the same gap as world-local was just fixed for: the duplicate-token check in events.create unconditionally writes a hook_conflict event when an existing hook with the same token is found, even when the existing hook has the same (runId, hookId) as the incoming event. The unique partial index on workflow_events does not catch this because the duplicate path inserts hook_conflict, not hook_created.
    Mirror the world-local fix: when the existing hook's (runId, hookId) matches the incoming event, throw EntityConflictError so the runtime's existing concurrent-replay catch path swallows it. Different runId or hookId reusing the same token still produces a real hook_conflict.
    Refs #2283
  * test(e2e): add regression test for hook_conflict from same-tick replay race
    Regression test for #1665 / #2283. A parent workflow awaits 6 child workflows with Promise.all; each child does a tiny step and creates one webhook. Awaited children flatten into the parent run, so all webhook creations land on the same workflow body. When their step resolutions align in the same tick the workflow body is re-walked and each pass submits hook_created with the same deterministic (correlationId, token).
    Before the world-side idempotency fix, the world wrote hook_conflict events for the duplicates and the workflow failed with HookConflictError. With the fix, duplicates throw EntityConflictError (swallowed by the suspension handler), no hook_conflict events appear in the log, and the webhooks resolve normally.
    Verified locally against world-local: the test fails reliably (3/3) on the unfixed code and passes reliably (5/5) on the fixed code.
  * test(e2e): rewrite parallelStepsThenWebhookWorkflow to match the actual #1665 repro
    The earlier version invoked another 'use workflow' function directly from inside the parent workflow, which is not a valid child-workflow invocation (child workflows must be spawned via start()) and didn't mirror the bug shape on #1665 anyway.
    Rewrite the workflow as a single 'use workflow' function that exactly mirrors Paolo's minimal repro:
      await Promise.all([stepA(), stepB()]);
      using webhook = createWebhook();
      await webhook;
    The for-loop runs N independent iterations of that sequence in series, each disposing its webhook via 'using' before the next, to give the timing-sensitive race multiple chances to fire.
    The race is hard to force deterministically on fast local dev, but the same (runId, hookId) idempotency invariant is covered deterministically by the new unit tests in world-local and world-postgres. This e2e test serves as a higher-level regression net: its assertions (no hook_conflict event in the log, no HookConflictError-failed run) are correct whether the race fires or not, and will catch any future regression on a run that does hit it.
  * fix(world-local,world-postgres): recover crash-orphaned hook claims/rows instead of suppressing the retry
    Addresses review feedback on PR #2295. The original idempotency fix made duplicate same-(runId, hookId) hook_created submissions throw EntityConflictError so the suspension handler's concurrent-replay catch path swallows them. But the claim file (world-local) and hook row (world-postgres) are written before the durable hook_created event, and the writes are not atomic. A process / DB interruption between the claim/hook write and the event write leaves an orphaned claim/hook row; the retry then matched the same (runId, hookId), threw EntityConflictError, got swallowed, and the run was permanently left with no hook_created event in the log.
    world-local:
    - Add a per-(runId, hookId) in-process mutex (withHookLock) mirroring the existing withStepLock, so two same-tick concurrent calls serialize on the entity write and the dedup branch never observes an in-flight winner mid-write.
    - In the dedup branch, when the existing claim is for the same (runId, hookId) we are trying to create, check whether the durable hook entity actually exists on disk:
      - exists → real duplicate: throw EntityConflictError as before.
      - missing → orphaned claim from a prior crash: fall through and complete the partial write (write the hook entity with overwrite, then emit hook_created via the outer code path).
    world-postgres:
    - In the dedup branch, when the existing hook row matches the incoming (runId, hookId), check whether a hook_created event for this (runId, correlationId) already exists in the event log:
      - exists → real duplicate: throw EntityConflictError as before.
      - missing → orphaned hook row from a prior crash between hook INSERT and events INSERT: skip the hook insert (the row is already there) and let the outer code path emit hook_created, completing the partial write.
    Tests:
    - world-local: pre-seed an orphaned token claim with no matching hook entity, retry hook_created, assert hook entity and hook_created event both land (no hook_conflict, no EntityConflictError).
    - world-postgres: pre-seed an orphaned hook row with no matching hook_created event, retry, assert hook_created event lands (no hook_conflict, no EntityConflictError).
    Both tests fail on the prior implementation (EntityConflictError thrown on retry, exact symptom from the review).
  * fix(world-local): probe the event log (not the hook entity) to detect duplicate hook_created
    Addresses follow-up review on PR #2295. The previous dedup branch checked whether the durable hook entity existed on disk. But the hook entity is written before the `hook_created` event, and the two writes are not atomic, so a crash between them leaves both the claim file and the hook entity on disk with no event in the log. The dedup branch then matched on `(runId, hookId)`, found the hook entity, threw EntityConflictError, and the suspension handler swallowed the retry, permanently losing `hook_created` from the event log.
    The fix mirrors what the world-postgres branch already does: probe the run's event log for an existing `hook_created` event for the same `(runId, correlationId)`. The event is the durable record of a successful hook creation; the claim file and hook entity are partial-write artifacts that may exist without the event.
    - exists → real duplicate: throw EntityConflictError so the runtime's concurrent-replay catch path swallows it.
    - missing → orphaned partial write (crash at any point before the event landed): re-write the hook entity (with overwrite: true, in case a stale partial copy exists) and let the outer code path emit the hook_created event.
    Added a new helper findHookCreatedEvent that runs a filtered paginatedFileSystemQuery with limit:1 over the run's events.
    Regression test "should recover an orphaned hook entity with no matching hook_created event" added: pre-creates a hook, deletes just the hook_created event from disk to simulate a crash between the entity write and the event write, asserts the retry emits a fresh hook_created event (no hook_conflict, no swallowed EntityConflictError). I verified this test fails on the prior fix (throws `EntityConflictError: Hook "hook_orphan_entity_1" already created`, exactly as pranaygp reported) and passes on this commit.
    The previous test ("should recover an orphaned hook token claim with no matching hook entity") continues to pass: the event-log probe is a strict superset of the entity probe, since a missing entity always also implies a missing event.
  * fix(world-local): converge same-hook creation across workers via canonical eventId
    Addresses follow-up review on PR #2295. The previous fix made the dedup branch probe the event log to decide real-duplicate vs orphan-recovery, but the probe and the recovery write are not a single atomic operation. Two workers sharing a data directory (or two retries that lose `writeExclusive(constraintPath)` back to back) could both pass the probe (each observing no hook_created event yet), both fall through to the recovery write, and both append a hook_created event with a different eventId, producing two events in the log for the same (runId, hookId). The in-process `withHookLock` mutex does not help here because it is process-local and tag-specific.
    The fix persists `eventId` in the durable token claim file (written by the original `writeExclusive(constraintPath)`). On a same-(runId, hookId) dedup match, retries adopt that canonical eventId and rebuild the event with a deterministic createdAt derived from the eventId (a ULID). The outer event write switches from `writeJSON` (check-then-write, TOCTOU) to `writeExclusive` (O_CREAT|O_EXCL via temp-file + hard-link, atomic across processes).
Either worker may win the publish; the other throws EntityConflictError which the runtime's existing concurrent-replay catch path swallows. Net result: exactly one hook_created event per logical creation. Backward compatibility: a claim file written before this commit lacks `eventId`. Retries that read such a claim fall back to the event-log probe + fresh-eventId recovery — the legacy behavior that does not converge across workers but cannot regress for freshly- written claims after upgrade. world-postgres already converges across workers via the partial unique index on workflow_events_entity_creation_unique (runId+correlationId+eventType for hook/step/wait_created): the loser's INSERT raises 23505 which is already translated to EntityConflictError. Regression tests: - world-local: `converges same-hook creation across workers to one event` uses two tagged storage instances sharing one data directory and fires 25 paired Promise.allSettled hook_created calls. Expected 25 hook_created events total; before this fix yielded 50. - world-postgres: `converges same-hook creation across concurrent calls to one event` exercises the same shape against the real Postgres unique index. Already converges; the test is a guard against future regressions to the catch path. Verified the world-local test fails on c7b23e1b5 with exactly the shape pranaygp reported (50 events for 25 logical creations) and passes on this commit. The earlier orphaned-claim and orphaned- entity recovery tests also continue to pass. * fix(world-local): converge legacy hook claims via recovery-marker sidecar; replace tag-proxy test with real subprocess workers Addresses follow-up review on PR #2295. Two distinct issues, both flagged by pranaygp as P1: 1. The fallback path for token claims written by versions before eventId was persisted inline (legacy claims after upgrade) still permitted the same cross-process corruption the inline fast path was fixed to prevent. 
Two processes both reading a legacy claim each generated their own eventId, landed their writeExclusive(eventPath) calls at different paths, and appended two hook_created events for the same (runId, hookId). Existing persisted claims after a real upgrade are exactly the state the crash-recovery branch needs to repair, so leaving the legacy path non-convergent is silent corruption, not backward compatibility. 2. The committed cross-worker convergence test used two tagged storage instances sharing one directory as a proxy for separate processes. But tags change the destination filename (events/wrun_X-evnt_Y.worker-a.json vs ...worker-b.json), so two tagged workers can each writeExclusive their own event at different paths and both fulfill. The Map-by-eventId deduplication in the assertion then masked the duplicate publication, so the test passed for the wrong reason. Implementation: - New HookRecoveryMarkerSchema (`{ eventId, hookId, runId }`) and HookRecoveryMarkerPath helper. The marker is a sidecar at hooks/tokens/<hash>.recovery.json, written via writeExclusive so the first cross-process retry pins its candidate eventId as canonical; subsequent retries read the marker and adopt that eventId. Together with the existing writeExclusive(eventPath) in the outer publish, this gives the legacy-fallback path the same single-event convergence guarantee as the inline-eventId fast path. - pinCanonicalEventIdForLegacyClaim() encapsulates the marker write-or-read. A stale marker for a different (runId, hookId) (token-reuse with leaked state) is overwritten best-effort — the common cross-worker race for the same hook still converges; only the narrow stale-token-reuse case loses convergence. - hook_disposed now also deletes the recovery marker when it deletes the token constraint file, preventing a future legacy recovery for a recycled token from latching onto a stale eventId. 
- The dedup branch unified: existingClaim.eventId for new claims, pinCanonicalEventIdForLegacyClaim() for legacy ones. Removed the now-redundant findHookCreatedEvent helper — the writeExclusive(eventPath) in the outer publish is the authoritative duplicate-vs-orphan detector. Tests: - New test fixture test-fixtures/hook-race-worker.ts (TypeScript, run via child_process.fork with tsx as execPath — tsx is a transitive dev dep via vitest). Each subprocess gets its own createStorage(testDir) so the in-process hookLocks Map cannot serialize across workers. - Replaced the tag-proxy test with "converges same-hook creation across separate OS processes to one event". Spawns workerCount subprocesses, releases them from a barrier into the same hook_created, asserts exactly one fulfilled + (N-1) rejected with EntityConflictError, and asserts directly on the raw events.list() result (no Map dedup) that the number of hook_created entries equals the number of logical creations. - Added "converges same-hook creation across processes when only a legacy token claim exists". Same shape, but pre-seeds the legacy claim format (`{ token, hookId, runId }` with no eventId) before each race. Verified to FAIL on 7ce66551b (both subprocesses fulfill, no convergence) and pass on this commit. - Also verified the new-eventId subprocess test FAILS when the event write is reverted to writeJSON (TOCTOU), confirming it exercises the writeExclusive-based cross-process arbitration. Both prior orphaned-claim / orphaned-entity recovery tests also continue to pass. * fix(world-local): per-lifetime recovery markers, restore event-log probe, fix CI tsx resolution Addresses three P1 review comments on PR #2295. 1. Stale recovery marker leaking across token-reuse lifetimes (pranaygp): The previous marker path used `hashToken(token)` so a stale marker for run A could leak into run B's recovery when the same token was reused after run A terminated through normal lifecycle. 
`deleteAllHooksForRun()` and tagged `world.clear()` deleted the token constraint and hook entity but NOT the marker sidecar, so the next legacy claim on the same token entered the stale-marker overwrite branch and the workers overwrote it non-atomically, yielding divergent publication. Fix: - Marker path now hashes `(token, runId, hookId)` together (`hookRecoveryMarkerPath` in storage/helpers.ts). Different lifetimes can never share a marker, so the stale-marker overwrite branch is removed entirely. - `hookRecoveryMarkerPath` is moved to helpers.ts and shared across events-storage.ts, hooks-storage.ts, and index.ts. - `deleteAllHooksForRun()` and tagged `world.clear()` now also delete the recovery marker for each hook (disk hygiene; per- lifetime identity makes leaks no longer corrupting). - `hook_disposed` now uses the new per-lifetime marker path too. 2. Duplicate `hook_created` event when a legacy claim's event was already published (VADE bot, also implied by pranaygp's analysis): Removing the event-log probe from the legacy fallback let a post- upgrade retry pin a new canonical eventId via the marker and publish a duplicate event at that path, even when the original pre-upgrade writer had already successfully published the event with its own (different) eventId. Fix: - Restore `findExistingHookCreatedEventId()` (renamed and made to return the eventId for clearer semantics). - Legacy fallback now probes the event log BEFORE pinning the marker; if a matching `hook_created` event already exists, throw `EntityConflictError` so the runtime's concurrent-replay catch path swallows the retry. - Inline-`eventId` fast path does NOT need the probe — the claim itself is the durable convergence key. 3. CI failure: tsx not resolvable under pnpm isolated linking (pranaygp; confirmed by ubuntu/windows unit test 60s timeouts): The previous test hard-coded `node_modules/.bin/tsx` assuming tsx would be hoisted there. 
But tsx was only a transitive peer dep via vitest, and pnpm's isolated linking does NOT link transitive peer deps into the workspace bin after a fresh install — so neither root nor package-local `.bin/tsx` existed in CI, the subprocess fork never started, and the barrier hung until vitest killed the test. Fix: - Add `tsx` as a direct `devDependency` of `@workflow/world- local` (pinned to 4.20.6 to match the existing transitive resolution). - Resolve via `import.meta.resolve('tsx/package.json')` and read the `bin` field dynamically, so we adapt to wherever pnpm links tsx for this package — not a hard-coded layout. - Lazy-init the resolver (no module-load IIFE) so an absent tsx fails only the convergence tests, not all 376 tests in the file. - Surface a clear error message if resolution fails, calling out the cause (transitive vs direct deps) for future readers. Also: harden the barrier helper so `error` events and pre-ready exits resolve BOTH `readyPromises` and `donePromises`, then `SIGKILL` siblings. Previously a broken child only resolved `donePromises`, leaving `Promise.all(readyPromises)` pending until the per-test timeout (60s in CI). Regression tests added: - `legacy claim whose hook_created event was already published does not append a duplicate event` — pre-seeds a legacy claim AND a pre-existing `hook_created` event with a different eventId, asserts the retry throws EntityConflictError and the log still has exactly the original event. - `converges legacy claim recovery across run lifetimes after token reuse` — runs pranaygp's full lifecycle path: race subprocess workers on run A's legacy claim, terminate run A via `run_completed` (triggers `deleteAllHooksForRun`), reuse the token in a legacy claim for run B, race subprocess workers again, asserts exactly one fulfillment + one `EntityConflictError` per race and exactly one `hook_created` event per run. 
Both new tests verified to fail on 2c673e436 (after rebuilding): the published-event test throws via duplicate publish instead of EntityConflictError, the token-reuse test sees both run B workers fulfill (2 events instead of 1). The existing orphaned-claim and orphaned-entity recovery tests also continue to pass. CI loop confirmed to be repaired locally by spawning subprocesses via the new resolver and intentionally breaking the worker fixture to verify the helper fails fast (~500ms) instead of hanging at the barrier. * fix(world-local): defer hook entity write until event publish commits Addresses karthikscale3's P1 review comment on PR #2295. The dedup-recovery path used to write the hook entity BEFORE the outer event publish proved whether the attempt was repairing a missing event or just colliding with an already-published `hook_created`. For already-committed duplicates, the event write then throws `EntityConflictError`, but the hook entity had already been overwritten with the retry's payload — leaving the durable hook entity and the event log inconsistent (e.g. the entity reflects the retry's metadata while the event still carries the original). karthikscale3 reproduced this on the prior head by creating `hook_created` with metadata `{ v: "a" }`, then retrying the same `(runId, hookId, token)` with metadata `{ v: "b" }` and `isWebhook: false`: the retry threw `EntityConflictError` but `hooks.get()` returned the retry's payload. Fix: defer the hook entity write until AFTER the outer `writeExclusive(eventPath)` commits. The branch now only captures the entity-to-write and its overwrite options; the actual write happens immediately after the event publish in the shared trailing block. A retry that ends in `EntityConflictError` (the event was already published) now leaves the entity untouched. 
The first-writer happy path and all recovery paths (orphaned- claim, orphaned-entity, cross-worker convergence, legacy claim, token-reuse across lifetimes) are unaffected — they all reach the event publish successfully, then the entity write runs as before. Regression test `does not mutate an already-committed hook entity when a duplicate hook_created retry collides` added to world-local: runs karthikscale3's exact scenario and asserts the persisted entity still carries the original metadata and isWebhook. Verified to fail on the prior commit (persisted metadata = 0xbb instead of 0xaa) and pass on this commit after rebuilding. Parallel guard test `does not mutate an already-committed hook entity when a duplicate hook_created retry collides` added to world-postgres. Postgres already protected this via `onConflictDoNothing()` on the hook INSERT, but the test guards against a future regression that adds an UPDATE/UPSERT to the dedup path. * refactor(world-local): per-instance in-process locks; drop tsx subprocess test plumbing You were right that the tsx subprocess machinery was overkill for a storage-level convergence test. Replaced with a simple two-instance in-process test that exercises the same cross-process semantics without spawning anything. The trick: `stepLocks` and `hookLocks` were module-level Maps shared by all `createEventsStorage` calls in the same process. Move them inside the function so each `createStorage(dir)` call gets its own lock map. Two storage instances sharing one data directory then behave exactly like two separate OS processes: - independent in-process `hookLocks` Maps (no in-process serialization between them), and - a shared filesystem (so the on-disk `writeExclusive` claim / marker / event publish primitives are the only thing arbitrating convergence). This is also a real architectural improvement — the global lock map was always a leaky abstraction that made unit-test simulation of the cross-process path awkward. 
Changes: - `stepLocks` and `hookLocks` moved from module scope into `createEventsStorage`. `withStepLock` and `withHookLock` wrappers collapsed into direct `withInProcessLock(map, key, fn)` calls at the two call sites that need them. - The three convergence regression tests in `storage.test.ts` now use `const workerA = createStorage(testDir); const workerB = createStorage(testDir);` and race `Promise.allSettled` of `events.create` from both — no subprocess, no IPC, no barrier helper, no `raceHookCreatedAcrossProcesses`. Same assertions (exactly one fulfillment + N-1 `EntityConflictError` per race, raw `events.list()` shows exactly one `hook_created` per logical creation — no Map dedup) so the regression catches are identical. - Removed: `tsx` devDep, `test-fixtures/hook-race-worker.ts`, `HOOK_RACE_WORKER` / `resolveTsxLoaderUrl` / `TSX_BIN` / `raceHookCreatedAcrossProcesses` and the `fork`/`fileURLToPath` imports they pulled in. Verified (after rebuilding world-local): - All 379 tests pass on macOS in ~1s (was ~6.7s with subprocesses). - Convergence tests confirmed to still catch the bugs: temporarily reverted the `eventId = canonicalEventId` adoption → both workers fulfilled (2 events instead of 1). Temporarily reverted the legacy-claim marker pin → same: both workers fulfilled. - No subprocess machinery means no Windows-specific quirks (cli.mjs shebang, .cmd wrappers, .bin hoisting under pnpm isolated linking, etc.) that produced the Windows CI 60s timeouts. - World-postgres still has its own parallel guard test for the karthikscale3 "no-mutate-on-duplicate" regression; that one exercises real DB concurrency and is unaffected by this change. Full repo `pnpm test` (43 packages) and the `parallelStepsThenWebhookWorkflow` e2e test against world-local both green. 
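The per-instance lock maps described in this refactor can be pictured with a minimal sketch. It assumes `withInProcessLock(map, key, fn)` is a promise-chaining mutex keyed by string. Only that call shape comes from the commit message; the body, the `createStorageSketch` factory, and the `demoLocks` driver are hypothetical and are not the world-local implementation.

```typescript
type LockMap = Map<string, Promise<unknown>>;

// Assumed behavior: run `fn` only after the previous holder of `key` settles.
async function withInProcessLock<T>(
  locks: LockMap,
  key: string,
  fn: () => Promise<T>
): Promise<T> {
  const previous = locks.get(key) ?? Promise.resolve();
  const run = previous.then(fn);
  // Store a tail that never rejects so one failure cannot poison later holders.
  const tail = run.catch(() => undefined);
  locks.set(key, tail);
  try {
    return await run;
  } finally {
    // Drop the entry only if nobody queued behind us.
    if (locks.get(key) === tail) locks.delete(key);
  }
}

// Hypothetical factory: each call owns its own lock map (no module-level Map).
function createStorageSketch() {
  const hookLocks: LockMap = new Map();
  return {
    withHookLock: <T>(key: string, fn: () => Promise<T>) =>
      withInProcessLock(hookLocks, key, fn),
  };
}

async function demoLocks() {
  const makeTask = (log: string[], name: string) => async () => {
    log.push(`${name}:start`);
    await Promise.resolve(); // yield, so an unserialized peer could interleave
    log.push(`${name}:end`);
  };

  // Same instance: the two calls serialize on the shared map.
  const sameInstance: string[] = [];
  const storage = createStorageSketch();
  await Promise.all([
    storage.withHookLock('run1/hook1', makeTask(sameInstance, 'a')),
    storage.withHookLock('run1/hook1', makeTask(sameInstance, 'b')),
  ]);

  // Two instances: independent maps, so nothing in-process serializes them.
  const twoInstances: string[] = [];
  const workerA = createStorageSketch();
  const workerB = createStorageSketch();
  await Promise.all([
    workerA.withHookLock('run1/hook1', makeTask(twoInstances, 'a')),
    workerB.withHookLock('run1/hook1', makeTask(twoInstances, 'b')),
  ]);

  return { sameInstance, twoInstances };
}
```

In this sketch the two-instance case interleaves, which is what lets an in-process test stand in for separate OS processes: only on-disk primitives can arbitrate between the two workers.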
* fix(world-local): repair event-first hook orphans from the persisted event; skip #1665 e2e on world-postgres - A crash between the hook_created event publish and the deferred hook entity write left the event committed with the entity missing and unrepairable (retries threw EntityConflictError without materializing the entity). Retries now rebuild the entity from the PERSISTED event's payload — never the retry's eventData — via a race-safe writeExclusive, on both the canonical-eventId collision path and the legacy-claim probe path. - Skip parallelStepsThenWebhookWorkflow e2e on world-postgres: the same-tick replay pattern surfaces a separate pre-existing step_started ordering bug there (#2331). --------- Co-authored-by: Peter Wielander <peter.wielander@vercel.com>Nathan Rajlich · f2a7bdeb · 2026-06-11
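The cross-process arbitration this entry relies on (publishing the event at a canonical path through an exclusive create) can be sketched as follows. This is an illustration under stated assumptions, not the repository's `writeExclusive`: the helper name `writeExclusiveSketch`, the file names, and the boolean return value are hypothetical, and the real loser throws `EntityConflictError` rather than returning `false`. The temp-file plus hard-link approach follows the description in the commit message.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical helper: create `target` atomically, or report that another
// writer already created it. A hard link fails with EEXIST if the target exists.
function writeExclusiveSketch(target: string, contents: string): boolean {
  const tmp = `${target}.${process.pid}.${Math.random().toString(36).slice(2)}.tmp`;
  fs.writeFileSync(tmp, contents);
  try {
    fs.linkSync(tmp, target);
    return true; // this writer published the event
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === 'EEXIST') return false;
    throw err;
  } finally {
    fs.unlinkSync(tmp); // the temp name is no longer needed either way
  }
}

// Two "workers" adopt the same canonical eventId, so they target the same path.
function demoConvergence(): { wins: boolean[]; stored: string } {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hook-events-'));
  const eventPath = path.join(dir, 'evnt_canonical.json');
  const wins = [
    writeExclusiveSketch(eventPath, JSON.stringify({ worker: 'a' })),
    writeExclusiveSketch(eventPath, JSON.stringify({ worker: 'b' })),
  ];
  const stored = fs.readFileSync(eventPath, 'utf8');
  fs.rmSync(dir, { recursive: true, force: true });
  return { wins, stored };
}
```

Because both writers agree on the path before writing, the filesystem decides the winner and the loser's payload never lands, which matches the "exactly one `hook_created` event per logical creation" outcome described above.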
- 2.6ETVfix(swc-plugin): closure variable detection for `new` expressions and module-level declarations (#1368) * fix(swc-plugin): closure variable detection for `new` expressions and module-level declarations Fix two SWC compiler plugin bugs related to closure variable detection: 1. Add Expr::New handling to ClosureVariableCollector so `new Class(...args)` properly captures both the callee and arguments as closure variables. 2. Exclude module-level declarations (functions, variables, classes) from closure variable detection, preventing over-capturing of identifiers that are already available in all bundles. This also allows DCE to properly remove step-only helpers and their imports from the workflow bundle. Fixes #1365 * fix(swc-plugin): handle additional expression/statement types in closure variable collector Expand closure variable detection to cover more AST patterns: Expressions: Seq (comma), Yield, OptChain, Prop::Shorthand, computed property keys, Prop::Assign defaults, Class (skip bodies) Statements: Throw, Try/Catch/Finally, Switch, ForIn, ForOf, DoWhile, Labeled Also fix existing Prop::Shorthand bug where object shorthand properties like { url } were not being collected as closure vars. Extend test fixture with cases for all newly handled patterns. Fix spec.md wording per review feedback. * fix(swc-plugin): preserve original step function bodies in enclosing functions In step mode, nested step functions were replaced with bare references to the hoisted copy (e.g., `return hoisted$fn;`). This broke direct calls because the hoisted copy uses `__private_getClosureVars()` which only works in workflow context. Now the original function body is preserved inline with just the directive stripped, so JavaScript's normal closure semantics work for direct calls. The hoisted copy with `__private_getClosureVars()` is still registered for workflow-driven execution. 
Fixes #1369 * fix(swc-plugin): restore metadata tracking for object property steps in step mode The previous commit accidentally removed the object_property_workflow_conversions tracking from the step mode path, causing __internal_workflows metadata to be stripped from step bundle output for object property step functions. * fix(swc-plugin): detect closure variables inside nested function/method bodies The closure variable collector was skipping nested function expressions, arrow functions, and method bodies entirely. This meant closure variables used deep inside inner functions (e.g., a variable used inside a ReadableStream's start() method) were not captured. Now the collector walks into nested function/arrow/method/getter/setter bodies while adding their parameters to the local var set, so only truly free variables from the outer step scope are captured. Also add ReadableStream, WritableStream, TransformStream, and other common Web API globals to the known globals list. * update changeset to include Bug 4 * fix(swc-plugin): handle TypeScript expression wrappers and class bodies in closure detection After comparing with Next.js's SWC plugin closure detection approach, identified and fixed remaining gaps: - TypeScript expression wrappers (as, satisfies, !, type assertions, const assertions, instantiation expressions) now traverse to the inner expression instead of being silently skipped - Class expressions and declarations now walk their body members (methods, properties, constructors, static blocks) to detect closure variables used inside them - Document all remaining safe-to-skip Expr variants (This, Lit, SuperProp, MetaProp, PrivateName, Invalid, JSX) * test: add fixture cases for TypeScript wrappers and class body closure detection * test: add TypeScript fixture for closure detection through TS expression wrappers Add a proper input.ts fixture that tests closure variable detection through real TypeScript syntax: `as`, `satisfies`, `!` (non-null), angle-bracket 
type assertions, `as const`, and generic function calls. Update test harness to support input.ts files by adding swc_ecma_parser dev-dependency and auto-detecting TypeScript syntax from file extension. Remove the incorrectly placed TypeScript-related test cases from the JS fixture (they were using plain JS syntax, not actual TS wrappers).Nathan Rajlich · 5d95abf9 · 2026-03-16
- 2.5ETVAdd workflow CFG visualization to observability UI (#456) --------- Signed-off-by: Karthik Kalyanaraman <karthik@scale3labs.com> Co-authored-by: Peter Wielander <mittgfu@gmail.com>Karthik Kalyanaraman · 4aecb999 · 2025-12-05
- 2.4ETVInline all SWC plugin step registrations, remove workflow/internal/private (#1632) The SWC compiler plugin no longer generates import statements. All step function registrations and closure variable access are now self-contained inline IIFEs with zero module dependenciesNathan Rajlich · 0a86de3a · 2026-04-08
- 2.1ETVChange compiler ID generation logic to use Node.js import specifier (#899) ## Summary This PR changes how the SWC compiler generates IDs for workflows, steps, and classes. Instead of using raw file paths, IDs are now based on **Node.js module specifiers** when the file belongs to a package (either in `node_modules` or a workspace package). ## Motivation Previously, IDs were generated using file paths like `step//src/jobs/order.ts//fetchData`. This caused several issues: 1. **Package exports conditions**: When a package uses conditional exports (e.g., `"workflow"` vs `"default"` conditions in `package.json`), the same import specifier can resolve to different files. Using file paths meant IDs could differ based on which export condition was used. 2. **Cross-bundle consistency**: Classes serialized in one bundle couldn't be deserialized in another if the file paths differed. 3. **Version tracking**: No way to include package versions in IDs for cache invalidation. ## Changes ### New ID Format IDs now use the format `{type}//{modulePath}//{identifier}` where `modulePath` is either: - A **module specifier** like `point@0.0.1` or `@myorg/shared@1.2.3` for package files - A **relative path** prefixed with `./` like `./src/jobs/order` for local app files Examples: - `step//workflow@4.0.1-beta.50//fetch` (SDK step) - `step//./workflows/order//processOrder` (local step) - `class//point@0.0.1//Point` (package class) - `class//./src/models/User//User` (local class) ### New Module Specifier Resolution Added `packages/builders/src/module-specifier.ts` which: - Detects if a file is in `node_modules` or a workspace package - Finds the nearest `package.json` and extracts name/version - Returns the module specifier for the SWC plugin to use ### SWC Plugin Changes - Added `moduleSpecifier` option to plugin config - Updated `naming.rs` to support both module specifiers and relative paths - Added `get_module_path()` helper that uses specifier when available, falls back to 
`./filename` format ### Special Cases - **Builtin functions** (`__builtin_*`): Continue to use just the function name as the ID for stable, version-independent lookup from the workflow VM runtime. ## Testing - Updated all 125+ SWC plugin test fixtures to use new ID format - Added tests for module specifier resolution - Added tests for Windows path normalization in naming ## Breaking Changes This is technically a breaking change for any persisted workflow runs that reference the old ID format. However, since IDs are internal implementation details and not user-facing, this should not affect end users. ## Files Changed - `packages/builders/src/module-specifier.ts` - **NEW**: Module specifier resolution logic - `packages/builders/src/apply-swc-transform.ts` - Pass module specifier to SWC plugin - `packages/builders/src/base-builder.ts` - Use `getImportPath` for virtual entry imports - `packages/swc-plugin-workflow/transform/src/lib.rs` - Accept and use module specifier - `packages/swc-plugin-workflow/transform/src/naming.rs` - New ID formatting with module paths - `packages/swc-plugin-workflow/spec.md` - Updated documentation - `packages/core/e2e/e2e.test.ts` - Updated test assertions for new ID formatNathan Rajlich · 73bf7be9 · 2026-02-04
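The `{type}//{modulePath}//{identifier}` format described in this entry can be illustrated with a small sketch. The function names, the `PackageInfo` shape, and the extension-stripping rule are assumptions made for illustration; the actual logic lives in `naming.rs` and `module-specifier.ts` and may differ in detail.

```typescript
// Hypothetical shape for the nearest package.json data.
interface PackageInfo {
  name: string;
  version: string;
}

// Assumed rule: package files use `name@version`; local files use a
// POSIX-style relative path prefixed with `./` and without the extension.
function modulePathSketch(relativeFile: string, pkg?: PackageInfo): string {
  if (pkg) return `${pkg.name}@${pkg.version}`;
  const posix = relativeFile.replace(/\\/g, '/').replace(/^\.\//, '');
  const withoutExt = posix.replace(/\.[cm]?[jt]sx?$/, '');
  return `./${withoutExt}`;
}

function formatIdSketch(
  kind: 'step' | 'class',
  identifier: string,
  relativeFile: string,
  pkg?: PackageInfo
): string {
  // Builtins keep a bare, version-independent ID.
  if (identifier.startsWith('__builtin_')) return identifier;
  return `${kind}//${modulePathSketch(relativeFile, pkg)}//${identifier}`;
}

const localStepId = formatIdSketch('step', 'processOrder', 'workflows/order.ts');
const packageClassId = formatIdSketch('class', 'Point', 'dist/index.js', {
  name: 'point',
  version: '0.0.1',
});
const windowsClassId = formatIdSketch('class', 'User', 'src\\models\\User.ts');
```

Under these assumptions the three constants reproduce the examples listed in the PR description: `step//./workflows/order//processOrder`, `class//point@0.0.1//Point`, and `class//./src/models/User//User`.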
- 2.0ETVSerialize `run_failed`/`step_failed` errors through serialization pipeline (#1851) * Serialize run_failed/step_failed errors through serialization pipeline Switch run_failed, step_failed, and step_retrying events to persist the full thrown value via the workflow serialization pipeline (as SerializedData / Uint8Array) instead of a lossy { message, stack, code } StructuredError shape. Consumers hydrate via hydrateRunError / hydrateStepError to reconstruct the original thrown value, preserving Error subclass identity, cause chains, and custom properties. - WorkflowRun.error and Step.error are now SerializedData - WorkflowRun gains a top-level errorCode plaintext field - WorkflowRunFailedError.cause is now the hydrated thrown value - Adds world-postgres migration 0010_add_error_code.sql - Legacy pre-pipeline errorJson records surface as undefined on read * Update Next.js workbenches for new WorkflowRunFailedError.cause type cause is now `unknown` (the hydrated thrown value) rather than `Error & { code }`. Defensively extract Error-shaped fields when the hydrated value is an Error, otherwise round-trip the raw value, and expose the new `errorCode` classification field. * Update docs for WorkflowRunFailedError.cause: unknown The hydrated `cause` is now `unknown` (the original thrown value through the serialization pipeline) and the error classification has moved to the top-level `errorCode` property. Update the two affected docs pages and the `TSDoc` interface to reflect the new shape, and narrow `cause` with `instanceof Error` before accessing fields. * Expand test coverage for the run/step error serialization pipeline Unit tests: - 19 new dehydrate/hydrate{Step,Run}Error round-trip tests covering FatalError, plain Error, built-in Error subclasses, non-Error thrown values (string, plain object), cause chains, encryption round-trip, the binary format prefix contract, and the unserializable / unknown- format error paths. 
- 5 new tests for Run.returnValue when the run has failed: hydrated FatalError + cause as cause, plain Error preservation, non-Error thrown values surfaced verbatim, cross-class cause chains, and the hydration-failure fallback that still surfaces errorCode. E2E tests (new, in 99_e2e.ts + e2e.test.ts): - Step throw → workflow catch round-trips a FatalError with a TypeError cause chain, asserting class identity, fatal marker, and cause name + message all survive the step_failed event pipeline. - Workflow throw → run_failed reaches status with the new top-level errorCode metadata exposed (cause-shape coverage lives at the unit level, since the SWC plugin's class registration is not invoked in the plain-Node e2e runner). - Workflow throw of a non-Error value round-trips that value verbatim as WorkflowRunFailedError.cause. Adjustments to existing assertions: - error.cause is now `unknown`; tests narrow with `instanceof Error` and use the new top-level `errorCode` field instead of the `code` property on the cause. - step.error / run.error from CLI --withData are now hydrated payloads: unregistered class instances surface as Instance refs that carry the original message + stack. Observability hydration: - hydrateStepIO / hydrateWorkflowIO in serialization-format.ts now hydrate the `error` field via hydrateData, so the CLI and web UI continue to surface readable run/step error messages and stacks. * Tighten error serialization changeset description * Trim error serialization changeset to a single sentence * Resolve FatalError/RetryableError revivers via cross-realm registry When a workflow runs in a Node `vm` context, its bundled `@workflow/errors` is a different module instance than the host's import (separate prototype chains, separate class identity). Calling `new FatalError(...)` from the host-side reviver produces a host-realm instance that fails `err instanceof FatalError` checks in the workflow code, even when the serialized payload was correctly tagged via the dedicated `FatalError` reducer. 
Surfaced by the local-prod e2e "step throw round-trips FatalError" test on Next.js Turbopack: each route gets its own bundled chunk, so the flow handler's `@workflow/errors` and the workflow VM bundle's `@workflow/errors` are two distinct copies of the same module. Fix: - Each bundled copy of `@workflow/errors` self-registers its `FatalError` and `RetryableError` classes on `globalThis` via `Symbol.for("@workflow/errors//FatalError")` / `Symbol.for("@workflow/errors//RetryableError")`. First load wins per realm; the descriptor is non-writable / non-configurable to make accidental clobbering loud. - The revivers in `@workflow/core`'s common reducers module read the consumer's `globalThis` (passed in as `global`) to pick up the realm-local class, falling back to the host-imported class when no registration is present (e.g. in the CLI / test runner). * Use `types.isNativeError` to remap workflow stacks across VM realms The runtime's run-failure path computes a source-map-remapped stack and then assigns it back onto the thrown value via `if (err instanceof Error) err.stack = errorStack`. Workflows run inside a Node `vm` context, so a workflow-thrown error is an instance of the VM realm's `Error` — `instanceof` against the host realm's `Error` returns `false`, the assignment is skipped, and the serialized `run_failed` event carries the un-remapped (bundled-line- number) stack instead of the source-mapped one. Switch the gate to `types.isNativeError`, which uses V8's internal type tag and works across realms — same approach already in place for the serialization reducers. Caught by the local-prod e2e "nested function calls preserve message and stack trace" and "cross-file imports preserve message and stack trace" tests, which assert that the persisted run-error stack contains `99_e2e.ts` / `helpers.ts`. * Sync CLI revivers with core + add toJSON shim for Error subclasses Two issues with the CLI's hand-rolled reviver list: 1. 
It hadn't been updated for the new first-class Error subclass reducers (`TypeError`, `RangeError`, `FatalError`, `RetryableError`, etc.). devalue throws "Unknown type X" when it encounters a reduced value with no matching reviver, and `hydrateResourceIO` swallows that error and surfaces the raw `Uint8Array` payload — so `step.error` / `run.error` showed up as raw byte dumps in `workflow inspect` output. 2. Even with all the right revivers, `Error.prototype`'s `message` / `stack` / `cause` are non-enumerable, so `JSON.stringify` (used by `workflow inspect --json`) drops them — leaving the subclass-specific enumerable fields (e.g. `FatalError.fatal`) visible but the actual error data missing. Fix: - Build the CLI reviver set on top of `getCommonRevivers()` from `@workflow/core` so the CLI stays in sync with the runtime's reducer set automatically. New core reducers/revivers will Just Work without any CLI-side change. - Wrap each Error reviver from the common set with a thin shim that attaches a non-enumerable `toJSON` method to the produced `Error` instance. `JSON.stringify` calls `toJSON` and gets a full object (`name` + `message` + `stack` + `cause` + any enumerable subclass fields like `fatal` / `retryAfter` / `errors`); `util.inspect` ignores `toJSON` and renders the canonical `Error: msg\\n at ...` format. Best of both worlds for CLI output without compromising the runtime hydration path. Caught by the local-prod e2e "basic step error preserves" and "cross-file step error preserves" tests, which read `failedStep.error.message` / `.stack` from the CLI's JSON output. * Clarify parseErrorJson JSDoc to match its always-null return The previous JSDoc described preserving legacy values "for best-effort hydration" which contradicted the implementation, where legacy errors are intentionally surfaced as absent (the pre-pipeline shapes can't be hydrated by the new error revivers). Rewrite the comment so the contract matches behavior. 
Also rename the now-unused parameter to `_errorJson` to reflect that the function ignores it. Caught by a code review on #1851. * Refine error-handler ergonomics on the step / run hot paths Three review-driven adjustments that all touch the queue handlers and their interaction with the error serialization pipeline: 1. Memoize the per-run encryption key fetch. The step handler used to eagerly fetch + import the key at the top of every step delivery so the value would be in scope for every potential dehydrateStepError path. That pessimized step-started early-return cases (the fetch happens unconditionally even when the step never reaches user code) and required duplicating the same boilerplate at four call sites in runtime.ts. Introduce `memoizeEncryptionKey(world, run)` in runtime/helpers.ts that returns a lazy, single-fetch accessor; step-handler / runtime call sites use `await getEncryptionKey()` instead. The first caller pays the fetch cost, subsequent callers await the cached promise, and steps that fail before any encryption-aware work happens skip the fetch entirely. 2. Preserve the prior attempt's serialized error as the cause on the defensive max-retries-exceeded `step_failed` re-invocation guard. The existing comment explicitly opted out of cause attachment, but the symmetric post-failure path below already does this and the reviewer is right that consumers shouldn't have to walk the step_retrying event history to recover the underlying error. Best- effort: if hydration of the prior `step.error` throws, fall back to a FatalError without cause rather than letting the event write itself fail. 3. Document the intentional `unflatten` throw in `hydrateStepError` / `hydrateRunError` for non-Uint8Array input. SDK version is pinned per workflow run via skew protection so the non-binary branch is dead in production; if a misshapen value reaches it, surfacing the throw via the surrounding o11y try/catch is more debuggable than masking it. 
Add a comment so future reviewers don't reach for a defensive fallback. A standalone `falls back to plaintext` suggestion on the run_failed key fetch was rejected: when encryption is configured we should fail loudly rather than silently emit plaintext error data. The queue's redelivery semantics will retry the key fetch; persistent KMS outages get logged with the existing "persistent error preventing the run from being terminated" message rather than a security regression. * Hydrate `event.eventData.error` in event listings `hydrateEventData` enumerated the per-event fields that need hydration (`result`, `input`, `output`, `metadata`, `payload`) but omitted the new `error` field on `step_failed`, `step_retrying`, and `run_failed` events. Without this branch, o11y tools that list events (e.g. `workflow inspect events`) surface the raw `Uint8Array` payload instead of a hydrated `{ name, message, stack, … }` object even though the entity-level `Run.error` / `Step.error` paths already hydrate. Mirrors the existing per-field branches; the `try/catch` leaves the field un-hydrated on parse failure rather than failing the whole event view. Adds a unit test. * Use `.is()` static checks in `classifyRunError` for cross-realm safety Workflows execute inside a separate `vm` realm: the `WorkflowRuntimeError` class bundled into the workflow code and the host-imported one are distinct constructors, so an `err instanceof WorkflowRuntimeError` check on a VM-thrown error returns `false` and we'd misclassify genuine runtime errors (corrupted event log, missing timestamps, workflow/step not registered) as user errors. Switch to each subclass's `.is()` static (a name-based duck check that works across realms). 
Since `WorkflowRuntimeError.is` only matches its own concrete name, enumerate every concrete subclass we want to recognize (`StepNotRegisteredError`, `WorkflowNotRegisteredError`) in a `RUNTIME_ERROR_CHECKS` table; keep that table in sync with the class hierarchy in `@workflow/errors`. Existing `classify-error.test.ts` already covers `WorkflowRuntimeError` and `WorkflowNotRegisteredError` cases — both still pass. * Add e2e coverage for step throws of non-Error values We had `errorWorkflowThrowNonErrorValue` (workflow body throws a plain object — round-trips verbatim as `WorkflowRunFailedError.cause`) but no symmetric coverage for the step-throw side. Step-throw goes through a different code path: non-Error values aren't recognized as `FatalError` (no `name === 'FatalError'`) nor `RetryableError`, so they take the transient retry path. After max retries the runtime wraps the original thrown value as `cause` on a fresh `FatalError` which the workflow's catch block then sees. Add a workflow that throws a recognizable plain object from a step with `maxRetries = 0` (so we exhaust on first attempt and avoid a long test wait) and a workflow that asserts the wrapped FatalError shape: `isFatal`, `instanceof FatalError`, message includes the original object's serialized form, `cause` is the original non-Error object verbatim with structure preserved. Documents the current retry-then-wrap behavior so any future change to "non-Error throws skip retries" semantics has to update the test. * Note legacy postgres error-data loss in the run/step error changeset Pre-upgrade failed runs that wrote into world-postgres's deprecated `error` text column can't be hydrated through the new pipeline (the shape is incompatible with the new revivers). The new runtime intentionally surfaces them as `error: undefined` on read; the original payload is still readable directly from the `errorJson` column for manual inspection. 
Add a one-sentence note to the changeset's migration text so consumers upgrading don't get blindsided by suddenly-empty error fields on historical runs.Nathan Rajlich · 5f228326 · 2026-05-04
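Several of the fixes in this commit share one root cause: a value created inside a Node `vm` context is not an `instanceof` the host realm's constructors. The sketch below reproduces that with Node built-ins only and contrasts it with `types.isNativeError` and a name-based check. `hasErrorName` is a hypothetical helper standing in for the `.is()` statics the commit describes; it is an assumption for illustration, not the `@workflow/errors` implementation.

```typescript
import { runInNewContext } from 'node:vm';
import { types } from 'node:util';

// An error constructed inside a separate vm context, i.e. a different realm.
const vmError: unknown = runInNewContext('new TypeError("boom")');

// The host realm's Error is a different constructor, so instanceof fails.
const viaInstanceof = vmError instanceof Error; // false

// isNativeError inspects the engine's internal type tag and works across realms.
const viaNativeCheck = types.isNativeError(vmError); // true

// Hypothetical name-based duck check, in the spirit of the `.is()` statics.
function hasErrorName(value: unknown, name: string): boolean {
  return (
    typeof value === 'object' &&
    value !== null &&
    (value as { name?: unknown }).name === name
  );
}

const viaName = hasErrorName(vmError, 'TypeError'); // true

console.log({ viaInstanceof, viaNativeCheck, viaName });
```

A name-based check trades precision for portability: it matches any object whose `name` is the expected string, which is why the commit keeps an explicit table of the concrete subclasses it wants to recognize.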
- 2.0ETVAdd support for custom class instance serialization (#762)

  Added support for custom class instance serialization across workflow/step boundaries.

  ### What changed?

  - Introduced a new `@workflow/serde` package with `WORKFLOW_SERIALIZE` and `WORKFLOW_DESERIALIZE` symbols
  - Enhanced the serialization system to handle custom class instances using these symbols
  - Updated the SWC plugin to detect classes with serialization methods and register them
  - Added class registry mechanism that works in both step and workflow contexts
  - Implemented comprehensive tests for various serialization scenarios

  ### How to test?

  The PR includes a new e2e test `customSerializationWorkflow` that demonstrates the feature:

  ```typescript
  import { WORKFLOW_SERIALIZE, WORKFLOW_DESERIALIZE } from '@workflow/serde';

  // Define a class with custom serialization
  class Point {
    constructor(public x: number, public y: number) {}

    static [WORKFLOW_SERIALIZE](instance: Point) {
      return { x: instance.x, y: instance.y };
    }

    static [WORKFLOW_DESERIALIZE](data: { x: number; y: number }) {
      return new Point(data.x, data.y);
    }
  }

  // Use in workflow and steps
  export async function customSerializationWorkflow(x: number, y: number) {
    'use workflow';
    const point = new Point(x, y);
    const scaled = await transformPoint(point, 2);
    // ...
  }
  ```

  Run the e2e test to verify that class instances are properly serialized and deserialized.

  ### Why make this change?

  Previously, user-defined class instances couldn't be passed between workflows and steps without losing their prototype chain and methods. This change allows developers to define custom serialization/deserialization logic for their classes, enabling proper reconstruction of instances with their full functionality intact when crossing workflow/step boundaries.

  Nathan Rajlich · 1843704b · 2026-01-19
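The commit mentions a class registry but does not show it. As a rough sketch of how such a registry could round-trip an instance, assuming local stand-in symbols instead of the real `@workflow/serde` exports and a hypothetical JSON envelope (`classId` plus `data`), neither of which is confirmed by the source:

```typescript
// Stand-in symbols; the real ones are exported by `@workflow/serde`.
const SERIALIZE = Symbol.for('example.serialize');
const DESERIALIZE = Symbol.for('example.deserialize');

// Any class exposing the two static hooks.
interface SerdeClass {
  [SERIALIZE](instance: any): unknown;
  [DESERIALIZE](data: any): unknown;
}

class Point {
  x: number;
  y: number;
  constructor(x: number, y: number) {
    this.x = x;
    this.y = y;
  }
  scale(factor: number): Point {
    return new Point(this.x * factor, this.y * factor);
  }
  static [SERIALIZE](instance: Point) {
    return { x: instance.x, y: instance.y };
  }
  static [DESERIALIZE](data: { x: number; y: number }) {
    return new Point(data.x, data.y);
  }
}

// Hypothetical registry keyed by a class id chosen at registration time.
const registry = new Map<string, SerdeClass>();

function register(classId: string, cls: SerdeClass): void {
  registry.set(classId, cls);
}

function dehydrate(classId: string, instance: unknown): string {
  const cls = registry.get(classId);
  if (!cls) throw new Error(`unregistered class: ${classId}`);
  return JSON.stringify({ classId, data: cls[SERIALIZE](instance) });
}

function hydrate(payload: string): unknown {
  const { classId, data } = JSON.parse(payload) as { classId: string; data: unknown };
  const cls = registry.get(classId);
  if (!cls) throw new Error(`unregistered class: ${classId}`);
  return cls[DESERIALIZE](data);
}

register('Point', Point);
const restored = hydrate(dehydrate('Point', new Point(1, 2)));
```

Because the payload carries only plain data plus an id, the receiving side needs the same class registered under the same id, which is the part the SWC plugin is described as automating.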
- 2.0ETVChange user input/output to be binary data at the World interface (#853)Nathan Rajlich · 1060f9d0 · 2026-01-28
- 2.0ETVdocs: move World SDK and getWorld under workflow/runtime, split out workflow/observability (#2375)Pranay Prakash · 055b6664 · 2026-06-12
- 1.9ETVRefactor: Extract serialization into modular architecture and wire into existing pipeline (#1299) * Add serialization module foundation: types, codec interface, format prefix Start of the serialization refactor (separate from snapshot-runtime). New files: - serialization/types.ts — SerializationFormat enum, SerializableSpecial interface, Reducers/Revivers types - serialization/codec.ts — Codec interface with formatPrefix, serialize, deserialize, and optional deserializeLegacy - serialization/format.ts — Format prefix encode/decode/peek, moved from the monolithic serialization.ts The Codec interface enables future alternative formats (CBOR, JSON) while keeping the devalue implementation as the current default. * Add reducers, devalue codec, encryption, and mode-specific modules Serialization refactor Phase 1: create the new module structure alongside the existing monolithic serialization.ts (which continues to work). New files: - serialization/reducers/common.ts — Date, Error, Map, Set, URL, BigInt, typed arrays, Headers, Request, Response, RegExp, URLSearchParams - serialization/reducers/class.ts — Class/Instance with WORKFLOW_SERIALIZE/ DESERIALIZE support - serialization/reducers/step-function.ts — StepFunction with closure vars - serialization/codec-devalue.ts — devalue Codec implementation - serialization/encryption.ts — composable encrypt/decrypt layer - serialization/workflow.ts — synchronous, no encryption, for VM use - serialization/step.ts — async with encryption, for step handler - serialization/client.ts — async with encryption, for start() API - serialization/index.ts — re-exports all public API - serialization/serialization.test.ts — 25 focused tests All modes compose their reducer/reviver sets from the shared building blocks. Cross-mode compatibility verified: data serialized in any mode can be deserialized in any other mode (for common types). Existing 108 serialization tests continue to pass unchanged. 
* Add sub-path exports for workflow serialization module - Add ./serialization/workflow export to @workflow/core package.json - Add ./internal/serialization re-export to workflow meta-package - The workflow bundle can now import serialize/deserialize via: import { serialize, deserialize } from 'workflow/internal/serialization' Full test suite passes: 493 tests across 22 files (including 25 new serialization module tests). * Address code review feedback 1. Fix reducer composition order: Class/Instance reducers now come BEFORE common reducers in all three modes (workflow, step, client). This ensures custom Error subclasses with WORKFLOW_SERIALIZE are handled by the Instance reducer before the generic Error reducer (devalue uses first-match-wins semantics). 2. Fix encryption decrypt() to fail fast when encrypted data is encountered without a decryption key, instead of silently returning encrypted bytes that would fail later with an unhelpful format error. 3. Remove Request/Response from common reducers — they don't have matching common revivers, so including them caused asymmetric behavior (serialize as Request, deserialize as plain object). Request/Response handling belongs in mode-specific modules that can provide proper revivers. 4. Document Node.js dependency in the workflow serialization re-export. The current implementation uses node:util and Buffer. For the QuickJS VM (snapshot runtime), these will need polyfills — tracked separately. * Move reducer/reviver composition into the devalue codec The Codec interface now takes a SerializationMode ('workflow', 'step', 'client') instead of raw reducers/revivers. The reducer/reviver composition is internal to the devalue codec implementation. This is the right abstraction because reducers/revivers are devalue- specific concepts. A future CBOR codec would handle Date, typed arrays, Map, Set natively via the CBOR type system — it wouldn't use reducers at all. A JSON codec would only support standard JSON types. 
The mode-specific modules (workflow.ts, step.ts, client.ts) are now simpler — they just pass the mode string to the codec. * Replace SerializationFormatType enum with open-ended FormatPrefix type The format prefix is now a branded string type validated by isFormatPrefix() — any 4-character [a-z0-9] string is valid. This removes the hard-coded enum of known formats, making the system truly open for extension: type FormatPrefix = string & { __brand: 'FormatPrefix' }; function isFormatPrefix(value: string): value is FormatPrefix; The SerializationFormat object still provides well-known constants ('devl', 'encr') but they're now just typed constants, not an exhaustive enum. peekFormatPrefix() and decodeFormatPrefix() use isFormatPrefix() for validation instead of checking against a known list. Unknown but valid prefixes (e.g. 'cbor', 'json', 'v2b1') are accepted — the caller decides whether they can handle the format. 6 new isFormatPrefix tests covering: valid strings, too short, too long, uppercase, special characters. 1 new test for unknown-but-valid prefixes. * Wire modular serialization modules into serialization.ts, add 138 unit tests Replace duplicate format prefix, reducer/reviver, and encryption helper code in the monolithic serialization.ts with imports from the modular serialization/ directory. This completes the refactoring started in the earlier additive-only commits. 
Key changes: - serialization.ts now imports types, format prefix, common/class/step-function reducers and revivers, and encryption helpers from ./serialization/ modules - Removed ~450 lines of duplicate code from serialization.ts - Made encryption error messages consistent between old and new modules - Added 138 comprehensive unit tests covering types, format prefix, encryption, codec, all three reducer modules, all three mode modules, cross-mode compatibility, and edge cases - Updated one existing test assertion for new error message wording * Address code review feedback - encryption.ts: throw WorkflowRuntimeError instead of plain Error in decrypt() to preserve the error contract from legacy maybeDecrypt() - format.ts: document that open-ended prefix validation ([a-z0-9]{4}) is intentional for forward compatibility — callers check support - errors.ts: extract duplicated formatSerializationError into shared utility, remove 4 copies from workflow.ts, step.ts, client.ts - codec-devalue.ts: document that globalThis default is a known limitation; legacy dehydrate/hydrate path still supports custom global * Fix codec-devalue.ts comment: clarify modular modules are not used in current runtime The globalThis default is not a limitation for the current runtime — all serialization goes through dehydrate*/hydrate* in serialization.ts which passes the correct global. The modular modules are infrastructure for the future snapshot runtime where serialization runs inside the VM. * Wire dehydrate/hydrate functions through modular serialize/deserialize The dehydrate*/hydrate* functions in serialization.ts now delegate to the modular mode modules (workflowModule, stepModule, clientModule) instead of directly calling devalue stringify/parse/unflatten. 
Key changes: - Extended Codec interface with CodecOptions (global, extraReducers, extraRevivers) so the codec can receive VM globals and mode-specific stream/Request/Response handlers - devalueCodec threads global through to all reducer/reviver factories so instanceof checks work across VM boundaries - Mode modules (workflow.ts, step.ts, client.ts) accept CodecOptions and pass them through to the codec - dehydrate*/hydrate* functions now call module serialize/deserialize with stream and Request/Response reducers/revivers passed as extras - v1Compat path remains inline (pre-codec, uses stringify + revive) - Error context strings preserved via try/catch re-wrapping * Bump changeset from patch to minor for serialization refactor Return types of public get*Reducers/get*Revivers functions narrowed from Reducers/Revivers to Partial<Reducers>/Partial<Revivers>, which is a TypeScript-level breaking change. Also adds new sub-path exports (@workflow/core/serialization/workflow, workflow/internal/serialization) which is additive. Minor bump is the appropriate semver for both. * Remove unused workflow/internal/serialization re-export and @workflow/core/serialization/workflow sub-path Both exports had zero consumers in the repo. The workflow/internal/serialization export was previously removed on main in #1082 for the same reason. The modular workflow.serialize/deserialize is still reachable via @workflow/core/serialization when needed. These exports can be reintroduced by the snapshot runtime branch if/when it actually needs them. Also updates the changeset to drop the 'new sub-path exports' bullet. * Downgrade changeset from minor to patch After auditing actual consumers of the narrowed return types (getExternalReducers/getWorkflowReducers/getExternalRevivers/getWorkflowRevivers now return Partial<Reducers>/Partial<Revivers>), no in-repo or external consumer indexes specific keys on the returned object in a way that would break. 
The only internal caller that did (runtime/run.ts) was updated in this same PR. The narrowing is type-safer but effectively invisible at runtime and for idiomatic callers that spread or forward the object. Since the refactor is internally restructuring only, patch is the appropriate semver bump. * Trim serialization-refactor changeset * Dedup formatSerializationError: import from serialization/errors.ts The legacy serialization.ts had its own inlined copy of formatSerializationError. Now that the helper is exported from serialization/errors.ts (already consumed by workflow.ts/step.ts/client.ts), import it here too to keep the single source of truth.Nathan Rajlich · 9f3516ec · 2026-05-01
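The open-ended format prefix described in this refactor can be sketched as follows. The `FormatPrefix` type and the `isFormatPrefix` signature are quoted from the commit message; the regex body, the byte layout (four ASCII bytes ahead of the payload), and the helper names `encodeWithPrefix` and `peekPrefix` are assumptions made for illustration and may differ from `serialization/format.ts`.

```typescript
// Branded string type, as quoted in the commit message.
type FormatPrefix = string & { __brand: 'FormatPrefix' };

// Any 4-character [a-z0-9] string is a valid prefix, known or not.
function isFormatPrefix(value: string): value is FormatPrefix {
  return /^[a-z0-9]{4}$/.test(value);
}

// ASSUMPTION: the prefix is stored as 4 ASCII bytes in front of the payload.
function encodeWithPrefix(prefix: string, payload: Uint8Array): Uint8Array {
  if (!isFormatPrefix(prefix)) {
    throw new TypeError(`invalid format prefix: ${prefix}`);
  }
  const out = new Uint8Array(4 + payload.length);
  out.set(new TextEncoder().encode(prefix), 0);
  out.set(payload, 4);
  return out;
}

// Returns the prefix if the first 4 bytes form a valid one, else null.
// The caller decides whether it can actually handle that format.
function peekPrefix(bytes: Uint8Array): FormatPrefix | null {
  if (bytes.length < 4) return null;
  const candidate = new TextDecoder().decode(bytes.subarray(0, 4));
  return isFormatPrefix(candidate) ? candidate : null;
}

const framed = encodeWithPrefix('devl', new Uint8Array([1, 2, 3]));
const peeked = peekPrefix(framed); // 'devl'
```

Validating shape rather than membership in a fixed list is what lets an unknown-but-valid prefix such as `cbor` pass through to a caller that may or may not support it.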
- 1.9ETVAttributes MVP (experimental and write-only) and CI hardening (#2134) * fix(core): scan inline sourcemaps during error remapping * Attributes MVP (experimental and write-only) (#2088)Peter Wielander · 1e6b1fde · 2026-05-28
- 1.8ETVtarballs: redesign preview tarballs index page (#1911) * tarballs: redesign preview tarballs index page Rebuild the static index page produced by `tarballs/scripts/pack.ts`: - Featured `workflow` package up top with prominent install command, copy button, and direct tarball download - Top-of-page metadata chips: short SHA (linked to commit), branch, PR number, build timestamp, package count + total size - Collapsible "What is this?" explainer - Package-manager tab toggle (pnpm / npm / yarn / bun) that swaps the install command for every row in place - Live filter input over the rest of the package list (with `/` shortcut) - Per-row install command, copy button, and direct download - Modern dark/light theme with system preference, Geist-inspired styling Also captures tarball size during pack and renders human-readable byte counts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * tarballs: fix client-side interactivity broken by HTML-encoded JSON `escapeHtml(JSON.stringify(catalog))` was HTML-encoding every quote in the embedded catalog JSON to `&quot;`, so `JSON.parse(textContent)` threw on the first character and the IIFE bailed before attaching any event listeners: the package-manager toggle, search filter, copy buttons, and the `/` shortcut were all dead UI on the deployed page. `<script type="application/json">` content is treated as text by the HTML parser; the only sequence that can break out is `</script>` (or `</` in legacy parsers). Replace `<` with the JSON `\u003c` escape, which is legal per the JSON spec and prevents the breakout without needing entity encoding. Also switch `formatBytes` from `KB`/`MB` to `KiB`/`MiB` since the divisor is 1024. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * tarballs: rewrite as Vite + Preact SPA with file breakdown, fix bundling Address TooTallNate's review feedback by replacing the hand-rolled HTML-in-template-literal approach with a small Vite + Preact SPA. 
The old ~600 lines of inlined HTML/CSS/JS in `pack.ts` are now ~80 lines of TSX, fully type-checked. Layout: - `tarballs/index.html`, `vite.config.ts`, `tsconfig.json` at the root - `src/main.tsx` mounts the Preact app and fetches `/catalog.json` - `src/app.tsx` is the page (Header, FeaturedCard, PackageRow, etc.) - `src/catalog.ts` is the shared types + helpers (`buildInstallCommand`, `formatBytes`) - `src/icons.tsx`, `src/styles.css` - `scripts/pack.ts` is now data-only: it scans packages, packs tarballs, and writes `public/catalog.json` This eliminates several smells the reviewer called out: - The interactive script is now TypeScript with strict mode and JSX type checking instead of an inline `<script>` block - The `escapeHtml`-around-JSON-blob hack that broke client-side JS in the prior commit is gone; the SPA fetches `catalog.json` and parses it natively - Pack-time logic and presentation logic no longer share a file # Fix bundling: tarballs now actually contain compiled code While verifying real tarball sizes I noticed `workflow-serde.tgz` was only 828 bytes: it had `package.json`, `LICENSE.md`, `README.md` and *nothing* else, because each package's `files: ["dist"]` excludes sources but `dist/` hadn't been built. The Vercel build was running `pnpm --filter tarballs build`, which only builds the `tarballs` package itself, so its workspace dependencies were never built. Switch `vercel.json#buildCommand` to `pnpm turbo run build --filter=tarballs`, which transitively builds dependencies first via the `dependsOn: ["^build"]` rule already in the root `turbo.json`. With the fix: workflow: 241 KiB → 252 KiB tarball, 916 KiB unpacked, 205 files; @workflow/core: 59 KiB → 493 KiB tarball, 1.70 MiB unpacked, 236 files; @workflow/serde: 828 B → 1.4 KiB tarball, 4.6 KiB unpacked, 7 files. Add a smoke check that the `workflow` package has at least 5 files in its tarball, which catches the regression directly. 
# Per-package contents view (packagephobia-style) `pack.ts` now also runs `tar -tvzf` on each tarball and records the file list with sizes. The SPA renders this as an expandable "What's inside?" disclosure per package, grouped by top-level directory (e.g. `dist/`, `docs/`) with proportional bars showing each group's share of the unpacked size, and the largest files listed below. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * tarballs: replace tar shell-out with in-process tar reader The smoke check broke in CI: `'workflow' tarball only has 0 files`. Root cause is that `tar -tvzf` emits a different verbose layout on GNU tar (Linux, what CI runs) vs BSD tar (macOS, where I tested locally) — the parser only matched the BSD column ordering, so on Linux every line was rejected and `fileCount` came out as 0. Replace the shell-out with a small in-process tar reader using `zlib.gunzipSync` + manual 512-byte block walk. ustar headers are trivially structured (name at offset 0, octal size at 124, typeflag at 156, ustar prefix at 345). We emit regular files only (`typeflag` `0` or NUL) and consume but skip pax extended headers (`x`/`g`) and GNU long-name entries (`L`). Result is identical on every platform. Verified locally: 206 files / 998413 bytes for `workflow.tgz` matches `tar -tvzf` exactly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * tarballs: redesign per-package details with packagephobia-style stats The previous "What's inside?" view crammed nested directory groups, proportional bars, and per-group file lists into a `<details>` inside an already-narrow row. It was hard to read and harder to compare. Replace it with the layout packagephobia uses on its result page: - Two large headline metric tiles (Publish size / Unpacked size) with a big bold value, smaller unit, and small uppercase label. Modeled directly on packagephobia's `Stats` component but using our existing CSS variables so it tracks light/dark theme. 
- A single sortable file table beneath. Default is size-descending so the contributors to package size are immediately visible. Click a header to flip direction or switch sort key. Sticky header keeps the columns visible inside the scrollable region. Drop the `groupByTopLevel`, `ContentsGroup`, and bar-chart styles — they were the source of the "hard to use" feedback and don't add information that the flat sortable table doesn't already convey. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * tarballs: address Copilot review feedback (a11y, dev script, caching) - main.tsx: drop `cache: 'no-store'` from the catalog fetch. Each tarballs deployment is immutable per commit, so HTTP caching is appropriate; forcing no-store made every visit re-download the full catalog (which now includes per-package file lists). - app.tsx (search input): add `aria-label="Filter packages"`. The visible label only contained an icon and placeholder, so screen readers had no name for the control. - app.tsx (PmTabs): replace `role="tablist"` / `role="tab"` / `aria-selected` with plain buttons that use `aria-pressed`. The ARIA tab pattern requires arrow-key roving focus we never wired up; toggle buttons are the honest representation. Each button also gets an explicit `aria-label`. - app.tsx (row buttons): include the package name in the accessible label of every per-row copy/download button (and on the featured card too), so the screen reader buttons/links list distinguishes them. Added an `accessibleName` prop to `CopyButton`. - app.tsx (CopyButton): only flip to the "Copied" state when the write actually succeeded. Both the modern `navigator.clipboard` path and the `execCommand` fallback can fail; the new `writeToClipboard` helper returns success and the button shows a short "Failed" state if both paths fail. 
# Make `pnpm dev` work from a clean checkout The previous `dev: vite` couldn't actually serve the page because `/catalog.json` 404s and the SPA boots into the error fallback. Restructure the build layout to vite's conventional shape: - `public/` is now a true vite public dir — pack writes tarballs and catalog.json there. In dev, vite serves these at the root. - `dist/` is the production build output (vite copies public/ into it and adds index.html + assets/). - `vercel.json#outputDirectory` switches from `public` → `dist`. - `turbo.json` outputs updated to match. - `dev` chains pack before vite so the catalog exists when the dev server starts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>Pranay Prakash · b883ea0d · 2026-05-04
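The in-process tar reader described in this PR (gunzip, then walk 512-byte blocks) can be approximated as below. This is a sketch under the field offsets the commit lists (name at 0, octal size at 124, typeflag at 156, ustar prefix at 345); the function names, the `TarEntry` shape, and the magic check at offset 257 are assumptions, and the code is not the actual `pack.ts` implementation. Like the description, it skips pax and GNU long-name entries instead of applying them.

```typescript
import { gunzipSync } from 'node:zlib';

interface TarEntry {
  name: string;
  size: number;
}

const BLOCK = 512;

// Read a NUL-terminated text field out of a header block.
function readField(block: Uint8Array, offset: number, length: number): string {
  const field = block.subarray(offset, offset + length);
  const end = field.indexOf(0);
  return new TextDecoder().decode(end === -1 ? field : field.subarray(0, end));
}

// Walk an uncompressed tar archive and list regular files only.
function listTarEntries(tar: Uint8Array): TarEntry[] {
  const entries: TarEntry[] = [];
  let offset = 0;
  while (offset + BLOCK <= tar.length) {
    const header = tar.subarray(offset, offset + BLOCK);
    if (header.every((byte) => byte === 0)) break; // end-of-archive marker

    const size = parseInt(readField(header, 124, 12).trim() || '0', 8);
    if (Number.isNaN(size)) throw new Error(`bad size field at offset ${offset}`);

    const typeflag = header[156];
    const isRegularFile = typeflag === 0x30 /* '0' */ || typeflag === 0;
    if (isRegularFile) {
      const name = readField(header, 0, 100);
      // The prefix field is only meaningful for POSIX ustar headers.
      const isUstar = readField(header, 257, 6) === 'ustar';
      const prefix = isUstar ? readField(header, 345, 155) : '';
      entries.push({ name: prefix ? `${prefix}/${name}` : name, size });
    }
    // Other entry types (pax 'x'/'g', GNU 'L', directories) are skipped;
    // advancing by the padded size also consumes their data blocks.
    offset += BLOCK + Math.ceil(size / BLOCK) * BLOCK;
  }
  return entries;
}

// Convenience wrapper for .tgz input.
function listTgzEntries(tgz: Uint8Array): TarEntry[] {
  return listTarEntries(gunzipSync(tgz));
}
```

Parsing the header bytes directly avoids depending on the column layout of `tar -tvzf`, which is the portability problem the commit reports between GNU and BSD tar.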
- 1.7ETV[ai] Fixes DurableAgent telemetry missing AI SDK-compatible span attributes (#1608)Peter Wielander · 70e89bfc · 2026-04-07
- 1.7ETV[swc-plugin] Capture lexical `this` for nested arrow step functions (#1935) * [swc-plugin] Capture lexical `this` for nested arrow step functions When a nested arrow `"use step"` references the enclosing function/method's `this`, plumb that `this` through the workflow runtime so the step body sees the correct receiver. - Workflow mode wraps the step proxy with `.bind(this)`, so invoking the proxy captures the caller's `this` as `thisVal` on the queue item. - Step mode hoists the body as a regular `function` (not an arrow) so the runtime's `stepFn.apply(thisVal, args)` rebinds `this` inside the hoisted body. Detection only fires for arrows, since arrows inherit `this` lexically. Nested non-arrow functions/methods/getters/setters introduce their own `this`, so the detector stops at those boundaries. The runtime already supported `thisVal` for instance-method steps; this PR is purely a compiler change to feed the existing pipeline. Caveat: capture works at runtime only when the captured value is serializable across the workflow->step boundary (i.e. the enclosing class implements `WORKFLOW_SERIALIZE`/`WORKFLOW_DESERIALIZE`). Refs vercel/workflow#1865 * Address PR review: preserve step proxy metadata + tighter `this` detection - core: Override `.bind` on step proxies so the bound function retains `stepId` and `__closureVarsFn`. Without this, a bound proxy that flows through workflow serialization (e.g. as a step argument) would be treated as a non-serializable plain function by `getStepFunctionReducer`. - swc-plugin: Detector now also walks `arrow.params` so `this` references in default values / destructuring initializers (e.g. `(x = this.foo) => ...`) trigger the `.bind(this)` path. - swc-plugin: Class bodies inside the arrow body are now treated as `this`-binding boundaries — `this` inside class field initializers, methods, etc. is bound to the class instance, not the outer arrow. 
The detector still walks `extends` clauses and computed property keys because those are evaluated in the surrounding scope. - spec.md: Sharpen the note about `this` in step bodies — it's syntactically allowed but only meaningful for instance-method steps and lexical-`this` arrow steps; other shapes compile but `this` will be whatever the caller of the step proxy passes. - Add `lexical-this-detector-edge-cases` fixture covering both the default-param positive case and the inner-class false-positive guard. - Strengthen the runtime test to assert `stepId` / `__closureVarsFn` survive `.bind(...)`. * [swc-plugin] Fix `arguments` closure-var capture; drop dead `this`/`arguments` checks - Add `arguments` to `is_global_identifier` so it's not captured as a closure variable. Previously a nested `function`-form step like function step() { 'use step'; return arguments[0]; } was hoisted with `const { arguments } = ...` (a strict-mode syntax error) and the body's `arguments[0]` resolved against the destructured binding instead of the function's intrinsic `arguments` object. - Remove dead `ForbiddenExpression` checks for `this` and `arguments` in `visit_mut_this_expr` / `visit_mut_ident`. The `'use step'` / `'use workflow'` directives are stripped during the module-level traversal before children are visited, so `in_step_function` / `in_workflow_function` are never observed as true here in practice. The existing `step-with-this-arguments-super` fixture explicitly documents that all three identifiers are allowed in step bodies. - Tighten the spec note about `arguments` accordingly: it works in `function`-form steps (reflecting positional args) but is not captured for arrow-form steps; use `...args` for that case. - Add `nested-step-arguments` fixture pinning down the new behavior.Nathan Rajlich · d0e3f272 · 2026-05-05
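The review fix that overrides `.bind` on step proxies addresses a plain JavaScript behavior: a natively bound function is a new object and does not carry the original's own properties. A minimal sketch of the idea, where `QueueItem`, `StepProxy`, `attachStepMetadata`, `createStepProxy`, and the step id string are all hypothetical names for illustration (the real proxy also carries `__closureVarsFn`, omitted here):

```typescript
// What a proxy call records (hypothetical shape).
interface QueueItem {
  stepId: string;
  thisVal: unknown;
  args: unknown[];
}

// A callable that carries step metadata and whose bind() keeps it.
interface StepProxy {
  (this: unknown, ...args: unknown[]): QueueItem;
  stepId: string;
  bind(thisArg: unknown, ...boundArgs: unknown[]): StepProxy;
}

function attachStepMetadata(fn: Function, stepId: string): StepProxy {
  const proxy = fn as StepProxy;
  proxy.stepId = stepId;
  // An own `bind` shadows Function.prototype.bind so bound copies are re-tagged.
  proxy.bind = (thisArg: unknown, ...boundArgs: unknown[]): StepProxy =>
    attachStepMetadata(Function.prototype.bind.call(fn, thisArg, ...boundArgs), stepId);
  return proxy;
}

function createStepProxy(stepId: string): StepProxy {
  return attachStepMetadata(function (this: unknown, ...args: unknown[]): QueueItem {
    // Capture the receiver as `thisVal`, as the commit describes for queue items.
    return { stepId, thisVal: this, args };
  }, stepId);
}

const proxy = createStepProxy('step//scale'); // hypothetical step id
const receiver = { factor: 3 };

// Native bind yields a fresh function without the own `stepId` property.
const nativeBound = Function.prototype.bind.call(proxy, receiver);
const nativeKeepsId = 'stepId' in nativeBound; // false

// The overriding bind keeps the metadata and still fixes the receiver.
const bound = proxy.bind(receiver);
const item = bound(1, 2);
```

On the step side the commit describes the counterpart as `stepFn.apply(thisVal, args)`, which only works because the hoisted body is a regular `function` whose `this` can be rebound, unlike an arrow.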
- 1.6ETV Add native v4 workflow attribute events (#2226)
  * Add native workflow attribute events
  * Fix abbreviated attributes docs sample
  * Document attribute replay ordering for step races
  * Address native attribute review feedback
  * Validate before claiming attr_set dedup lock; clearer start() attribute errors
    - world-local: claim the attr_set correlation lock only after validation, so a validation failure does not permanently mark the correlationId as written and wedge the run in a re-invoke loop on retry
    - world-postgres: distinguish a concurrently-deleted run from a cap violation when the guarded attributes update matches no rows
    - core: reject non-string initial attribute values in start() with a clear error instead of a downstream schema failure
  * Add attribute edge-case tests across all layers
    - core: normalizeAttributeChanges unit tests (non-object inputs, FatalError wrapping, key/value/batch limits, boundary lengths, UTF-8 byte counting)
    - core: start() rejects reserved keys, oversized keys/values, and over-cap initial attribute batches before any write
    - world-local + world-postgres: per-run cap enforced against existing attributes (upsert-at-cap allowed, removal frees room), oversized values rejected on attr_set, invalid initial attributes rejected on run_created
    - e2e: validation DX workflow asserting every invalid write throws a catchable FatalError naming the violated rule and limit, with the run staying healthy; start() rejects invalid initial attributes client-side
  * Remove accidentally committed local e2e diagnostics artifact
  * Bump world-vercel to spec version 4 for native attributes
    The deployed workflow-server (vercel/workflow-server#469) materializes native attr_set events and accepts initial run attributes, but world-vercel still advertised spec v3, so start(..., { attributes }) rejected itself client-side ('requires spec version 4') on every Vercel deployment, failing the new e2e seeding test across the prod matrix. New runs are now stamped v4.
  * Reject duplicate correlated attr_set before materializing in Postgres
    A redelivered duplicate (including one carrying different changes for the same correlationId) previously re-applied the run attributes update and only then failed the event insert, leaving the snapshot out of sync with the event log. Pre-check the event log for the correlationId before mutating; the unique index still guards the truly-concurrent race, which is idempotent (deterministic replay carries identical changes). Also apply the suggested docs wording for initial attributes.
  * Apply suggestion from @VaguelySerious
  * Fail the run on World-rejected attribute writes; un-nest runtime test
    Two fixes from review:
    - runtime.test.ts: the pre-existing test "propagates transient step_created failures..." was accidentally nested inside the new attribute-race test, failing the new test ("Calling the test function inside another test function is not allowed") and preventing the old test from running. Restored it verbatim at describe level.
    - A workflow-body attr_set the World rejects as invalid (e.g. the cumulative per-run attribute cap, which only the World can check) is deterministic: redelivering the orchestrator message replays the same write into the same rejection, wedging the run in redelivery with no terminal event. handleSuspension now wraps such rejections in FatalError, and workflowEntrypoint fails the run with the validation error instead of rejecting the delivery. Transient storage errors still propagate and retry via redelivery.
  Signed-off-by: Peter Wielander <mittgfu@gmail.com>
  Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
  Co-authored-by: Peter Wielander <mittgfu@gmail.com>
  Co-authored-by: Peter Wielander <peter.wielander@vercel.com>
  Pranay Prakash · ae8d6fee · 2026-06-11
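The ordering fix in the entry above (validate first, claim the correlation lock second, and pre-check for duplicates before mutating) can be sketched with an in-memory store. Everything here is an assumption made for illustration: `InMemoryAttrStore`, the cap of 3, and the use of `null` to mean removal are hypothetical and are not taken from world-local or world-postgres.

```typescript
// Hypothetical shape: a string value upserts a key, null removes it.
type AttrChanges = Record<string, string | null>;

class ValidationError extends Error {}

// Illustrative cap only; the real per-run limit is not stated in the source.
const MAX_ATTRIBUTES_PER_RUN = 3;

class InMemoryAttrStore {
  private claimed = new Set<string>();
  private attributes = new Map<string, string>();

  applyAttrSet(correlationId: string, changes: AttrChanges): 'applied' | 'duplicate' {
    // Pre-check: a redelivered write must not touch the snapshot again.
    if (this.claimed.has(correlationId)) return 'duplicate';

    // Validate against a copy so a rejected write leaves no trace behind.
    const next = new Map(this.attributes);
    for (const key of Object.keys(changes)) {
      const value = changes[key];
      if (value === null) next.delete(key);
      else next.set(key, value);
    }
    if (next.size > MAX_ATTRIBUTES_PER_RUN) {
      throw new ValidationError(`attribute cap of ${MAX_ATTRIBUTES_PER_RUN} exceeded`);
    }

    // Only a validated write claims the correlationId and commits.
    this.claimed.add(correlationId);
    this.attributes = next;
    return 'applied';
  }

  isClaimed(correlationId: string): boolean {
    return this.claimed.has(correlationId);
  }

  get(key: string): string | undefined {
    return this.attributes.get(key);
  }

  size(): number {
    return this.attributes.size;
  }
}

const store = new InMemoryAttrStore();
const first = store.applyAttrSet('corr-1', { a: '1', b: '2', c: '3' });

// This write would exceed the cap, so it is rejected before any claim.
let rejected = false;
try {
  store.applyAttrSet('corr-2', { d: '4' });
} catch {
  rejected = true;
}

// A redelivery of corr-1, even with different changes, is ignored.
const redelivered = store.applyAttrSet('corr-1', { a: 'changed' });
console.log(first, rejected, store.isClaimed('corr-2'), redelivered, store.get('a'));
```

Because the claim happens only after validation, the rejected `corr-2` write stays unclaimed, and the duplicate `corr-1` delivery leaves the stored value of `a` untouched.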
- 1.5ETV Migrate `@workflow/web` from Next.js to React Router v7 (#1005)
  ## Summary
  - Replace Next.js App Router with React Router v7.13.0 framework mode (Vite-based), eliminating the large `next` dependency from the web, CLI, and workflow metapackages
  - Serve the web UI in-process from the CLI via Express instead of spawning `next start` as a child process
  - Switch RPC transport from JSON to CBOR to preserve binary data types across the wire
  - Replace `nuqs` URL state management with React Router's `useSearchParams`
  - Replace Next.js server actions with an RPC resource route (`/api/rpc`) and a thin CBOR-based client
  ## Motivation
  The `next` package is ~300MB installed and was the single largest dependency in the monorepo. It also required spawning a separate child process from the CLI to run the o11y web server, adding complexity around process lifecycle management, port readiness polling, and environment variable forwarding. With React Router framework mode, the web package builds to a standard Express-compatible server bundle that the CLI can import and serve directly in its own process.
  ## What changed
  **Framework swap (`@workflow/web`):**
  - `next.config.ts` / `postcss.config.mjs` → `react-router.config.ts` / `vite.config.ts`
  - `src/` directory → `app/` directory (React Router convention)
  - `src/app/layout.tsx` + `layout-client.tsx` → `app/root.tsx`
  - `src/app/page.tsx` → `app/routes/home.tsx`
  - `src/app/run/[runId]/page.tsx` → `app/routes/run-detail.tsx`
  - Path alias `@/` → `~/`
  - Removed all `'use client'` / `'use server'` directives
  **Data transport:**
  - Server actions → RPC resource route at `/api/rpc` with CBOR encoding
  - CBOR preserves `Uint8Array` and other binary types natively (no base64 overhead)
  - Stream reading → dedicated `/api/stream/:streamId` resource route
  **URL state:**
  - `nuqs` (`useQueryState`) → `useSearchParams` from `react-router`
  **Fonts:**
  - `next/font/google` → Geist `.woff2` files referenced directly from `node_modules/geist` via `@font-face` in CSS
  **CLI integration (`@workflow/cli`):**
  - `import('@workflow/web/server').then(m => m.startServer(port))`
  - No child process, no readiness polling, no cleanup handlers
  **Radix UI compatibility:**
  - `onSubmit` preventDefault on `AlertDialogContent` and `SheetContent` to prevent Radix's internal `<form method="dialog">` from triggering React Router route actions
  - Catch-all action on root route for any stray POSTs
  ## Dependencies removed
  - `next`, `swr`, `nuqs`, `@tailwindcss/postcss`
  ## Dependencies added
  - `react-router` / `@react-router/dev` / `@react-router/node` / `@react-router/express` (all `7.13.0`)
  - `express`, `vite`, `@tailwindcss/vite`, `cbor-x`, `isbot`, `cross-env`
  - `geist` (devDep)
  Nathan Rajlich · 7653e6bf · 2026-02-13
- 1.5ETV Add support for calling start() inside workflow functions (#1133)
  * Add support for calling `start()` directly inside workflow functions
    Enable `start()` to work in workflow context by routing through an internal step (`__workflow_start`), reusing existing step infrastructure with no new event types or server changes needed.
  * Address PR review feedback
    - Use typeof check instead of truthiness for WORKFLOW_START symbol
    - Validate start() options in workflow context (reject unsupported options like world)
    - Set maxRetries=0 on __workflow_start step to prevent orphaned child runs
    - Add unit tests for createStart factory (6 tests)
  * Make Run serializable in workflow context with step-backed methods
    - Add Run serialization via __serializable marker + custom Run reducer/reviver in the serialization module (avoids SWC plugin injecting class-serialization imports)
    - Create WorkflowRun class factory (packages/core/src/workflow/run.ts) with step-backed methods: cancel(), status, returnValue, workflowName, createdAt, startedAt, completedAt, exists
    - Register 8 built-in steps (__run_cancel, __run_status, etc.) in step-handler
    - Update __workflow_start to return full Run object (serialized → WorkflowRun in VM)
    - Update createStart to pass through step result directly
    - Update docs to reflect full Run support in workflow context
  * Fix start() in workflow VM by delegating from api-workflow stub
    The workflow VM loads api-workflow.ts (via the "workflow" export condition) which stubs all runtime functions. The start stub needs to check for the injected WORKFLOW_START symbol and delegate to it, otherwise start() throws "doesn't allow this runtime usage" in the workflow context.
  * Address PR review: fix stale WORKFLOW_SERIALIZE comments and register Run in host registry
    - Update comments in step-handler.ts and start.ts to reference the actual serialization mechanism (Run reducer with __serializable marker) instead of the stale WORKFLOW_SERIALIZE reference
    - Register Run class in the host's class registry from step-handler.ts so the Run reviver can deserialize Run/WorkflowRun instances in step context
  * Add docs for recursive/repeating workflows and deploymentId: "latest"
    - Document using start() for self-chaining workflows to avoid large event logs
    - Add examples for batch processing and cron-like repeating patterns
    - Document deploymentId: "latest" option with type safety warning
    - Update skill file with same patterns
  * Return full Run object from startFromWorkflow e2e workflow
    Update the e2e workflow to return the childRun object directly instead of just childRun.runId, exercising Run serialization across the workflow boundary. Update e2e test assertions to match.
  * Add recursive fibonacci e2e test for start() in workflow
    Demonstrates recursive workflow composition: fibonacciWorkflow starts new instances of itself via start() + Promise.all to compute fib(6)=8, fanning out across independent workflow runs.
  * Move Run method steps to builtins with "use step" directives
    Refactor: instead of manually registering Run method steps via registerStepFunction in step-handler.ts, define them as proper "use step" functions in builtins.ts with __builtin_ prefix. This leverages the existing SWC plugin infrastructure: functions starting with "__builtin" get stable bare-name step IDs.
    - Add __builtin_run_{cancel,status,return_value,...} to both builtins files
    - Use dynamic import() for getRun inside step bodies to avoid pulling Node.js modules into the workflow bundle
    - Remove manual registerStepFunction calls from step-handler.ts
    - Update WorkflowRun step references to __builtin_run_* names
    - Fix step name display in web observability: fall back to raw name instead of "?" for built-in steps that don't follow step//module//fn format
    - Add fibonacciWorkflow default args for nextjs-turbopack workbench UI
  * Render Run objects as clickable links in web observability UI
    - Add RunRef type and Run reviver to observabilityRevivers so serialized Run objects are hydrated as RunRef instead of showing raw Uint8Array
    - Add RunRefInline component (purple badge with run ID) that navigates to the target run on click, matching the StreamRef pattern
    - Thread onRunClick callback through the component chain: WorkflowTraceViewer → EntityDetailPanel → AttributePanel → DataInspector
    - Wire up navigation in the web app's run-detail-view
    - Add startFromWorkflow default args for workbench UI
  * Throw error instead of silent fallback when Run class not in registry
    Address PR review: the Run reviver now throws if the class isn't found in the registry, instead of silently returning a plain { runId } object that would break the assumption of getting a valid Run instance.
  * Fix e2e failures: allow retries on Run getter steps, fix docs code samples
    - Remove maxRetries=0 from read-only Run getter steps (status, returnValue, workflowName, etc.); these are safe to retry and need retries when the child workflow hasn't completed within the step timeout. Only cancel keeps maxRetries=0.
    - Fix docs code samples: use correct import path (workflow/api not workflow), add declare statements for helper functions used in examples.
  * Use standard step//module//function naming for built-in steps
    Update the SWC plugin's __builtin_ special case to generate proper step//@workflow/core//{name} IDs instead of bare function names. This makes parseStepName work correctly for built-in steps, showing:
    - StepName: "Run#returnValue" (not "__builtin_run_return_value")
    - ModuleSpecifier: "@workflow/core" (not the raw function name)
    Convention: __builtin_Run_cancel → step//@workflow/core//Run#cancel (uppercase prefix + underscore → instance method # notation)
    - Move __workflow_start to builtins.ts as __builtin_start
    - Rename __builtin_run_* to __builtin_Run_* for proper # notation
    - Update WorkflowRun step refs to use full step// IDs
    - Remove manual registerStepFunction from step-handler.ts
    - Update SWC spec.md with new naming examples
  * Remove SWC __builtin special case, use standard step naming for builtins
    Remove the SWC plugin's __builtin_ special case so built-in steps get standard step//{module}@{version}//{fn} IDs like any other step. This makes parseStepName work correctly, showing proper StepName and ModuleSpecifier in observability. The VM reconstructs the same IDs via builtinStepId() which uses the @workflow/core version to build: step//workflow/internal/builtins@{v}//{fn}
    - Remove __builtin special case from SWC plugin (revert to original)
    - Add builtinStepId() helper shared by workflow.ts, start.ts, run.ts
    - Rename Run steps: __builtin_Run_cancel → Run_cancel, etc.
    - Rename start step: __builtin_start → start
    - Move start step from manual registerStepFunction to builtins.ts
    - Keep __builtin_response_* names unchanged (pre-existing)
  * Use static class methods for Run steps to get Run.method naming
    Refactor Run method steps from standalone functions (Run_cancel) to static methods on a Run class, so the SWC plugin generates step IDs with the standard static method convention: Run.cancel, Run.returnValue, Run.status, etc.
  * Address PR review: tests, docs warnings, skill fix
    - Add TODO on Run.returnValue about polling blocking (replace with system hooks once AbortSignal/AbortController PR lands)
    - Add docs callout warning about returnValue holding workers alive
    - Fix SKILL.md contradiction that said start() can't be used in workflows
    - Enhance suspension test to assert step arguments are forwarded
    - Add WorkflowRun unit tests: serializable marker, runId, registry, delegation
  * Fix response builtins: adopt this-serialization from PR #1413
    The rebase onto main didn't fully adopt PR #1413's refactor of response builtins to use `this` instead of explicit parameters. The old pattern (resJson(this) wrappers) passed `this` as an argument, but the step functions now expect `this` to be set via method call context. Switch to Object.defineProperties on Request/Response prototypes, matching main's approach.
    Also document WORKFLOW_PUBLIC_MANIFEST=1 for local e2e testing.
  * Address docs review: returnValue polling is temporary, link to start() API ref
    - Update returnValue warning to note this is a temporary implementation that will be replaced with internal hooks
    - Replace inline deploymentId: "latest" docs with link to the existing start() API reference which already covers it comprehensively
  * Fix e2e tests: replace collectedRunIds with trackRun API
    PR #1426 replaced the manual collectedRunIds array with a trackRun() helper. The start() wrapper already auto-tracks, so just remove the manual push calls and add trackRun for the child run.
  Signed-off-by: Pranay Prakash <pranay.gp@gmail.com>
  Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
  Pranay Prakash · e8898609 · 2026-03-20
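The recursive fan-out described in the fibonacci e2e test above can be sketched without the workflow runtime. This is only an illustration of the control flow: the `start` function and `FakeRun` type below are hypothetical in-memory stand-ins, not the `workflow/api` exports, and nothing here is durable or replayed.

```typescript
// Hypothetical stand-in for a run handle; the real Run object exposes more.
interface FakeRun {
  runId: string;
  returnValue: Promise<number>;
}

let startedRuns = 0;

// Hypothetical in-memory `start`: invokes the workflow function directly
// and hands back a handle, where the real API would enqueue a separate run.
async function start(
  workflow: (n: number) => Promise<number>,
  args: [number],
): Promise<FakeRun> {
  startedRuns += 1;
  return { runId: `run_${startedRuns}`, returnValue: workflow(...args) };
}

// Each call starts two child "runs" and waits for both results.
async function fibonacciWorkflow(n: number): Promise<number> {
  if (n <= 1) return n;
  const children = await Promise.all([
    start(fibonacciWorkflow, [n - 1]),
    start(fibonacciWorkflow, [n - 2]),
  ]);
  const [a, b] = await Promise.all(children.map((run) => run.returnValue));
  return a + b;
}

// The commit message cites fib(6)=8 as the e2e expectation.
const fibResult = fibonacciWorkflow(6);
fibResult.then((value) => console.log(value)); // prints 8
```

In the real runtime each child would be an independent workflow run, and the entry notes that awaiting `returnValue` currently polls and can hold workers alive, so this sketch shows the composition pattern only, not its cost.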