Indexed to Q1'25, covering Q1'25–Q1'26. Built from the Navigara Knowledge Graph's per-commit growth, maintenance, and waste signals.
Abstract
Engineering output is difficult to benchmark across organizations: totals reward commit volume, per-developer averages conceal composition changes, and cross-team comparisons are distorted by size and language mix. This research note applies the Navigara Knowledge Graph's per-commit scoring, expressed in ETV (Engineering Throughput Value), to a cohort of public organizations, and reports aggregate per-developer performance together with its decomposition into growth, maintenance, and waste. All quarterly aggregates carry 95% bootstrap confidence intervals computed from per-commit observations over the open cohort of contributors active in each quarter (switch the cohort toggle at the top to the "Fixed panel" view to restrict to contributors active in every quarter).
Over the reporting window, output per developer and work composition both shifted materially. Section 2 presents each finding with its confidence band; Section 3 details the cohort, window, and statistical construction.
Limits
Scope is merged code on public default branches only. Qualitative dimensions — review depth, incident response, planning, mentorship — are not captured. Work on unmerged branches, private forks, or unconnected repositories is invisible to the measurement.
Symbols & Units
What the report measures and how to read it.
This research note benchmarks engineering output across a cohort of public organizations tracked by the Navigara Knowledge Graph. The measurement unit is ETV (Engineering Throughput Value), scored per merged commit and rolled up to quarters.
Over the reporting window, aggregate per-developer output expanded and work composition shifted in the direction documented below. Section 2 presents each finding with its 95% bootstrap confidence band; Section 3 spells out how the cohort, window, and statistics were constructed.
Quarterly aggregates, work composition, and the per-organization view.
Did output per developer grow across the window?
The index below tracks absolute per-developer performance quarter by quarter, against a baseline fixed at the start of the window. The denominator is the open cohort of SWE contributors active in each quarter, recomputed independently per quarter; changes therefore reflect both output per engineer and shifts in who was contributing. Switch the cohort toggle at the top of the page to "Fixed panel" to restrict the denominator to engineers active in every quarter.
The shaded band is the 95% bootstrap confidence interval around the point estimate.
Absolute aggregate output per active developer each quarter; annotations show cumulative change vs the baseline quarter. Shaded band: 95% bootstrap CI.
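The index construction above can be sketched in a few lines, assuming a hypothetical flattened record shape of (quarter, author, ETV score) per merged commit; nothing about the Knowledge Graph's actual data model is implied:

```python
from collections import defaultdict

def quarterly_index(commits, baseline="2025Q1"):
    """Index per-developer ETV output to a baseline quarter.

    commits: iterable of (quarter, author, etv_score) tuples — a
    hypothetical flattening of the per-commit scores described above.
    Returns quarter -> index value, with the baseline quarter at 100.
    """
    totals = defaultdict(float)   # summed ETV per quarter
    authors = defaultdict(set)    # distinct active contributors per quarter
    for quarter, author, score in commits:
        totals[quarter] += score
        authors[quarter].add(author)

    # Open-cohort denominator: each quarter divides by its own active set.
    per_dev = {q: totals[q] / len(authors[q]) for q in totals}
    base = per_dev[baseline]
    return {q: 100.0 * per_dev[q] / base for q in sorted(per_dev)}
```

Restricting `commits` to authors present in every quarter before calling this function would reproduce the "Fixed panel" variant of the toggle.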
Allocation of effort between new capability, upkeep, and waste.
Aggregate quarterly effort is decomposed into growth (new capability), maintenance (upkeep), and waste (no lasting value). Bars are normalized to 100% per quarter, so shifts between segments reflect priorities, not raw volume.
Share of total effort allocated to growth, maintenance, and waste; bars normalize to 100% per quarter. Tooltip values include the 95% bootstrap CI for each segment.
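A minimal sketch of the per-quarter normalization, assuming raw category totals have already been summed from the per-commit scores (the category keys are illustrative, not the report's actual field names):

```python
def composition_shares(quarter_totals):
    """Normalize raw category totals to 100% per quarter.

    quarter_totals: quarter -> {category: raw ETV sum} — hypothetical
    keys such as "growth", "maintenance", "waste" as described above.
    Returns quarter -> {category: share in percent}.
    """
    shares = {}
    for quarter, cats in quarter_totals.items():
        total = sum(cats.values())
        # Each segment's share is its fraction of that quarter's total,
        # so segments always sum to 100 regardless of raw volume.
        shares[quarter] = {k: 100.0 * v / total for k, v in cats.items()}
    return shares
```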
Side-by-side totals and trends across the cohort.
Each organization contributes independently to the aggregate. Cards below summarize per-organization level and trend; the table immediately after lists totals per quarter and the change versus baseline.
Did changes come from more commits, or denser commits?
Two productivity ratios are compared quarter by quarter: commits per active developer, and performance score per commit. Together they separate volume from value — the same output can come from more commits or denser ones.
Left axis: commits per active developer each quarter. Right axis: performance score per commit. Divergence between the two is diagnostic — parallel curves suggest consistent commit granularity; widening gaps suggest changing commit practice. Shaded bands: 95% bootstrap CI.
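The two ratios can be computed from the same hypothetical per-commit records sketched earlier. Note that their product is ETV per active developer, which is why the pair jointly accounts for any change in the headline index:

```python
from collections import defaultdict

def throughput_ratios(commits):
    """Split per-developer output into a volume and a density component.

    commits: iterable of (quarter, author, etv_score) — hypothetical shape.
    Returns quarter -> (commits_per_dev, etv_per_commit); multiplying the
    two recovers ETV per active developer for that quarter.
    """
    n_commits = defaultdict(int)
    etv = defaultdict(float)
    devs = defaultdict(set)
    for quarter, author, score in commits:
        n_commits[quarter] += 1
        etv[quarter] += score
        devs[quarter].add(author)
    return {
        q: (n_commits[q] / len(devs[q]), etv[q] / n_commits[q])
        for q in n_commits
    }
```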
How the cohort, the window, and the statistics were constructed.
All metrics derive from per-commit growth, maintenance, and waste scores produced by the Navigara Knowledge Graph. Aggregates sum the raw scores; ratios divide by the distinct active-contributor or commit count observed in the same quarter.
Bootstrap confidence intervals are computed locally over the same per-commit observations. Organizations without analyzable activity in the window are surfaced explicitly rather than silently dropped.
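The percentile-bootstrap construction can be sketched as follows; the resample count and seed here are arbitrary illustration choices, not the report's actual parameters:

```python
import random

def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the mean per-commit score.

    A minimal sketch of the construction described above: resample the
    per-commit observations with replacement, compute the mean of each
    resample, and take the alpha/2 and 1 - alpha/2 percentiles.
    """
    rng = random.Random(seed)
    n = len(scores)
    means = sorted(
        sum(rng.choice(scores) for _ in range(n)) / n
        for _ in range(n_resamples)
    )
    lo = means[int(alpha / 2 * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi
```

With alpha = 0.05 this yields the 95% band shown in the charts; resampling commits (rather than quarters) matches the stated per-commit observation unit.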
Full methodology, glossary, and limitations are documented in /methodology. API reference available in /docs.
Q1'25→Q1'26
Aggregate performance score per organization, by quarter. Unit: Engineering Throughput Value (ETV).