Indexed to Q1'25, covering Q1'25–Q1'26. Built from the Navigara Knowledge Graph's per-commit growth, maintenance, and waste signals.
Abstract
Engineering output is difficult to benchmark across organizations: totals reward commit volume, per-developer averages conceal composition changes, and cross-team comparisons are distorted by size and language mix. This research note applies the Navigara Knowledge Graph's per-commit scoring, expressed in ETV (Engineering Throughput Value), to a cohort of public organizations, and reports aggregate per-developer performance together with its decomposition into growth, maintenance, and waste. All quarterly aggregates carry 95% bootstrap confidence intervals computed from per-commit observations over the open cohort of contributors active in each quarter (switch the cohort toggle at the top to the "Fixed panel" view to restrict to contributors active in every quarter).
Over the reporting window, output per developer and work composition both shifted materially. Section 2 presents each finding with its confidence band; Section 3 details the cohort, window, and statistical construction.
Limits
Scope is merged code on public default branches only. Qualitative dimensions — review depth, incident response, planning, mentorship — are not captured. Work on unmerged branches, private forks, or unconnected repositories is invisible to the measurement.
Symbols & Units
What the report measures and how to read it.
This research note benchmarks engineering output across a cohort of public organizations tracked by the Navigara Knowledge Graph. The measurement unit is ETV (Engineering Throughput Value), scored per merged commit and rolled up to quarters.
Over the reporting window, aggregate per-developer output expanded and work composition shifted in the direction documented below. Section 2 presents each finding with its 95% bootstrap confidence band; Section 3 spells out how the cohort, window, and statistics were constructed.
Quarterly aggregates, work composition, and the per-organization view.
Did output per developer grow across the window?
The index below tracks absolute per-developer performance quarter by quarter, against a baseline fixed at the start of the window. The denominator is the open cohort of SWE contributors active in each quarter, recomputed independently per quarter; changes therefore reflect both output per engineer and shifts in who was contributing. Switch the cohort toggle at the top of the page to "Fixed panel" to restrict the denominator to engineers active in every quarter.
The shaded band is the 95% bootstrap confidence interval around the point estimate.
Absolute aggregate output per active developer each quarter; annotations show cumulative change vs the baseline quarter. Shaded band: 95% bootstrap CI.
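The index construction above can be sketched in a few lines, assuming a hypothetical flattened record shape of (quarter, author, ETV score) per merged commit; nothing about the Knowledge Graph's actual data model is implied:

```python
from collections import defaultdict

def quarterly_index(commits, baseline="2025Q1"):
    """Index per-developer ETV output to a baseline quarter.

    commits: iterable of (quarter, author, etv_score) tuples — a
    hypothetical flattening of the per-commit scores described above.
    Returns quarter -> index value, with the baseline quarter at 100.
    """
    totals = defaultdict(float)   # summed ETV per quarter
    authors = defaultdict(set)    # distinct active contributors per quarter
    for quarter, author, score in commits:
        totals[quarter] += score
        authors[quarter].add(author)

    # Open-cohort denominator: each quarter divides by its own active set.
    per_dev = {q: totals[q] / len(authors[q]) for q in totals}
    base = per_dev[baseline]
    return {q: 100.0 * per_dev[q] / base for q in sorted(per_dev)}
```

Restricting `commits` to authors present in every quarter before calling this function would reproduce the "Fixed panel" variant of the toggle.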
Allocation of effort between new capability, upkeep, and waste.
Aggregate quarterly effort is decomposed into growth (new capability), maintenance (upkeep), and waste (no lasting value). Bars are normalized to 100% per quarter, so shifts between segments reflect priorities, not raw volume.
Share of total effort allocated to growth, maintenance, and waste; bars normalize to 100% per quarter. Tooltip values include the 95% bootstrap CI for each segment.
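A minimal sketch of the per-quarter normalization, assuming raw category totals have already been summed from the per-commit scores (the category keys are illustrative, not the report's actual field names):

```python
def composition_shares(quarter_totals):
    """Normalize raw category totals to 100% per quarter.

    quarter_totals: quarter -> {category: raw ETV sum} — hypothetical
    keys such as "growth", "maintenance", "waste" as described above.
    Returns quarter -> {category: share in percent}.
    """
    shares = {}
    for quarter, cats in quarter_totals.items():
        total = sum(cats.values())
        # Each segment's share is its fraction of that quarter's total,
        # so segments always sum to 100 regardless of raw volume.
        shares[quarter] = {k: 100.0 * v / total for k, v in cats.items()}
    return shares
```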
Side-by-side totals and trends across the cohort.
Each organization contributes independently to the aggregate. Cards below summarize per-organization level and trend; the table immediately after lists totals per quarter and the change versus baseline.
Did changes come from more commits, or denser commits?
Two productivity ratios are compared quarter by quarter: commits per active developer, and performance score per commit. Together they separate volume from value — the same output can come from more commits or denser ones.
Left axis: commits per active developer each quarter. Right axis: performance score per commit. Divergence between the two is diagnostic — parallel curves suggest consistent commit granularity; widening gaps suggest changing commit practice. Shaded bands: 95% bootstrap CI.
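The two ratios can be computed from the same hypothetical per-commit records sketched earlier. Note that their product is ETV per active developer, which is why the pair jointly accounts for any change in the headline index:

```python
from collections import defaultdict

def throughput_ratios(commits):
    """Split per-developer output into a volume and a density component.

    commits: iterable of (quarter, author, etv_score) — hypothetical shape.
    Returns quarter -> (commits_per_dev, etv_per_commit); multiplying the
    two recovers ETV per active developer for that quarter.
    """
    n_commits = defaultdict(int)
    etv = defaultdict(float)
    devs = defaultdict(set)
    for quarter, author, score in commits:
        n_commits[quarter] += 1
        etv[quarter] += score
        devs[quarter].add(author)
    return {
        q: (n_commits[q] / len(devs[q]), etv[q] / n_commits[q])
        for q in n_commits
    }
```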
How the cohort, the window, and the statistics were constructed.
All metrics derive from per-commit growth, maintenance, and waste scores produced by the Navigara Knowledge Graph. Aggregates sum the raw scores; ratios divide by the distinct active-contributor or commit count observed in the same quarter.
Bootstrap confidence intervals are computed locally over the same per-commit observations. Organizations without analyzable activity in the window are surfaced explicitly rather than silently dropped.
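The percentile-bootstrap construction can be sketched as follows; the resample count and seed here are arbitrary illustration choices, not the report's actual parameters:

```python
import random

def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the mean per-commit score.

    A minimal sketch of the construction described above: resample the
    per-commit observations with replacement, compute the mean of each
    resample, and take the alpha/2 and 1 - alpha/2 percentiles.
    """
    rng = random.Random(seed)
    n = len(scores)
    means = sorted(
        sum(rng.choice(scores) for _ in range(n)) / n
        for _ in range(n_resamples)
    )
    lo = means[int(alpha / 2 * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi
```

With alpha = 0.05 this yields the 95% band shown in the charts; resampling commits (rather than quarters) matches the stated per-commit observation unit.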
Full methodology, glossary, and limitations are documented in /methodology. API reference available in /docs.
Q1'25→Q1'26
Aggregate performance score per organization, by quarter. Unit: Engineering Throughput Value (ETV).