Wei Dai
daiweix@meta.com
90d · built 2026-05-28
90-day totals
- Commits: 96
- Grow: 2.0
- Maintenance: 4.6
- Fixes: 5.0
- Total ETV: 11.6
30-day trajectory
Last 30 days vs. the 30 days before. Up arrows on Growth and ETV mean improvement; up arrow on Fixes share means more time on fixes (worse).
Daily performance
Daily ETV, stacked by Growth, Maintenance and Fixes.
Work-mix over time
Share of Growth / Maintenance / Fixes over a rolling 7-day window. Reads as 'where is effort flowing right now'.
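A minimal sketch of how a rolling work-mix share could be computed, assuming the window is trailing (it ends on the current day) and that each share is the category's ETV divided by the total ETV in the window. The function name, category keys, and sample values are hypothetical; the dashboard's actual computation is not shown in this page.

```python
from collections import deque

CATEGORIES = ("growth", "maintenance", "fixes")


def rolling_work_mix(daily, window=7):
    """Return one share dict per day, computed over a trailing window.

    daily: list of dicts mapping a category name to that day's ETV.
    """
    recent = deque(maxlen=window)  # keeps at most the last `window` days
    shares = []
    for day in daily:
        recent.append(day)
        totals = {c: sum(d.get(c, 0.0) for d in recent) for c in CATEGORIES}
        grand_total = sum(totals.values())
        if grand_total == 0:
            # No recorded work in the window: report zero shares.
            shares.append({c: 0.0 for c in CATEGORIES})
        else:
            shares.append({c: totals[c] / grand_total for c in CATEGORIES})
    return shares


# Hypothetical daily ETV values, for illustration only.
sample = [
    {"growth": 0.2, "maintenance": 0.1, "fixes": 0.1},
    {"growth": 0.0, "maintenance": 0.3, "fixes": 0.1},
    {"growth": 0.0, "maintenance": 0.0, "fixes": 0.2},
]
mix = rolling_work_mix(sample)
```

Early days use a shorter window because fewer than seven days of history exist; a real implementation might instead hide those points.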
Bug flow over time
Monthly bug flow attributed to this developer. The left bar (red) is bug impact this dev authored that was addressed in the given month — combining bugs others fixed for them and bugs they fixed themselves. The right bar is fixes they personally shipped that month, split between self-fixes (overlap with the red bar) and fixes done for someone else. X-axis is fix-time, not introduction-time — the Navigara API attributes bugs backward to the author at the moment the fix lands.
- Self-fix share: 7%
- Bugs you introduced: 2.3
- Bugs you fixed: 9.3
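The two bars described above can be sketched as a small aggregation over fix records. This is an illustration under stated assumptions: the record fields (`month`, `author`, `fixer`, `impact`), the function names, and the sample data are hypothetical, and the self-fix share formula used here (self-fixed impact divided by all impact the developer fixed) is only one possible definition, not necessarily the one behind the figures shown.

```python
from collections import defaultdict


def bug_flow(fixes, dev):
    """Aggregate fix records into monthly bars for one developer.

    Each record is keyed by the month the fix landed (fix-time),
    not by when the bug was introduced.
    """
    bars = defaultdict(
        lambda: {"introduced": 0.0, "self_fixed": 0.0, "fixed_for_others": 0.0}
    )
    for f in fixes:
        if dev not in (f["author"], f["fixer"]):
            continue  # record does not involve this developer
        row = bars[f["month"]]
        if f["author"] == dev:
            row["introduced"] += f["impact"]  # left (red) bar
        if f["fixer"] == dev:
            if f["author"] == dev:
                row["self_fixed"] += f["impact"]  # overlaps the red bar
            else:
                row["fixed_for_others"] += f["impact"]
    return dict(bars)


def self_fix_share(bars):
    """Assumed definition: self-fixed impact over all impact fixed."""
    self_fixed = sum(r["self_fixed"] for r in bars.values())
    fixed = self_fixed + sum(r["fixed_for_others"] for r in bars.values())
    return self_fixed / fixed if fixed else 0.0


# Hypothetical records, for illustration only.
records = [
    {"month": "2026-04", "author": "dev", "fixer": "dev", "impact": 0.5},
    {"month": "2026-04", "author": "dev", "fixer": "other", "impact": 1.0},
    {"month": "2026-04", "author": "other", "fixer": "dev", "impact": 1.5},
    {"month": "2026-05", "author": "other", "fixer": "dev", "impact": 2.0},
    {"month": "2026-05", "author": "other", "fixer": "third", "impact": 9.0},
]
flow = bug_flow(records, "dev")
share = self_fix_share(flow)
```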
Repository spread
Where this developer's commits land. Concentrated work (top1 > 80%) vs polymath spread (top1 < 30%).
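The thresholds can be read as a simple classification on the top repository's share of commits. The sketch below assumes `top1` means the fraction of commits that land in the single most-used repository; the function name and the "mixed" label for the middle band are hypothetical.

```python
from collections import Counter


def classify_spread(commit_repos):
    """Classify a list of repository names, one entry per commit."""
    counts = Counter(commit_repos)
    total = sum(counts.values())
    if total == 0:
        return "no commits", 0.0
    top1 = counts.most_common(1)[0][1] / total  # share of the busiest repo
    if top1 > 0.80:
        return "concentrated", top1
    if top1 < 0.30:
        return "polymath", top1
    return "mixed", top1


# Hypothetical input: every commit lands in one repository.
label, top1 = classify_spread(["github.com-facebook-fboss"] * 96)
```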
Most impactful commits
Top 20 by ETV in the 90-day window.
- 1.6 ETV · Refactor StateUtils to remove FLAGS_switch_id_for_testing dependency
  Summary: This is a follow-up to the review comments on D100038728. It moves the FLAGS_switch_id_for_testing logic out of fboss/agent/state/StateUtils.cpp, which production code can use, into the test-only code at fboss/agent/test/TestUtils.cpp.
  Refactor getMacForFirstInterfaceWithPorts(), firstInterfaceIDWithPorts(), and firstInterfaceWithPorts() in StateUtils to take an explicit SwitchID instead of optional<SwitchID> with an implicit FLAGS_switch_id_for_testing fallback. This separates test-only concerns from production utility code:
  - StateUtils functions now require an explicit SwitchID parameter, making the API clearer and removing the hidden dependency on a test-only gflag
  - New ForTesting() wrapper functions added to TestUtils.h/cpp that internally use FLAGS_switch_id_for_testing for convenience in test code
  - All ~80 test files updated to use the new ForTesting() wrappers
  - Non-test caller (ApplyThriftConfig.cpp) updated to explicitly provide SwitchID
  Reviewed By: shri-khare
  Differential Revision: D100174508
  fbshipit-source-id: 5c14d128723be40d4921ec1124fecd1c75afb991
  github.com-facebook-fboss · 24e691b0 · 2026-04-11
- 0.7 ETV · Remove FLAGS_switch_id_for_testing dependency from AsicUtils
  Summary: Similar to D100174508, also remove FLAGS_switch_id_for_testing dependency from AsicUtils.
  Decouple AsicUtils.cpp from AgentFeatures.h by adding an explicit std::optional<int32_t> switchId parameter to checkSameAndGetAsic() (default std::nullopt). Previously, checkSameAndGetAsic() internally read FLAGS_switch_id_for_testing from AgentFeatures.h, mixing test-only flag reading into production utility code. Now:
  - checkSameAndGetAsic() takes an explicit std::optional<int32_t> switchId
  - All ~170 test callers pass FLAGS_switch_id_for_testing explicitly
  - Production callers (ApplyThriftConfig.cpp, SwSwitch.cpp) use the default
  - AsicUtils.cpp no longer includes AgentFeatures.h
  No behavior change: all callers pass the same value they got before. This is a step toward separating test-only gflags from production code.
  Reviewed By: shri-khare
  Differential Revision: D100712562
  fbshipit-source-id: 35e647201a5ba19b6b64e0e96262c9195a7334c7
  github.com-facebook-fboss · c7e03186 · 2026-04-15
- 0.6 ETV · Add ASIC_TYPE_JERICHO4 type definition
  Summary: Code imported from the Arista team. Define the Jericho4 ASIC type, J4SIM platform type, and 1.6T port speed enums.
  - Add ASIC_TYPE_JERICHO4 = 24 to switch_config.thrift
  - Add ONEPOINTSIXT = 1600000 port speed
  - Add PLATFORM_J4SIM = 49 to fboss_common.thrift
  - Add J4SIM_NIF to platform_mapping_config.thrift
  - Add Jericho4Asic class header (inherits Jericho3Asic)
  - Add PLATFORM_J4SIM toString in PlatformMode.h
  Next in stack: Add SaiBcmJ4SimPlatform new platform files
  Reviewed By: nivinl
  Differential Revision: D94533656
  fbshipit-source-id: 1515d5358039307da0a5989d8f2d7df3d696c5cd
  github.com-facebook-fboss · 7a850e64 · 2026-02-27
- 0.6 ETV · explicitly drop packets instead of throw FbossError by using getInterfaceIDForPortIf()
  Summary: Change SwitchState::getInterfaceIDForPort() to return std::optional<InterfaceID> instead of throwing FbossError when the port/aggregate port is not found. Use getNodeIf() instead of getNode() to avoid throwing exceptions on the RX hot path, where they are expensive. All callers updated to handle the optional return.
  Reviewed By: jasmeetbagga
  Differential Revision: D96000016
  fbshipit-source-id: 2c2fc6248c7b9d75ede5b18873f90f7ed82c2094
  github.com-facebook-fboss · f4ca2cef · 2026-03-12
- 0.6 ETV · Verify no hyper-port flap on remote warmboot
  Summary: Add a multinode hyper-port warmboot test that verifies the primary test switch sees no control-plane flap while a remote EDSW agent does a graceful warmboot restart. The test establishes hyper-port NDP first, snapshots sticky control-plane anti-flap signals on the primary EDSW, restarts the remote EDSW, waits for recovery, then checks that aggregate-port flap counters, member link flap counters, and remote hyper-port NDP resolvedSince values did not change.
  This diff also fixes the neighbor sw-agent restart path used by the test. The previous graceful-restart handler synchronously ran systemctl restart from inside the service's own Thrift RPC handler. That created a self-dependency during shutdown: the service waited for outstanding Thrift requests to finish while the restart RPC itself was blocked waiting for systemd to stop the same service. On dsf_hyper_port mono deployments that hit the systemd stop timeout, PID 1 sent SIGABRT, warmboot markers were lost, and the neighbor came back coldboot. The handler now schedules restart asynchronously through a transient systemd unit so the RPC returns before the service stops.
  Differential Revision: D103037112
  fbshipit-source-id: 5e7a7f82310f8ef8d882aad4b7befd87d23f3c84
  github.com-facebook-fboss · b44c80df · 2026-04-30
- 0.5 ETV · Fix wedge_agent crash in createMirrorOnDropReport on VOQ switches
  Summary: wedge_agent enters a crash loop P2266240748 on VOQ switches whose switchId is non-zero (e.g. switchId=28). The crash occurs during initial config apply when createMirrorOnDropReport() calls utility::getMacForFirstInterfaceWithPorts(), which internally uses FLAGS_switch_id_for_testing (defaults to 0) to look up interfaces. On switches with non-zero switchId, no interfaces exist under switchId=0, causing a FbossError that propagates to XLOG(FATAL) in SwSwitchInitializer::initThread().
  The fix has two parts:
  1. In createMirrorOnDropReport(), pass getAnyVoqSwitchId() explicitly to getMacForFirstInterfaceWithPorts() so it looks up interfaces under the correct switchId.
  2. In getMacForFirstInterfaceWithPorts() and firstInterfaceIDWithPorts(), actually use the passed-in switchId parameter. Previously these functions ignored it (parameter was commented out as /*switchId*/) and always fell back to FLAGS_switch_id_for_testing.
  Reviewed By: shri-khare, nivinl, simuthus-fb
  Differential Revision: D100038728
  fbshipit-source-id: 349bb88570bfa88d52b9b920c426941fd4906785
  github.com-facebook-fboss · 4a90cff2 · 2026-04-09
- 0.3 ETV · Wire up Jericho4 support in existing code
  Summary: As titled. Code imported from the Arista team.
  Reviewed By: simuthus-fb, nivinl
  Differential Revision: D94533964
  fbshipit-source-id: f10dbc4237cb6e429e352482b56a87b8bcbfb4c2
  github.com-facebook-fboss · 5d26f186 · 2026-03-01
- 0.3 ETV · explicitly dropped udp rx packets due to size too small rather than throw exceptions twice
  Summary: As titled. This is one of the minor improvements suggested by AI while working on SEV follow-up task T258450131 (packet RX logic cleanup).
  Add a non-throwing tryParse() method to UDPHeader that checks buffer length before parsing, returning false instead of throwing FbossError on truncated packets. This avoids expensive exception unwinding (~3-12us per throw) in the packet RX hot path.
  Differential Revision: D96000021
  fbshipit-source-id: 843b925536088943e99e5b4f122174dba8df04f7
  github.com-facebook-fboss · 72893b9a · 2026-03-14
- 0.3 ETV · add minimal platform descriptor registry
  Summary: Add `PlatformDescriptorRegistry`, a class that loads `PlatformDescriptor` JSON configs from a directory tree organized as `<root>/<system_vendor>/<platform_name>/`. The registry is a cached singleton: all vendor directories are scanned once on first access via `get()`, and subsequent calls return the same instance. Supports platform type lookup by product name prefix or mode name, and lazy loading of sibling `platform_mapping.json` files.
  Gated behind `--platform_descriptor_config_path`: when the path is non-empty, descriptors are loaded; when empty (default), no descriptors are loaded and the legacy code path runs. See the detailed design proposal at https://docs.google.com/document/d/1JdZYgp1C1imDh9I0DBMo_LDCOx5F51iemJyAl1JknFo/edit?usp=sharing
  Reviewed By: srikrishnagopu
  Differential Revision: D104303533
  fbshipit-source-id: c71dcf2263c5b0ff14c6af996a113de469462c73
  github.com-facebook-fboss · a2806ce7 · 2026-05-21
- 0.3 ETV · Log port name and counter group on stats collection failure
  Summary: When SAI port stats collection fails, the error message only shows the opaque PortSaiId (e.g., "Failed to get stats PortSaiId(3302829850625)") with no indication of which port or which counter group failed. This makes debugging stats failures difficult.
  Add a collectStats helper lambda in SaiPortManager::updateStats() that wraps each handle->port->updateStats() call. On SaiApiError, it logs the port name and counter group (e.g., "Failed to get port counters for port rcy1/1/441 (portId: 1)") and swallows the exception so stats collection continues for remaining counter groups and subsequent ports.
  Counter groups labeled: basic port counters, fast LLFC trigger status, MAC TX data queue watermarks, PFC duration stats, FEC correctable/uncorrectable frames, FEC corrected bits, FEC codeword errors.
  Reviewed By: simuthus-fb
  Differential Revision: D101035324
  fbshipit-source-id: 1a0869b600ae2d4a30a91f8a6d9e349ac83624e3
  github.com-facebook-fboss · 4f3cc60b · 2026-04-16
- 0.3 ETV · Skip PFC attributes for hyper port member
  Summary: Per the hyperport SAI attributes proposal, SAI_PORT_ATTR_PRIORITY_FLOW_CONTROL_MODE, SAI_PORT_ATTR_PRIORITY_FLOW_CONTROL, SAI_PORT_ATTR_PRIORITY_FLOW_CONTROL_RX, and SAI_PORT_ATTR_PRIORITY_FLOW_CONTROL_TX should only be set on HyperPort (HP=Yes) and not on HyperPort Member (Member=No). The PFC configuration for hyper port members should be determined by the hyper port.
  This change skips PFC attributes during port creation (attributesFromSwPort) and also guards the runtime PFC programming path (programPfc) to return early for hyper port member ports.
  Reviewed By: simuthus-fb
  Differential Revision: D98312146
  fbshipit-source-id: 5753fd8579f68e0cacfc1d6c5a94124e7c829225
  github.com-facebook-fboss · db6d67e6 · 2026-03-26
- 0.2 ETV · Scope mirror to egress port's switch on multi-NPU
  Summary: The current Mirror feature implementation appears to always assume a single NPU. This assumption no longer holds on multi-NPU platforms like Janga. There are two ways to fully support mirroring on multi-NPU:
  1. ERSPAN/sFlow via fabric (industry approach): Use a single mirror with one egress port. The encapsulated mirror packet routes through the fabric to reach the egress port from any NPU. Requires the SAI mirror to use a system port (not a physical port) so all NPUs can resolve it.
  2. Per-switch egress port (D102353102): Each NPU programs its own mirror with a local egress port via switchIdToDestination config, but this requires an enhancement to the existing fboss agent thrift config. <<<< this is the approach fboss will take
  shri-khare: what is the desired approach fboss will take? From reading your D95725555, I think we prefer approach 2? If so, this diff can resolve the failing AgentEgressPortErspanMirroringTest test given the current mirror config. Additional changes like D102353102 might be needed later if, for example, we want to explicitly program a different egress port or tunnel on the second NPU.
  On multi-NPU platforms like Janga800bic, updateMirrors() previously returned a flat MirrorMap that was converted to MultiSwitchMirrorMap via toMultiSwitchMap(). This cloned each mirror for ALL switches. When a mirror has a physical egress port (e.g., port 32780 on NPU 1), the clone sent to NPU 0's hw_agent crashes because SaiMirrorManager::getMonitorPort() can't find port 32780 in NPU 0's port table.
  Fix: updateMirrors() now builds the MultiSwitchMirrorMap directly, scoping each mirror to the switch that owns its egress port. Each NPU gets its own inner MirrorMap, so mirrors with the same name can coexist across NPUs without key collision. Mirrors without an egress port (tunnel-only ERSPAN/SFLOW) are still replicated to all L3 switches. The call site no longer needs the toMultiSwitchMap conversion.
  Reviewed By: nivinl
  Differential Revision: D102254668
  fbshipit-source-id: 734d6122466d9d6e574c4a33c35695ba41443de5
  github.com-facebook-fboss · e88d2e0e · 2026-05-08
- 0.2 ETV · support burst and pacing in FBOSS sendPkt thrift API via numOfPkts/intervalInMs
  Summary: Enable controlled packet bursts by adding repeat count and inter-packet interval to packet Thrift APIs, defaulting to one packet and a 10ms gap. This lets engineers generate predictable traffic and timing without client-side loops, improving testability and reproducibility.
  Context of why this change: https://fb.workplace.com/groups/907812870130827/permalink/1950611849184252/
  --- AI generated Summary & Test Plan from DEV68678869
  Reviewed By: msomasundaran
  Differential Revision: D93528525
  fbshipit-source-id: c53d3846c6e4f6279ad177c8b7eb7f182b85a74f
  github.com-facebook-fboss · 875159d9 · 2026-03-17
- 0.2 ETV · Add DSF hyper port traffic distribution test
  Summary: Add verifyHyperPortTrafficDistribution test case to AgentMultiNodeVoqSwitchHyperPortTests.cpp. This test verifies traffic distribution across and within hyper ports:
  1. Resolves NDP neighbors on both EDSWs via ping (same approach as verifyNdpBehindHyperPorts in D100818567).
  2. Creates ECMP routes for a traffic loop: edsw1_1 routes to edsw1_2's hyper port IPs as nexthops and vice versa.
  3. Injects 1000 packets from the test driver to create a self-sustaining traffic loop.
  4. Verifies:
     - Inter-hyper-port spray: traffic is evenly distributed across four hyper ports (max 25% deviation)
     - Intra-hyper-port spray: within each hyper port, traffic is evenly distributed across four member ports (max 25% deviation)
  5. Explicitly logs out_bytes for every member port and hyper port, plus deviation values, for debugging.
  Also adds helper methods:
  - computeNeighborsForEdswHyperPorts: computes neighbor info per hyper port using getHyperPortInterfaceIPs() and getInterfaceMac() (avoids getMacForFirstInterfaceWithPorts which fails for EDSWs due to FLAGS_switch_id_for_testing defaulting to 0).
  - injectTraffic: sends UDP packets with varying source ports for hash distribution.
  - verifyHyperPortTrafficSpray: checks port counter distribution with detailed per-port logging.
  Test is gated by FLAGS_hyper_port and skips when the flag is not set.
  ___ overriding_review_checks_triggers_an_audit_and_retroactive_review
  Oncall Short Name: fboss_agent_dsf
  Differential Revision: D100818566
  fbshipit-source-id: 84b9718f6d4070e9348fe1621e8c45cc93de0de2
  github.com-facebook-fboss · 15011a4c · 2026-04-15
- 0.2 ETV · Skip global flow control mode for hyper port member
  Summary: Per the hyperport SAI attributes proposal, SAI_PORT_ATTR_GLOBAL_FLOW_CONTROL_MODE should only be set on HyperPort (HP=Yes) and not on HyperPort Member (Member=No). The flow control mode for hyper port members should be determined by the hyper port. Previously, globalFlowControlMode was set for all port types without any port-type filtering.
  Reviewed By: nivinl
  Differential Revision: D98312054
  fbshipit-source-id: f9508d2c4ddaa694c13ffb266681995380aebb33
  github.com-facebook-fboss · 7b4cca5b · 2026-03-27
- 0.2 ETV · Propagate fabric connectivity thrift failures
  Summary: ThriftHandler::getFabricConnectivity treated a missing HwSwitch fabric connectivity response as a fatal invariant, but HwSwitchThriftClientTable returns nullopt for recoverable thrift failures such as rate limiting or connection errors. Replace the CHECK with FbossError propagation so callers receive a thrift error without restarting wedge_agent, and add a typed regression test for the HwSwitch failure path.
  Reviewed By: jasmeetbagga
  Differential Revision: D104914989
  fbshipit-source-id: 50b6e4d957c3bf1a75add11e491e9bdb45eadabe
  github.com-facebook-fboss · f52617fe · 2026-05-13
- 0.2 ETV · Fix clear interface counters not resetting Input Discards
  Summary: `fboss2 clear interface counters` fails to clear "Input Discards" shown by `fboss2 show interface errors`.
  The root cause is that `inDiscards_` is a software-accumulated counter (using `+=` with `subtractIncrements`) that persists across stats collection cycles. The `clearStats()` path only resets the SAI/BCM hardware counters but never zeroes the accumulated `inDiscards_` value in the software stats struct. After clearing, the accumulator stops growing (because `subtractIncrements` returns 0 on counter rollover) but retains its previously accumulated value.
  This fix resets `inDiscards_` to 0 in the software stats when clearing port stats, following the existing `clearInterfacePhyCounters` pattern which already correctly handles this for FEC counters.
  SAI: In `SaiPortManager::clearStats()`, after clearing HW counters, reset `inDiscards_` in `portStats_` and the fb303 monotonic counter.
  Differential Revision: D95144589
  fbshipit-source-id: 19e48e8388c0c9bdd9ef12dc085c2dbccaa7dd57
  github.com-facebook-fboss · 00f98118 · 2026-03-05
- 0.2 ETV · fix stale interface routes after edsw hyper port migration
  Summary: Fix a bug in processRemoteInterfaceRoutes (VoqUtils.cpp) where the cancel-out logic matched route add/delete operations solely by CIDR prefix without checking the interface ID. When a route prefix moved between interfaces during a DSF state update (e.g. during EDSW hyper port migration), operations for different interfaces would incorrectly cancel each other, leaving stale or missing routes in the FIB.
  Two fixes applied:
  - add=true path: Always add to toAdd and cancel any pending delete for the same prefix. Previously, when a pending delete was cancelled, the route was not added to toAdd, so addOrReplaceRouteImpl never re-pointed the route to the new interface.
  - add=false path: Only cancel a pending add if it belongs to the same interface. If the pending add is for a different interface, skip the delete; the add will replace the route via addOrReplaceRouteImpl.
  Also adds DBG3 logging to processRemoteInterfaceRoutes and DsfStateUpdaterUtil::updateNeighborEntry to show prefix, interface ID/name, and decision reasoning for future debugging.
  Reviewed By: shri-khare, jasmeetbagga
  Differential Revision: D94960177
  fbshipit-source-id: f6e7336f8ea07aff64b93df98a26c1a8fb0d9343
  github.com-facebook-fboss · 9e1378b8 · 2026-04-02
- 0.2 ETV · Fix DSF node naming for multi-NPU test configs
  Summary: In production, all NPUs on the same device share the same DSF node name. The test config was generating unique names per NPU (e.g. "hwTestSwitch0", "hwTestSwitch2"), which caused FabricConnectivityManager to compute wrong expectedSwitchId values for the second NPU.
  The formula `baseSwitchId + virtualDeviceId` assumes baseSwitchId is the device-level base, but unique names made baseSwitchId equal to the NPU's own switchId. For example, on meru800bfa with switch_id_for_testing=2, port 2540 got expectedSwitchId = 2 + 3 = 5, but actual was 3. With shared names, baseSwitchId = 0 and expectedSwitchId = 0 + 3 = 3 (correct).
  Differential Revision: D100239160
  fbshipit-source-id: 09a38a6bcd02bcdc819f1bd3a6a9d031bc6eb2d9
  github.com-facebook-fboss · 58775d58 · 2026-04-10
- 0.2 ETV · Back out "Use SAI_STATS_MODE_READ_AND_CLEAR for fabric control packet stats"
  Summary: In agent conveyor, https://www.internalfb.com/conveyor/fboss/wedge_agent/releases
  R3: AgentFabricSwitchTest.reachDiscard
  J3: 70 test cases failed
  Claude analysis shows D102175308 might be the reason, so back out D102175308 for now to unblock.
  Original commit changeset: 8563e42a8fe2
  Original Phabricator Diff: D102175308
  Reviewed By: simuthus-fb, Tianyu-Meta
  Differential Revision: D102373383
  fbshipit-source-id: efdfae8e523da729ab04a30e254598cab45c1184
  github.com-facebook-fboss · 61b3fa67 · 2026-04-24
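The arithmetic in the DSF node naming fix listed above (D100239160) can be checked with a short sketch. It assumes the base switch ID is the lowest switchId registered under a DSF node name; that lookup rule, the function name, and the shared name "hwTestSwitch" are hypothetical, and only the formula `baseSwitchId + virtualDeviceId` and the numbers come from the commit message.

```python
def expected_switch_id(name_to_switch_ids, node_name, virtual_device_id):
    """Apply baseSwitchId + virtualDeviceId for one DSF node name."""
    # Assumption: the base is the lowest switchId under this node name.
    base_switch_id = min(name_to_switch_ids[node_name])
    return base_switch_id + virtual_device_id


# Unique name per NPU: the base collapses to the NPU's own switchId.
unique_names = {"hwTestSwitch0": [0], "hwTestSwitch2": [2]}
wrong = expected_switch_id(unique_names, "hwTestSwitch2", 3)

# One shared name per device: the base is the device-level base.
shared_names = {"hwTestSwitch": [0, 2]}
right = expected_switch_id(shared_names, "hwTestSwitch", 3)

print(wrong, right)  # prints "5 3"
```

With unique names the result is 5 where the actual switch ID was 3; with a shared name the result matches.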