MoE router mismatch dashboard

70 headline runs (4 engine pairs); 0 additional runs in the full-engine appendix. Devices observed: L40S (g6e.12xlarge). Each cell measures how often the MoE router's top-k experts disagree between two checkpoints / engines.

Research question

Does FP8 quantization preserve the routing decisions of a Mixture-of-Experts model? If gradients flow through the router gate, set-level routing changes show up as different parameter updates even when top-1 is stable.

What we observed

Headline: FP8 quantization shifts the dominant routed expert only modestly (mean top-1 flip rate 8.3%, about 1 in 12 (token, layer) pairs, versus 5.7–6.0% for the BF16 rollouts) but changes the membership of the top-k set almost everywhere (100.0% of tokens have set-level disagreement on at least one MoE layer, versus 99.7% for the BF16 rollouts). The top-1 looks comparatively robust; the top-k set is not stable for any engine pair. The full-engine data is in the appendix below.

Next: Multi-prompt run (this is n=1 currently); per-layer stratified analysis to see if early/late layers differ.

Headline runs: 70
Headline engine pairs: 4
Worst top-1 flip rate (pair mean): 8.3%
Worst top-k set disagreement: 100.0%

Top-1 flip-rate traffic-light by engine pair

Green: ≤ 5% (top-1 router stable). Amber: 5–15%. Red: > 15% (the dominant expert often changes between engines).

hermes-qwen3-30b-a3b-bf16 → fsdp-bf16-moe: 6.0% mean top-1 flip rate over 19 runs
hermes-qwen3-30b-a3b-bf16 → megatron-bf16-moe: 5.7% mean top-1 flip rate over 19 runs
hermes-qwen3-30b-a3b-fp8 → fsdp-bf16-moe: 8.3% mean top-1 flip rate over 16 runs
hermes-qwen3-30b-a3b-fp8 → megatron-bf16-moe: 8.3% mean top-1 flip rate over 16 runs
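
The banding rule can be written down in a few lines. This is a minimal sketch of the thresholds stated above, not the dashboard's own code; `traffic_light` is a hypothetical helper, and treating exactly 5% as green and exactly 15% as amber is read off the "≤ 5%" and "> 15%" wording.

```python
def traffic_light(flip_rate: float) -> str:
    """Map a top-1 flip rate (a fraction in [0, 1]) to a dashboard band."""
    if flip_rate <= 0.05:   # green: top-1 router stable
        return "green"
    if flip_rate <= 0.15:   # amber: 5-15%
        return "amber"
    return "red"            # red: dominant expert often changes


# Pair means taken from the raw-numbers table.
pair_means = {
    "bf16 -> fsdp-bf16-moe": 0.0601,
    "bf16 -> megatron-bf16-moe": 0.0572,
    "fp8 -> fsdp-bf16-moe": 0.0829,
    "fp8 -> megatron-bf16-moe": 0.0834,
}
bands = {pair: traffic_light(rate) for pair, rate in pair_means.items()}
for pair, band in bands.items():
    print(f"{pair}: {band}")
```

With these thresholds all four headline pairs fall in the amber band.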

Top-1 flip vs top-k set disagreement

Left: how often the dominant routed expert changes between rollout and trainer. Right: how often the top-k expert set changes on at least one layer. The gap between the two is the headline finding: quantization noise shifts the top-1 only modestly but changes the membership of the top-k set on nearly every token.
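
One caveat when reading the gap: the two rates are counted over different units. The top-1 flip rate is per (token, layer) pair, while set disagreement counts a token as soon as any of its 48 layers differs, so it saturates quickly. The sketch below is a back-of-the-envelope illustration that assumes, purely for the sake of argument, that layers disagree independently and with the same probability; `any_layer_rate` is a hypothetical helper, not part of the dashboard.

```python
def any_layer_rate(p_layer: float, n_layers: int = 48) -> float:
    """Chance that at least one of n_layers disagrees, assuming independent layers."""
    return 1.0 - (1.0 - p_layer) ** n_layers


for p in (0.01, 0.05, 0.10):
    print(f"per-layer {p:.0%} -> per-token {any_layer_rate(p):.1%}")
```

Under that assumption, a per-layer set-disagreement rate of 10% already puts more than 99% of tokens in the "disagrees somewhere" bucket, so a per-layer breakdown is needed before the two metrics can be compared directly.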

Raw numbers: tables for the headline charts

Per (rollout_engine -> trainer_engine) pair

| rollout | trainer | count | mean flip rate | mean token disagreement | worst layer flip | layers | tokens |
|---|---|---|---|---|---|---|---|
| hermes-qwen3-30b-a3b-bf16 | fsdp-bf16-moe | 19 | 0.0601 | 0.9976 | 0.3333 | 48 | 703 |
| hermes-qwen3-30b-a3b-bf16 | megatron-bf16-moe | 19 | 0.0572 | 0.9972 | 0.3750 | 48 | 703 |
| hermes-qwen3-30b-a3b-fp8 | fsdp-bf16-moe | 16 | 0.0829 | 1.0000 | 0.4167 | 48 | 685 |
| hermes-qwen3-30b-a3b-fp8 | megatron-bf16-moe | 16 | 0.0834 | 1.0000 | 0.4167 | 48 | 685 |
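
The pair-level columns appear to be simple roll-ups of the per-run rows: the figures are consistent with an unweighted mean of the per-run flip rates, the maximum of the per-run layer max, and a sum of tokens. A minimal sketch of that aggregation follows, using three rows copied from the per-run table as a sample. The tuple layout and the `aggregate` helper are illustrative assumptions, not the dashboard's actual code.

```python
from collections import defaultdict
from statistics import mean

# (rollout, trainer, tokens, flip_rate, token_disagreement, layer_max)
# Sample only: three of the 70 per-run rows.
runs = [
    ("hermes-qwen3-30b-a3b-fp8", "fsdp-bf16-moe", 24, 0.0642, 1.0, 0.2083),
    ("hermes-qwen3-30b-a3b-fp8", "fsdp-bf16-moe", 54, 0.0745, 1.0, 0.2037),
    ("hermes-qwen3-30b-a3b-bf16", "fsdp-bf16-moe", 24, 0.0556, 1.0, 0.2500),
]


def aggregate(rows):
    """Group per-run rows by (rollout, trainer) and roll up each metric."""
    groups = defaultdict(list)
    for rollout, trainer, *metrics in rows:
        groups[(rollout, trainer)].append(metrics)

    summary = {}
    for pair, members in groups.items():
        tokens, flips, disagreements, layer_maxes = zip(*members)
        summary[pair] = {
            "count": len(members),
            "mean_flip_rate": mean(flips),            # unweighted across runs
            "mean_token_disagreement": mean(disagreements),
            "worst_layer_flip": max(layer_maxes),
            "tokens": sum(tokens),
        }
    return summary


summary = aggregate(runs)
for pair, stats in summary.items():
    print(pair, stats)
```

Because the mean is unweighted, a 6-token run counts as much as a 109-token run; a token-weighted mean would be a different (and arguably more stable) statistic.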

Per run

| run_id | model | engines | device | tokens | layers | top_k | experts | flip rate | token disagreement | layer min | layer mean | layer max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-code-fix-factorial-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 24 | 48 | 8 | 128 | 0.0556 | 1.0000 | 0.0000 | 0.0556 | 0.2500 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-code-fix-factorial-turn1 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 54 | 48 | 8 | 128 | 0.0556 | 1.0000 | 0.0000 | 0.0556 | 0.1296 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-code-fix-factorial-turn2 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 37 | 48 | 8 | 128 | 0.0495 | 1.0000 | 0.0000 | 0.0495 | 0.1351 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-cwc-repo-quality-loop-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 78 | 48 | 8 | 128 | 0.0831 | 1.0000 | 0.0000 | 0.0831 | 0.2051 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-cwc-repo-quality-loop-turn1 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 68 | 48 | 8 | 128 | 0.0423 | 1.0000 | 0.0000 | 0.0423 | 0.1176 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-cwc-repo-which-hook-commits-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 39 | 48 | 8 | 128 | 0.0700 | 1.0000 | 0.0000 | 0.0700 | 0.2308 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-cwc-repo-which-hook-commits-turn1 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 45 | 48 | 8 | 128 | 0.0963 | 1.0000 | 0.0000 | 0.0963 | 0.2667 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-cwc-repo-which-hook-commits-turn2 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 33 | 48 | 8 | 128 | 0.0499 | 1.0000 | 0.0000 | 0.0499 | 0.1818 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-math-tokens-per-gpu-day-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 35 | 48 | 8 | 128 | 0.0577 | 0.9714 | 0.0000 | 0.0577 | 0.2286 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-math-tokens-per-gpu-day-turn1 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 8 | 48 | 8 | 128 | 0.0625 | 1.0000 | 0.0000 | 0.0625 | 0.2500 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-mixed-research-and-math-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 44 | 48 | 8 | 128 | 0.0616 | 1.0000 | 0.0000 | 0.0616 | 0.2500 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-mixed-research-and-math-turn1 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 35 | 48 | 8 | 128 | 0.0500 | 1.0000 | 0.0000 | 0.0500 | 0.2000 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-mixed-research-and-math-turn2 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 36 | 48 | 8 | 128 | 0.0712 | 1.0000 | 0.0000 | 0.0712 | 0.2222 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-mixed-research-and-math-turn3 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 30 | 48 | 8 | 128 | 0.0951 | 1.0000 | 0.0000 | 0.0951 | 0.2667 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-mixed-research-and-math-turn4 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 6 | 48 | 8 | 128 | 0.0556 | 1.0000 | 0.0000 | 0.0556 | 0.3333 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-no-op-trivia-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 22 | 48 | 8 | 128 | 0.0350 | 1.0000 | 0.0000 | 0.0350 | 0.1364 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-plan-3step-experiment-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 28 | 48 | 8 | 128 | 0.0543 | 1.0000 | 0.0000 | 0.0543 | 0.1786 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-plan-3step-experiment-turn1 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 24 | 48 | 8 | 128 | 0.0660 | 1.0000 | 0.0000 | 0.0660 | 0.2083 |
| hermes-qwen3-30b-a3b-bf16-vs-fsdp-bf16-moe-plan-3step-experiment-turn2 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 57 | 48 | 8 | 128 | 0.0300 | 0.9825 | 0.0000 | 0.0300 | 0.0877 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-code-fix-factorial-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 24 | 48 | 8 | 128 | 0.0495 | 1.0000 | 0.0000 | 0.0495 | 0.2083 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-code-fix-factorial-turn1 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 54 | 48 | 8 | 128 | 0.0494 | 1.0000 | 0.0000 | 0.0494 | 0.1111 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-code-fix-factorial-turn2 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 37 | 48 | 8 | 128 | 0.0535 | 1.0000 | 0.0000 | 0.0535 | 0.1892 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-cwc-repo-quality-loop-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 78 | 48 | 8 | 128 | 0.0756 | 1.0000 | 0.0000 | 0.0756 | 0.1923 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-cwc-repo-quality-loop-turn1 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 68 | 48 | 8 | 128 | 0.0423 | 1.0000 | 0.0000 | 0.0423 | 0.1176 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-cwc-repo-which-hook-commits-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 39 | 48 | 8 | 128 | 0.0652 | 1.0000 | 0.0000 | 0.0652 | 0.2308 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-cwc-repo-which-hook-commits-turn1 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 45 | 48 | 8 | 128 | 0.0912 | 1.0000 | 0.0000 | 0.0912 | 0.2444 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-cwc-repo-which-hook-commits-turn2 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 33 | 48 | 8 | 128 | 0.0543 | 1.0000 | 0.0000 | 0.0543 | 0.1818 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-math-tokens-per-gpu-day-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 35 | 48 | 8 | 128 | 0.0446 | 1.0000 | 0.0000 | 0.0446 | 0.1714 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-math-tokens-per-gpu-day-turn1 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 8 | 48 | 8 | 128 | 0.0625 | 1.0000 | 0.0000 | 0.0625 | 0.3750 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-mixed-research-and-math-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 44 | 48 | 8 | 128 | 0.0549 | 1.0000 | 0.0000 | 0.0549 | 0.2273 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-mixed-research-and-math-turn1 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 35 | 48 | 8 | 128 | 0.0399 | 1.0000 | 0.0000 | 0.0399 | 0.2286 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-mixed-research-and-math-turn2 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 36 | 48 | 8 | 128 | 0.0700 | 1.0000 | 0.0000 | 0.0700 | 0.1944 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-mixed-research-and-math-turn3 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 30 | 48 | 8 | 128 | 0.0806 | 1.0000 | 0.0000 | 0.0806 | 0.2000 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-mixed-research-and-math-turn4 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 6 | 48 | 8 | 128 | 0.0660 | 1.0000 | 0.0000 | 0.0660 | 0.3333 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-no-op-trivia-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 22 | 48 | 8 | 128 | 0.0303 | 1.0000 | 0.0000 | 0.0303 | 0.0909 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-plan-3step-experiment-turn0 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 28 | 48 | 8 | 128 | 0.0513 | 1.0000 | 0.0000 | 0.0513 | 0.1786 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-plan-3step-experiment-turn1 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 24 | 48 | 8 | 128 | 0.0755 | 1.0000 | 0.0000 | 0.0755 | 0.1667 |
| hermes-qwen3-30b-a3b-bf16-vs-megatron-bf16-moe-plan-3step-experiment-turn2 | Qwen/Qwen3-30B-A3B | hermes-qwen3-30b-a3b-bf16 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 57 | 48 | 8 | 128 | 0.0303 | 0.9474 | 0.0000 | 0.0303 | 0.0877 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-code-fix-factorial-turn0 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 24 | 48 | 8 | 128 | 0.0642 | 1.0000 | 0.0000 | 0.0642 | 0.2083 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-code-fix-factorial-turn1 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 54 | 48 | 8 | 128 | 0.0745 | 1.0000 | 0.0185 | 0.0745 | 0.2037 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-code-fix-factorial-turn2 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 37 | 48 | 8 | 128 | 0.0715 | 1.0000 | 0.0000 | 0.0715 | 0.1622 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-cwc-repo-quality-loop-turn0 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 78 | 48 | 8 | 128 | 0.1084 | 1.0000 | 0.0000 | 0.1084 | 0.2436 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-cwc-repo-quality-loop-turn1 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 68 | 48 | 8 | 128 | 0.0640 | 1.0000 | 0.0000 | 0.0640 | 0.1912 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-cwc-repo-which-hook-commits-turn0 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 39 | 48 | 8 | 128 | 0.0999 | 1.0000 | 0.0000 | 0.0999 | 0.2308 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-cwc-repo-which-hook-commits-turn1 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 45 | 48 | 8 | 128 | 0.1088 | 1.0000 | 0.0000 | 0.1088 | 0.3333 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-cwc-repo-which-hook-commits-turn2 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 33 | 48 | 8 | 128 | 0.0657 | 1.0000 | 0.0000 | 0.0657 | 0.2121 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-math-tokens-per-gpu-day-turn0 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 35 | 48 | 8 | 128 | 0.0690 | 1.0000 | 0.0000 | 0.0690 | 0.2000 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-math-tokens-per-gpu-day-turn1 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 8 | 48 | 8 | 128 | 0.1276 | 1.0000 | 0.0000 | 0.1276 | 0.3750 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-mixed-research-and-math-turn0 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 44 | 48 | 8 | 128 | 0.0777 | 1.0000 | 0.0000 | 0.0777 | 0.2045 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-mixed-research-and-math-turn1 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 35 | 48 | 8 | 128 | 0.0643 | 1.0000 | 0.0000 | 0.0643 | 0.2000 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-mixed-research-and-math-turn2 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 36 | 48 | 8 | 128 | 0.0926 | 1.0000 | 0.0000 | 0.0926 | 0.2222 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-mixed-research-and-math-turn3 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 12 | 48 | 8 | 128 | 0.1198 | 1.0000 | 0.0000 | 0.1198 | 0.4167 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-plan-3step-experiment-turn0 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 28 | 48 | 8 | 128 | 0.0670 | 1.0000 | 0.0000 | 0.0670 | 0.2500 |
| hermes-qwen3-30b-a3b-fp8-vs-fsdp-bf16-moe-plan-3step-experiment-turn1 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> fsdp-bf16-moe | L40S (g6e.12xlarge) | 109 | 48 | 8 | 128 | 0.0518 | 1.0000 | 0.0092 | 0.0518 | 0.1376 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-code-fix-factorial-turn0 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 24 | 48 | 8 | 128 | 0.0642 | 1.0000 | 0.0000 | 0.0642 | 0.2083 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-code-fix-factorial-turn1 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 54 | 48 | 8 | 128 | 0.0637 | 1.0000 | 0.0000 | 0.0637 | 0.1296 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-code-fix-factorial-turn2 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 37 | 48 | 8 | 128 | 0.0800 | 1.0000 | 0.0000 | 0.0800 | 0.1622 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-cwc-repo-quality-loop-turn0 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 78 | 48 | 8 | 128 | 0.1098 | 1.0000 | 0.0000 | 0.1098 | 0.2308 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-cwc-repo-quality-loop-turn1 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 68 | 48 | 8 | 128 | 0.0604 | 1.0000 | 0.0000 | 0.0604 | 0.1912 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-cwc-repo-which-hook-commits-turn0 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 39 | 48 | 8 | 128 | 0.1042 | 1.0000 | 0.0000 | 0.1042 | 0.2564 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-cwc-repo-which-hook-commits-turn1 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 45 | 48 | 8 | 128 | 0.1097 | 1.0000 | 0.0000 | 0.1097 | 0.2667 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-cwc-repo-which-hook-commits-turn2 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 33 | 48 | 8 | 128 | 0.0701 | 1.0000 | 0.0000 | 0.0701 | 0.1818 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-math-tokens-per-gpu-day-turn0 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 35 | 48 | 8 | 128 | 0.0649 | 1.0000 | 0.0000 | 0.0649 | 0.2000 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-math-tokens-per-gpu-day-turn1 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 8 | 48 | 8 | 128 | 0.1172 | 1.0000 | 0.0000 | 0.1172 | 0.3750 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-mixed-research-and-math-turn0 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 44 | 48 | 8 | 128 | 0.0795 | 1.0000 | 0.0000 | 0.0795 | 0.2727 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-mixed-research-and-math-turn1 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 35 | 48 | 8 | 128 | 0.0673 | 1.0000 | 0.0000 | 0.0673 | 0.2286 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-mixed-research-and-math-turn2 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 36 | 48 | 8 | 128 | 0.1019 | 1.0000 | 0.0000 | 0.1019 | 0.2500 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-mixed-research-and-math-turn3 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 12 | 48 | 8 | 128 | 0.1285 | 1.0000 | 0.0000 | 0.1285 | 0.4167 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-plan-3step-experiment-turn0 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 28 | 48 | 8 | 128 | 0.0632 | 1.0000 | 0.0000 | 0.0632 | 0.1786 |
| hermes-qwen3-30b-a3b-fp8-vs-megatron-bf16-moe-plan-3step-experiment-turn1 | Qwen/Qwen3-30B-A3B-FP8 | hermes-qwen3-30b-a3b-fp8 -> megatron-bf16-moe | L40S (g6e.12xlarge) | 109 | 48 | 8 | 128 | 0.0495 | 1.0000 | 0.0092 | 0.0495 | 0.1101 |

What these metrics mean

top-1 flip rate

Alias of `router_flip_rate`; see that entry for cap/drop.

per: per-(token, layer).
cap: no cap (observability only).
on cap: no drop (observability only).

top-k set disagreement

Alias of `token_expert_disagreement_rate`; see that entry for cap/drop.

per: per-token.
cap: no cap (observability only).
on cap: no drop (observability only).

router_flip_rate

Fraction of (token, layer) pairs where the MoE top-1 routed expert differs between rollout and trainer.

per: per-(token, layer); dashboards display the mean.
cap: no cap (observability only). OPBC does not act directly on router_flip_rate; the metric informs whether to pin the rollout to the trainer's precision class via `PolicyManifest`.
on cap: no drop (observability only). Decisions to reject a worker on precision mismatch happen at validator time, not at metric time.

Measures top-1 stability under quantization or precision changes. Low values (≤ 5%, the dashboard's green band) suggest the dominant expert is robust. Same as the top-1 flip rate shown on the router dashboard.
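
As an illustration of the definition, here is a minimal pure-Python sketch. It assumes each engine's routing is available as a nested list indexed `[token][layer]`, holding expert ids sorted by gate score with the top-1 first. That layout and the function body are assumptions for illustration, not the dashboard's implementation.

```python
def router_flip_rate(rollout, trainer):
    """Fraction of (token, layer) pairs whose top-1 expert id differs.

    rollout, trainer: [token][layer] -> list of expert ids, top-1 first.
    """
    flips = 0
    total = 0
    for layers_r, layers_t in zip(rollout, trainer):
        for experts_r, experts_t in zip(layers_r, layers_t):
            total += 1
            if experts_r[0] != experts_t[0]:
                flips += 1
    return flips / total if total else 0.0


# Toy data: 2 tokens x 2 layers, top_k = 2.
rollout = [[[3, 7], [1, 4]], [[5, 2], [0, 6]]]
trainer = [[[3, 9], [4, 1]], [[5, 2], [0, 6]]]
rate = router_flip_rate(rollout, trainer)
print(rate)  # 0.25: only token 0, layer 1 flips (1 vs 4)
```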

token_expert_disagreement_rate

Fraction of tokens with at least one MoE layer where the top-k *set* of routed experts differs.

per: per-token (a token counts if any layer disagrees).
cap: no cap (observability only).
on cap: no drop (observability only).

More sensitive than top-1 flip: quantization noise often swaps lower-ranked experts in and out of the top-k even when the dominant one is stable. Visible in the gap between this and the top-1 flip rate.
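
A matching sketch for the set-level metric, under the same assumed layout (`[token][layer]` lists of expert ids). The comparison is on sets, so a pure reordering within the top-k does not count as disagreement. The function body is an illustrative assumption, not the dashboard's code.

```python
def token_expert_disagreement_rate(rollout, trainer):
    """Fraction of tokens whose top-k expert set differs on at least one layer.

    rollout, trainer: [token][layer] -> list of expert ids (order ignored).
    """
    if not rollout:
        return 0.0
    disagreeing = 0
    for layers_r, layers_t in zip(rollout, trainer):
        if any(set(r) != set(t) for r, t in zip(layers_r, layers_t)):
            disagreeing += 1
    return disagreeing / len(rollout)


# Toy data: 2 tokens x 2 layers, top_k = 2.
rollout = [[[3, 7], [1, 4]], [[5, 2], [0, 6]]]
trainer = [[[3, 9], [4, 1]], [[5, 2], [0, 6]]]
rate = token_expert_disagreement_rate(rollout, trainer)
print(rate)  # 0.5: token 0 disagrees on layer 0 ({3, 7} vs {3, 9})
```

Note that token 0, layer 1 holds the same set in a different order ([1, 4] vs [4, 1]): it is a top-1 flip but not a set disagreement, which is why the two metrics have to be read together.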