fix(gpt-oss): emit fused raw expert tensors for SGLang #2004

Open
Jiang020609 wants to merge 1 commit into THUDM:main from Jiang020609:fix/gpt-oss-raw-expert-conversion

@Jiang020609
Contributor

Summary

Fixes #1840.

This fixes GPT-OSS raw Megatron-to-HF conversion for the non-colocate SGLang weight update path.

Previously, GPT-OSS expert tensors were emitted as per-expert gate_proj / up_proj / down_proj names. SGLang expects fused expert tensors for GPT-OSS, so the raw converter produced tensors that could not be loaded correctly by SGLang's fused MoE weight loader.

Changes

  • Convert linear_fc1.weight into interleaved fused gate_up_proj.
  • Transpose linear_fc2.weight before emitting down_proj.
  • Fuse per-expert tensors into 3D tensors shaped [num_experts, ...] before returning them to the weight update path.
  • Apply the same fused naming/layout for gate_up_proj_bias and down_proj_bias.
  • Add CPU unit tests for GPT-OSS raw expert weight and bias conversion.
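The layout changes listed above can be sketched roughly as follows. This is a minimal NumPy illustration, not the code in this PR: the helper names (`interleave_halves`, `fuse_gate_up`, `fuse_down`) are hypothetical, and it assumes each expert's linear_fc1.weight is shaped [2 * intermediate, hidden] with the gate rows first and the up rows second, and each linear_fc2.weight is shaped [hidden, intermediate].

```python
import numpy as np


def interleave_halves(x):
    """Interleave the two halves of x along axis 0: [g0, g1, u0, u1] -> [g0, u0, g1, u1]."""
    first, second = np.split(x, 2, axis=0)
    out = np.empty_like(x)
    out[0::2] = first   # assumed gate half lands on even positions
    out[1::2] = second  # assumed up half lands on odd positions
    return out


def fuse_gate_up(fc1_per_expert):
    """Per-expert [2*inter, hidden] weights -> fused [num_experts, hidden, 2*inter]."""
    return np.stack([interleave_halves(w).T for w in fc1_per_expert])


def fuse_down(fc2_per_expert):
    """Per-expert [hidden, inter] weights -> fused [num_experts, inter, hidden]."""
    return np.stack([w.T for w in fc2_per_expert])


def fuse_gate_up_bias(fc1_bias_per_expert):
    """Per-expert [2*inter] biases -> fused, interleaved [num_experts, 2*inter]."""
    return np.stack([interleave_halves(b) for b in fc1_bias_per_expert])


# Tiny example: 2 experts, intermediate size 3, hidden size 4.
num_experts, inter, hidden = 2, 3, 4
rng = np.random.default_rng(0)
fc1 = [rng.standard_normal((2 * inter, hidden)) for _ in range(num_experts)]
fc2 = [rng.standard_normal((hidden, inter)) for _ in range(num_experts)]
fc1_bias = [rng.standard_normal(2 * inter) for _ in range(num_experts)]

gate_up_proj = fuse_gate_up(fc1)
down_proj = fuse_down(fc2)
gate_up_proj_bias = fuse_gate_up_bias(fc1_bias)
```

With this layout, the gate and up projections of expert `e` can be read back as `gate_up_proj[e][:, 0::2]` and `gate_up_proj[e][:, 1::2]`, which is a convenient property to assert in a CPU unit test.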

Validation

  • python -m ruff check slime/backends/megatron_utils/megatron_to_hf/gpt_oss.py tests/test_gpt_oss_raw_converter.py
  • python -m pytest tests/test_gpt_oss_raw_converter.py

GPU smoke test on 1x NVIDIA A800-SXM4-80GB with lmsys/gpt-oss-20b-bf16:

  • SGLang successfully loaded the GPT-OSS checkpoint.
  • Prepared fused update tensors:
    • model.layers.0.mlp.experts.gate_up_proj: (32, 2880, 5760)
    • model.layers.0.mlp.experts.down_proj: (32, 2880, 2880)
    • model.layers.0.mlp.experts.gate_up_proj_bias: (32, 5760)
    • model.layers.0.mlp.experts.down_proj_bias: (32, 2880)
  • engine.update_weights_from_tensor(...) returned (True, 'Success').
  • Deterministic generation before and after the no-op raw expert update matched.
  • Smoke result: GPT-OSS raw SGLang smoke passed.
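The fused shapes reported above can be cross-checked with a small helper like the one below. It is only a sketch: `expected_fused_shapes` and `find_shape_mismatches` are hypothetical names, and the values num_experts=32, hidden=2880, intermediate=2880 are inferred from the shapes listed in this smoke run rather than read from the model config.

```python
def expected_fused_shapes(num_experts, hidden, intermediate, layer=0):
    """Expected fused expert tensor shapes for one layer (assumed layout)."""
    prefix = f"model.layers.{layer}.mlp.experts"
    return {
        f"{prefix}.gate_up_proj": (num_experts, hidden, 2 * intermediate),
        f"{prefix}.down_proj": (num_experts, intermediate, hidden),
        f"{prefix}.gate_up_proj_bias": (num_experts, 2 * intermediate),
        f"{prefix}.down_proj_bias": (num_experts, hidden),
    }


def find_shape_mismatches(prepared, expected):
    """Return human-readable messages for missing or wrongly shaped tensors."""
    problems = []
    for name, shape in expected.items():
        if name not in prepared:
            problems.append(f"missing tensor: {name}")
        elif tuple(prepared[name]) != shape:
            problems.append(f"{name}: got {tuple(prepared[name])}, expected {shape}")
    return problems


# Shapes as reported in the smoke run for layer 0.
reported = {
    "model.layers.0.mlp.experts.gate_up_proj": (32, 2880, 5760),
    "model.layers.0.mlp.experts.down_proj": (32, 2880, 2880),
    "model.layers.0.mlp.experts.gate_up_proj_bias": (32, 5760),
    "model.layers.0.mlp.experts.down_proj_bias": (32, 2880),
}
mismatches = find_shape_mismatches(reported, expected_fused_shapes(32, 2880, 2880))
```

Because hidden and intermediate sizes coincide in this run, a shape check alone cannot catch a missing transpose on down_proj; the before/after generation comparison covers that case.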

Development

Successfully merging this pull request may close these issues.

[Bug] GPT-OSS raw converter emits incorrect expert weight format, causing gibberish output with --megatron-to-hf-mode raw
