[Bug] DeepSeek V32 CUDA error: an illegal memory access was encountered #12893

@Johnsonms

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
  • 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose. Otherwise, it will be closed.
  • 5. Please use English, otherwise it will be closed.

Describe the bug

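[Screenshot of the error output; image not preserved. Per the issue title, the failure is "CUDA error: an illegal memory access was encountered".]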

Reproduction

B200
python -m sglang.launch_server \
  --model-path model/DeepSeek-V3.2-Exp-FP4 \
  --served-model-name togethercomputer/DeepSeek-V3.2-Exp-FP4 \
  --tp 8 --dp 8 --enable-dp-attention \
  --reasoning-parser deepseek-v3 \
  --kv-cache-dtype fp8_e4m3 \
  --modelopt-quant nvfp4 \
  --trust-remote-code \
  --disable-radix-cache \
  --speculative-algorithm EAGLE \
  --speculative-num-steps 3 \
  --speculative-eagle-topk 1 \
  --speculative-num-draft-tokens 4 \
  --max-prefill-tokens 8192 \
  --grammar-backend xgrammar \
  --chat-template /sgl-workspace/sglang/examples/chat_template/tool_chat_template_deepseekv32.jinja
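
A possible way to narrow this down (a sketch, not a confirmed fix): CUDA reports kernel errors asynchronously, so the Python stack trace for an illegal memory access often points at a later, unrelated call. Re-running the same launch command with the environment variable CUDA_LAUNCH_BLOCKING=1 forces synchronous kernel launches, which usually makes the trace point at the failing kernel, at the cost of much slower execution. The helper name `build_debug_launch` below is hypothetical and only for illustration; the flags are copied unchanged from the command above, and the script prints the command rather than starting the server.

```python
import os
import shlex

# Launch command copied from the report above (flags unchanged).
LAUNCH_CMD = (
    "python -m sglang.launch_server "
    "--model-path model/DeepSeek-V3.2-Exp-FP4 "
    "--served-model-name togethercomputer/DeepSeek-V3.2-Exp-FP4 "
    "--tp 8 --dp 8 --enable-dp-attention "
    "--reasoning-parser deepseek-v3 "
    "--kv-cache-dtype fp8_e4m3 "
    "--modelopt-quant nvfp4 "
    "--trust-remote-code "
    "--disable-radix-cache "
    "--speculative-algorithm EAGLE "
    "--speculative-num-steps 3 "
    "--speculative-eagle-topk 1 "
    "--speculative-num-draft-tokens 4 "
    "--max-prefill-tokens 8192 "
    "--grammar-backend xgrammar "
    "--chat-template /sgl-workspace/sglang/examples/chat_template/"
    "tool_chat_template_deepseekv32.jinja"
)


def build_debug_launch(cmd, base_env=None):
    """Hypothetical helper: return (argv, env) for a debug re-run.

    The environment is a copy of base_env (default: the current process
    environment) with CUDA_LAUNCH_BLOCKING=1 added, so CUDA kernel launches
    run synchronously and errors surface at the launching call.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return shlex.split(cmd), env


if __name__ == "__main__":
    argv, env = build_debug_launch(LAUNCH_CMD)
    # Print the command instead of running it; paste it into a shell, or pass
    # argv and env to subprocess.run(argv, env=env) on the B200 machine.
    print("CUDA_LAUNCH_BLOCKING=" + env["CUDA_LAUNCH_BLOCKING"], shlex.join(argv))
```

If the synchronous run produces a different stack trace from the original one, attaching that trace as text (rather than a screenshot) would make the failing kernel easier to identify.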

Environment

B200

Labels

bug (Something isn't working)
