Description
Name of failing test
test_eagle_correctness[TRITON_ATTN-qwen3_eagle3]
Basic information
- Flaky test
- Can reproduce locally
- Caused by external libraries (e.g. bug in transformers)
🧪 Describe the failing test
Summary
EAGLE speculative decoding fails on AMD MI325X (gfx942) GPUs with an HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION error. The error occurs as soon as prompt processing starts, after CUDA graph capture has completed successfully.
It can be reproduced with speculative decoding + EAGLE + the Triton attention backend (the default on AMD), for example:
python3 offline_inference/spec_decode.py --test --method eagle --num_spec_tokens 3 --dataset-name hf --dataset-path philschmid/mt-bench --num-prompts 80 --temp 0 --top-p 1.0 --top-k -1 --tp 1 --enable-chunked-prefill --max-model-len 2048
Error Details
:0:rocdevice.cpp:3675: Callback: Queue 0x7f4db0300000 aborting with error:
HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION: The agent attempted to access memory beyond the largest legal address. code: 0x29
(Using a different backend may work, e.g. ROCM_AITER_FA or ROCM_AITER_UNIFIED_ATTN, but TritonAttentionBackend is the default attention backend for AMD: gist:fb8bbb2cbde391905d86908ca4a46c02)
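To compare backends on the same workload, the reproduction command can be run once per backend. The following is a minimal sketch, assuming the installed vLLM selects the backend from the VLLM_ATTENTION_BACKEND environment variable (this report does not state how the backend was switched) and that the backend names are spelled as they appear in this report. build_run is a hypothetical helper that only prepares the command and environment; it does not launch anything.

```python
import os
import shlex

# Reproduction command from this report, unchanged.
REPRO_CMD = (
    "python3 offline_inference/spec_decode.py --test --method eagle "
    "--num_spec_tokens 3 --dataset-name hf --dataset-path philschmid/mt-bench "
    "--num-prompts 80 --temp 0 --top-p 1.0 --top-k -1 --tp 1 "
    "--enable-chunked-prefill --max-model-len 2048"
)

# Backend names as they appear in this report (test ID and error notes).
BACKENDS = ["TRITON_ATTN", "ROCM_AITER_FA", "ROCM_AITER_UNIFIED_ATTN"]


def build_run(backend, base_env=None):
    """Return (argv, env) for one reproduction run with the given backend.

    Hypothetical helper: nothing is executed here.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: vLLM reads the backend choice from this variable.
    env["VLLM_ATTENTION_BACKEND"] = backend
    return shlex.split(REPRO_CMD), env


if __name__ == "__main__":
    for backend in BACKENDS:
        argv, env = build_run(backend)
        # Print the equivalent shell invocation for each backend.
        print(f"VLLM_ATTENTION_BACKEND={env['VLLM_ATTENTION_BACKEND']} {shlex.join(argv)}")
```

Each (argv, env) pair can then be passed to subprocess.run(argv, env=env) from the examples directory to check which backends hit the aperture violation.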
What Works
- Model initialization
- Weight loading (target model + EAGLE draft model)
- CUDA graph capturing (both PIECEWISE and FULL modes)
- KV cache allocation
What Fails
- Prompt processing: fails immediately when execute_model is called, before even the first token is processed
Error Call Stack
multiproc_executor.py:694 worker_busy_loop
→ worker_base.py:353 execute_model
→ gpu_worker.py:491 execute_model
→ gpu_model_runner.py:2512 execute_model
→ gpu_model_runner.py:2404 _model_forward
→ self.model() [MEMORY VIOLATION]
Affecting tests: V1 Test e2e + engine, Example Test
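The call stack stops at self.model(), so it does not identify which kernel performed the out-of-range access. One way to narrow this down is to rerun the reproduction with synchronous kernel execution. This is a sketch only, assuming the HIP runtime debugging variables HIP_LAUNCH_BLOCKING and AMD_SERIALIZE_KERNEL are honored by the installed ROCm version; neither is mentioned in this report, with_sync_debug is a hypothetical helper, and the result may be less informative for kernels replayed from a captured graph.

```python
import os

# Assumption: standard HIP runtime debugging variables; verify against the
# ROCm documentation for the installed version before relying on them.
DEBUG_ENV = {
    "HIP_LAUNCH_BLOCKING": "1",   # make kernel launches synchronous
    "AMD_SERIALIZE_KERNEL": "3",  # wait before and after each kernel enqueue
}


def with_sync_debug(base_env=None):
    """Return a copy of the environment with the debug variables added.

    Hypothetical helper: it does not modify os.environ or launch anything.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update(DEBUG_ENV)
    return env


if __name__ == "__main__":
    env = with_sync_debug()
    for name in DEBUG_ENV:
        print(f"{name}={env[name]}")
```

The returned dictionary can be passed as env= to subprocess.run together with the reproduction command from the summary.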
📝 History of failing test
https://buildkite.com/vllm/ci/builds/36286#019a1ead-5029-45e3-b576-d1ac8cd5ac43
https://buildkite.com/vllm/amd-ci/builds/632#019a27b6-9f6a-4b99-842d-55eef24ea7cd
CC List.
@mxz297 @yeqcharlotte @Alexei-V-Ivanov-AMD @luccafong @njhill @LucasWilkinson