[Bug][CI Failure]: EAGLE Spec Decode failing with Triton Attention Backend

### Name of failing test

test_eagle_correctness[TRITON_ATTN-qwen3_eagle3]

### Basic information

- [ ] Flaky test
- [x] Can reproduce locally
- [ ] Caused by external libraries (e.g. bug in `transformers`)

### 🧪 Describe the failing test

## Summary
EAGLE speculative decoding fails on AMD MI325X (gfx942) GPUs with an `HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION` error. The error occurs as soon as prompt processing starts, after CUDA graph capture has completed successfully.
It can be reproduced with spec decode + EAGLE + the Triton Attention Backend (the default for AMD), for example:
```
python3 offline_inference/spec_decode.py --test --method eagle --num_spec_tokens 3 --dataset-name hf --dataset-path philschmid/mt-bench --num-prompts 80 --temp 0 --top-p 1.0 --top-k -1 --tp 1 --enable-chunked-prefill --max-model-len 2048
```

## Error Details
```
:0:rocdevice.cpp:3675: Callback: Queue 0x7f4db0300000 aborting with error:
HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION: The agent attempted to access memory beyond the largest legal address. code: 0x29
```
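
When triaging CI logs for this failure, it may help to search for the error signature programmatically. The following is a minimal sketch, assuming the log contains the two-line layout shown above; `find_aperture_violation` is a hypothetical helper written for illustration and is not part of vLLM or ROCm:

```python
import re
from typing import Optional

# Matches the ROCm queue-abort callback line followed by the HSA status line.
# Assumption: the log uses the same two-line layout as the error output above.
_ABORT_RE = re.compile(
    r"Queue (?P<queue>0x[0-9a-fA-F]+) aborting with error:\s*"
    r"(?P<status>HSA_STATUS_ERROR_[A-Z_]+):.*?code: (?P<code>0x[0-9a-fA-F]+)",
    re.DOTALL,
)


def find_aperture_violation(log_text: str) -> Optional[dict]:
    """Return queue address, HSA status, and code of the first abort, or None."""
    match = _ABORT_RE.search(log_text)
    if match is None:
        return None
    return match.groupdict()


sample = (
    ":0:rocdevice.cpp:3675: Callback: Queue 0x7f4db0300000 aborting with error:\n"
    "HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION: The agent attempted to access "
    "memory beyond the largest legal address. code: 0x29\n"
)
print(find_aperture_violation(sample))
```

The helper returns `None` for logs without the signature, so it can be used as a simple filter over several build logs.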

Using a different backend may work, e.g. `ROCM_AITER_FA` or `ROCM_AITER_UNIFIED_ATTN`, but `TritonAttentionBackend` is the **default** attention backend for AMD: [gist:fb8bbb2cbde391905d86908ca4a46c02](https://gist.github.com/zhewenl/fb8bbb2cbde391905d86908ca4a46c02)
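
To check whether the failure is specific to the Triton backend, the same reproduction command can be rerun with each backend selected explicitly. The sketch below assumes the backend is chosen through the `VLLM_ATTENTION_BACKEND` environment variable and that `TRITON_ATTN` (taken from the test ID) and the two names above are accepted values in the build under test; `build_repro` is a hypothetical helper, not part of vLLM:

```python
import os
import shlex

# Reproduction command from the Summary section, unchanged.
REPRO_CMD = (
    "python3 offline_inference/spec_decode.py --test --method eagle "
    "--num_spec_tokens 3 --dataset-name hf --dataset-path philschmid/mt-bench "
    "--num-prompts 80 --temp 0 --top-p 1.0 --top-k -1 --tp 1 "
    "--enable-chunked-prefill --max-model-len 2048"
)


def build_repro(backend, base_env=None):
    """Return (argv, env) for running the repro with the given backend."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: this environment variable selects the attention backend.
    env["VLLM_ATTENTION_BACKEND"] = backend
    return shlex.split(REPRO_CMD), env


for backend in ("TRITON_ATTN", "ROCM_AITER_FA", "ROCM_AITER_UNIFIED_ATTN"):
    argv, env = build_repro(backend)
    print(f"VLLM_ATTENTION_BACKEND={env['VLLM_ATTENTION_BACKEND']} {shlex.join(argv)}")
    # To actually launch a run: subprocess.run(argv, env=env, check=False)
```

The loop only prints the commands; launching them is left commented out so the sketch can be inspected without a GPU.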

### What Works
1. Model initialization
2. Weight loading (target model + EAGLE draft model)
3. CUDA graph capturing (both PIECEWISE and FULL modes)
4. KV cache allocation 

### What Fails
**Prompt processing** - Fails immediately when `execute_model` is called, before processing even the first token

### Error Call Stack
```text
multiproc_executor.py:694 worker_busy_loop
  → worker_base.py:353 execute_model
    → gpu_worker.py:491 execute_model
      → gpu_model_runner.py:2512 execute_model
        → gpu_model_runner.py:2404 _model_forward
          → self.model() [MEMORY VIOLATION]
```

Affected tests: V1 Test e2e + engine, Example Test

### 📝 History of failing test

https://buildkite.com/vllm/ci/builds/36286#019a1ead-5029-45e3-b576-d1ac8cd5ac43
https://buildkite.com/vllm/amd-ci/builds/632#019a27b6-9f6a-4b99-842d-55eef24ea7cd

### CC List.

@mxz297 @yeqcharlotte @Alexei-V-Ivanov-AMD @luccafong @njhill @LucasWilkinson 
