Description of the issue, with the test case added in PR #1897, reproduced here:

Trtllm-gen's attention kernels have been found to fail tests when the batch size is 1. With batch size 1:
[STILL REQUIRES A FIX] test_trtllm_batch_decode: produces incorrect outputs with the newly added parameters
## Running pytest ./tests/attention/test_trtllm_gen_attention.py::test_trtllm_batch_decode -v
```
>       torch.testing.assert_close(
            output.float(),
            output_wrapper.float(),
            rtol=1e-1,
            atol=1e-1,
        )
E       AssertionError: Tensor-likes are not close!
E
E       Mismatched elements: 1480 / 8192 (18.1%)
E       Greatest absolute difference: 64.021484375 at index (0, 46, 106) (up to 0.1 allowed)
E       Greatest relative difference: 1.625 at index (0, 56, 109) (up to 0.1 allowed)
```
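For context on how the failure above is reported, here is a minimal sketch (with made-up tensors, not the actual kernel outputs) of the comparison `torch.testing.assert_close` performs: an element passes when its absolute difference is within `atol + rtol * |expected|`, and the error message summarizes the mismatched-element count and the greatest absolute/relative differences.

```python
import torch

# Hypothetical stand-ins for output_wrapper (reference) and output (candidate).
reference = torch.zeros(4)
candidate = torch.tensor([0.0, 0.05, 0.2, 1.0])  # last two exceed the tolerance

# assert_close passes iff |candidate - reference| <= atol + rtol * |reference|.
try:
    torch.testing.assert_close(candidate, reference, rtol=1e-1, atol=1e-1)
except AssertionError as exc:
    # Reports "Mismatched elements: 2 / 4" plus greatest abs/rel differences,
    # in the same format as the failure above.
    print(exc)

# The same statistics can be computed by hand:
abs_diff = (candidate - reference).abs()
tol = 1e-1 + 1e-1 * reference.abs()
print((abs_diff > tol).sum().item())  # mismatched element count
print(abs_diff.max().item())          # greatest absolute difference
```

With an 18.1% mismatch rate and a greatest absolute difference of 64, the failure looks like a genuinely wrong output rather than a tolerance that is merely too tight.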
[UPDATE: NOW FIXED in #1912] test_trtllm_gen_prefill_deepseek: can trigger an illegal memory access (IMA) with the newly added parameters
## Running pytest ./tests/attention/test_trtllm_gen_attention.py::test_trtllm_gen_prefill_deepseek -v
```
>       default_generator.manual_seed(seed)
E       torch.AcceleratorError: CUDA error: an illegal memory access was encountered
E       CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
E       For debugging consider passing CUDA_LAUNCH_BLOCKING=1
E       Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

/opt/conda/envs/py312/lib/python3.12/site-packages/torch/cuda/random.py:129: AcceleratorError
```
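Note that the stack trace above points at `manual_seed`, which is almost certainly not the real fault site: CUDA errors are reported asynchronously, so the IMA surfaces at whatever API call next synchronizes with the device. Following the error message's own suggestion, setting `CUDA_LAUNCH_BLOCKING=1` makes kernel launches synchronous so the failure is attributed to the offending launch. A small sketch of how to apply it from Python (it can equally be set in the shell before invoking pytest):

```python
import os

# Must be set before the first CUDA context is created (i.e. before any
# CUDA work happens in torch); setting it later has no effect.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# With this set, re-running the failing test should report the illegal
# memory access at the kernel launch that caused it, not at manual_seed.
print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

This only improves error attribution; the underlying out-of-bounds access in the batch-size-1 path was fixed in #1912.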