[New Question] Attention with Linear Biases (Medium) #78
Conversation
@@ -0,0 +1,107 @@
<p> Implement Attention with Linear Biases (ALiBi) for a given set of matrices.
Given the query matrix <code>Q</code> of size <code>M×d</code>, key matrix <code>K</code> of size <code>N×d</code>, and value matrix
<code>V</code> of size <code>N×d</code>, your program should compute the output matrix using the formula:
would be nice to link the paper in the spec
Given the query matrix <code>Q</code> of size <code>M×d</code>, key matrix <code>K</code> of size <code>N×d</code>, and value matrix
<code>V</code> of size <code>N×d</code>, your program should compute the output matrix using the formula:
$$\text{Attention}_{ALiBi}(Q, K, V) = \text{softmax}\Bigl( \frac{QK^T}{\sqrt{d}} + \alpha \cdot \Delta \Bigr)V$$
</p>
52 is taken, do 55
<p>
where α is a slope controlling the linear bias and <code>Δ = i - j</code> represents the relative position between query <code>i</code> and key <code>j</code>.
</p>
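A minimal NumPy sketch of the formula in the spec above, computing softmax(QKᵀ/√d + α·Δ)V with Δ[i, j] = i − j. The function name and signature are illustrative, not from this repo's reference solution:

```python
import numpy as np

def alibi_attention(Q, K, V, alpha):
    """Sketch of ALiBi attention per the spec: Q is (M, d), K and V are (N, d).
    Name and signature are hypothetical, not from the repository."""
    M, d = Q.shape
    N = K.shape[0]
    scores = Q @ K.T / np.sqrt(d)                          # (M, N) scaled dot products
    delta = np.arange(M)[:, None] - np.arange(N)[None, :]  # Delta[i, j] = i - j
    scores = scores + alpha * delta                        # add linear bias
    # numerically stable row-wise softmax
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ V                                     # (M, d) output
```

With α = 0 this reduces to standard scaled dot-product attention; each output row is a convex combination of the rows of V.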
change the dir name from alibi to attn_w_linear_bias
See the new commit to include paper and new dir name
New Question: Attention with Linear Biases
The question is from the paper: Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation (https://arxiv.org/pdf/2108.12409)