
Conversation

@mdda (Contributor) commented Feb 28, 2025

Adds Gemma2-2b (including GQA to Attention and fixes to Block)

Fixes #4567

It includes:

  • New TransformerConfig for gemma2-2b model
  • Renames existing configs to make them uniform
    • NB: This cannot affect existing code, since this module was previously unusable
  • Adds a params-reading key adjustment (keys in the gemma2-2b Kaggle download need remapping)
  • Adds GQA to the Attention module (see the sketch after this list)
  • Reorders the operations in the Block module so that the logits output by the overall Transformer are not gibberish
    • logits confirmed to (approximately) match those from the GDE gemma (flax linen) model
  • No new documentation provided
    • This change makes the example in the nnx documentation work (it did not work before)
  • No additional tests provided
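
For reference, GQA keeps the full number of query heads but uses a smaller num_kv_heads for keys/values, with each KV head shared by a group of query heads. Below is a minimal sketch of that computation in plain JAX (illustrative only; the function and argument names are assumptions, not the actual API of the flax gemma example):

```python
import jax
import jax.numpy as jnp

def gqa_attention(q, k, v):
  """Grouped-query attention sketch.

  Shapes (illustrative):
    q: [batch, q_len,  num_heads,    head_dim]
    k: [batch, kv_len, num_kv_heads, head_dim]
    v: [batch, kv_len, num_kv_heads, head_dim]
  """
  head_dim = q.shape[-1]
  num_heads, num_kv_heads = q.shape[2], k.shape[2]
  assert num_heads % num_kv_heads == 0
  group = num_heads // num_kv_heads

  # Repeat each KV head so it lines up with its group of query heads.
  k = jnp.repeat(k, group, axis=2)
  v = jnp.repeat(v, group, axis=2)

  logits = jnp.einsum('bqhd,bkhd->bhqk', q, k) / jnp.sqrt(head_dim)
  probs = jax.nn.softmax(logits, axis=-1)
  return jnp.einsum('bhqk,bkhd->bqhd', probs, v)
```

(A more memory-efficient implementation would reshape the query heads into [num_kv_heads, group] groups instead of materialising repeated KV tensors, but the repeat form is the easiest to compare against a plain MHA reference.)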

@cgarciae (Collaborator) commented Mar 4, 2025

Hey Martin! Thanks for doing this.
Some folks internally are also improving the model. Let me merge their changes first and make sure there are no conflicts with yours.

@mdda (Contributor Author) commented Mar 5, 2025

Ahhh - Brings back memories of PRs for TensorFlow. Good times! /s

I'll attempt to fix the white-space check failures once you let me know it isn't a waste of my time.

@classmethod
def gemma_27b(cls):
  num_layers = 46
def gemma2_2b(cls):
Collaborator

Why don't we add a new method for gemma_27b configuration instead of removing the gemma2_2b one?

Contributor Author

They are both in the file - it's just that git didn't pick up the diff properly.

But also, it makes sense to rename the different 'generations' of gemma, gemma2, gemma3, etc. as separate classes, rather than relying on implicit knowledge about which size came from where.

Moreover, the gemma2-27b config didn't have the right normalisation in the attention - that will need a separate fix.

I also see that the Google-generated PR adopted the same fix as mine to the Block module. Good for you!

@cgarciae (Collaborator)

Thanks @mdda for doing this - it took a while because there were other pending changes to the model in the background. I think this is great. Can you please add a test to check that the GQA configuration works and matches the base version?

We also now need to resolve the conflicts - sorry about this; this code is being used by a couple of users internally.
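
A rough sketch of what such a test could look like (hypothetical; it reuses the gqa_attention sketch from the PR description above rather than the example's actual Attention module, which the real test should exercise):

```python
import jax
import jax.numpy as jnp
import numpy as np

# gqa_attention: the sketch from the PR description above is assumed in scope.

def mha_attention(q, k, v):
  """Reference multi-head attention where q, k, v all have num_heads heads."""
  head_dim = q.shape[-1]
  logits = jnp.einsum('bqhd,bkhd->bhqk', q, k) / jnp.sqrt(head_dim)
  probs = jax.nn.softmax(logits, axis=-1)
  return jnp.einsum('bhqk,bkhd->bqhd', probs, v)

def test_gqa_matches_mha_reference():
  # GQA with num_kv_heads < num_heads should match plain MHA run on KV
  # tensors whose heads are explicitly repeated up to num_heads.
  rng = np.random.default_rng(0)
  b, t, num_heads, num_kv_heads, d = 2, 5, 8, 4, 16
  group = num_heads // num_kv_heads
  q = jnp.asarray(rng.normal(size=(b, t, num_heads, d)), dtype=jnp.float32)
  k = jnp.asarray(rng.normal(size=(b, t, num_kv_heads, d)), dtype=jnp.float32)
  v = jnp.asarray(rng.normal(size=(b, t, num_kv_heads, d)), dtype=jnp.float32)

  out_gqa = gqa_attention(q, k, v)
  out_ref = mha_attention(q,
                          jnp.repeat(k, group, axis=2),
                          jnp.repeat(v, group, axis=2))
  np.testing.assert_allclose(np.asarray(out_gqa), np.asarray(out_ref),
                             atol=1e-5)
```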

@mdda (Contributor Author) commented Mar 15, 2025

I'm surprised that the code was being used internally prior to my PR, since the Block module was entirely borked.

@cgarciae (Collaborator)

@mdda please take a look at CI; you probably need to run:

pip install pre-commit
pre-commit run --all-files

@vfdev-5 (Collaborator) commented Jul 21, 2025

Hey @mdda, I wonder whether you would be able to finalize this PR? If you are busy, we can take it over while keeping your commits (so that you will be credited for the work you have done). Please let me know. Thanks!


Development

Successfully merging this pull request may close these issues.

Some surprises when adding Gemma2-2b to flax/examples/gemma
