
[Bug]: [V1] Molmo/Aria not supported on V1 due to xgrammar #14534

@robertgshaw2-redhat

Description


Your current environment

These models cannot be used on V1 because xgrammar raises a ValueError during engine startup.

🐛 Describe the bug

  • Run the following:
VLLM_USE_V1=1 pytest -s -x models/decoder_only/vision_language/test_models.py -k molmo
VLLM_USE_V1=1 pytest -s -x models/decoder_only/vision_language/test_models.py -k aria
  • Get the following error back:
ERROR 03-10 03:06:35 [core.py:324] EngineCore hit an exception: Traceback (most recent call last):
ERROR 03-10 03:06:35 [core.py:324]   File "/home/rshaw/vllm/vllm/v1/engine/core.py", line 316, in run_engine_core
ERROR 03-10 03:06:35 [core.py:324]     engine_core = EngineCoreProc(*args, **kwargs)
ERROR 03-10 03:06:35 [core.py:324]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-10 03:06:35 [core.py:324]   File "/home/rshaw/vllm/vllm/v1/engine/core.py", line 271, in __init__
ERROR 03-10 03:06:35 [core.py:324]     super().__init__(vllm_config, executor_class, log_stats)
ERROR 03-10 03:06:35 [core.py:324]   File "/home/rshaw/vllm/vllm/v1/engine/core.py", line 65, in __init__
ERROR 03-10 03:06:35 [core.py:324]     self.structured_output_manager = StructuredOutputManager(vllm_config)
ERROR 03-10 03:06:35 [core.py:324]                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-10 03:06:35 [core.py:324]   File "/home/rshaw/vllm/vllm/v1/structured_output/__init__.py", line 44, in __init__
ERROR 03-10 03:06:35 [core.py:324]     tokenizer_info = xgr.TokenizerInfo.from_huggingface(
ERROR 03-10 03:06:35 [core.py:324]                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-10 03:06:35 [core.py:324]   File "/home/rshaw/vllm/venv/lib/python3.12/site-packages/xgrammar/tokenizer_info.py", line 188, in from_huggingface
ERROR 03-10 03:06:35 [core.py:324]     raise ValueError(msg)
ERROR 03-10 03:06:35 [core.py:324] ValueError: Input vocab_size less than minimum viable vocab size for tokenizer <class 'vllm.transformers_utils.tokenizer.get_cached_tokenizer.<locals>.CachedTokenizer'>.

This seems to be due to the relative sizes of the tokenizer vocabulary and the model vocabulary: vLLM hands the model config's vocab_size to xgrammar, and for these models it is smaller than the minimum the tokenizer requires. A sketch of the failing call is below.
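For illustration, a minimal sketch of the startup call that trips the check. The model name and the deliberately small vocab_size are assumptions for demonstration, not the values vLLM actually computes:

from transformers import AutoTokenizer
import xgrammar as xgr

# Load the tokenizer as vLLM would (model name is illustrative).
tokenizer = AutoTokenizer.from_pretrained(
    "allenai/Molmo-7B-D-0924", trust_remote_code=True
)

# vLLM's StructuredOutputManager passes the model config's vocab_size here.
# Any value below the tokenizer's "minimum viable vocab size" reproduces the
# ValueError in the traceback above; 1000 is deliberately too small.
tokenizer_info = xgr.TokenizerInfo.from_huggingface(tokenizer, vocab_size=1000)

Because StructuredOutputManager is constructed in EngineCoreProc.__init__, this check fires when the engine starts, before any request is served.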

