Commit d5ed90e

janus: image generation
1 parent a16e294 commit d5ed90e

15 files changed: +1282 -111 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -31,6 +31,7 @@ LittleAcademia[<a href="https://github.com/foldl/little-academia" style="text-
 
 **What's New:**
 
+* 2025-10-10: [I can draw](./docs/multimodal.md): Janus-Pro
 * 2025-09-23: Qwen2.5-VL
 * 2025-09-15: Ling/Ring-mini-2.0
 * 2025-09-08: GroveMoE

convert.py

Lines changed: 16 additions & 0 deletions
@@ -7730,6 +7730,17 @@ def dump_config(f, config, ggml_type):
         assert config.vision_config['params']['model_name'] == 'siglip_large_patch16_384'
         JanusConverter.lang_config = AttributeDict(config.language_config)
 
+        DEF = {
+            "hidden_size": 4096,
+            "intermediate_size": 11008,
+            "num_attention_heads": 32,
+            "num_hidden_layers": 32,
+        }
+
+        for k in DEF:
+            if k not in JanusConverter.lang_config:
+                JanusConverter.lang_config[k] = DEF[k]
+
         JanusConverter.lang_config.hidden_act = 'silu'
 
         DeepSeekConverter.dump_config(f, JanusConverter.lang_config, ggml_type)
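
The hunk above fills in any language-model hyperparameters that are missing from `language_config` before handing the result to `DeepSeekConverter.dump_config`. A minimal standalone sketch of that pattern follows. It assumes a plain `dict` in place of the repository's `AttributeDict`, and `fill_defaults` is a hypothetical helper name that does not exist in `convert.py`; the input values in the usage lines are made up for illustration.

```python
# Defaults copied from the hunk above.
DEF = {
    "hidden_size": 4096,
    "intermediate_size": 11008,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
}


def fill_defaults(lang_config: dict, defaults: dict = DEF) -> dict:
    """Return a copy of lang_config with missing keys taken from defaults.

    Hypothetical helper: the commit mutates the config in place instead.
    """
    merged = dict(lang_config)
    for key, value in defaults.items():
        # setdefault only writes when the key is absent, so explicit values win.
        merged.setdefault(key, value)
    return merged


# Illustrative input only; these are not values from a real Janus config.
partial = {"hidden_size": 2048, "vocab_size": 1000}
full = fill_defaults(partial)
print(full["hidden_size"])        # 2048 (explicit value kept)
print(full["num_hidden_layers"])  # 32 (filled from DEF)
```

Unlike the commit, this sketch returns a copy instead of mutating the config, which makes it easier to test in isolation.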
@@ -8138,6 +8149,11 @@ def main():
         config = AttributeDict({})
 
     if arch == '':
+        if config.architectures is None:
+            if "model_type" in config:
+                if config.model_type == "multi_modality":
+                    config.architectures = ['MultiModalityCausalLM']
+
         if len(config.architectures) != 1:
             raise Exception(f'unknown architectures: {config.architectures}')
         arch = config.architectures[0]
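
The second hunk lets `main()` infer the architecture from `model_type` when `config.json` carries no `architectures` entry. The same fallback can be sketched as a standalone function. This assumes a plain `dict` for the config; `resolve_arch` and `MODEL_TYPE_TO_ARCH` are hypothetical names introduced here, and raising `ValueError` when nothing can be inferred is a choice of this sketch, not behavior confirmed by the commit.

```python
# Mapping taken from the hunk above: model_type -> architecture name.
MODEL_TYPE_TO_ARCH = {"multi_modality": "MultiModalityCausalLM"}


def resolve_arch(config: dict, arch: str = "") -> str:
    """Pick the architecture name, falling back to model_type when needed."""
    if arch != "":
        # An explicitly requested architecture always wins.
        return arch

    architectures = config.get("architectures")
    if architectures is None:
        # No `architectures` key: try to infer one from `model_type`.
        inferred = MODEL_TYPE_TO_ARCH.get(config.get("model_type"))
        architectures = [inferred] if inferred is not None else []

    if len(architectures) != 1:
        raise ValueError(f"unknown architectures: {architectures}")
    return architectures[0]


print(resolve_arch({"model_type": "multi_modality"}))  # MultiModalityCausalLM
```

Keeping the mapping in a dictionary means further `model_type` fallbacks would need only a new entry, not another nested `if`.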

docs/models.md

Lines changed: 5 additions & 0 deletions
@@ -353,6 +353,11 @@ Please use `--format completion` for these models.
 
     Note: Only download `tokenizer.model` and DO NOT download `tokenizer.json` when converting. Use `--set do-pan-and-scan 1` to enable _Pan and Scan_.
 
+* Janus (`MultiModalityCausalLM`)
+    * [x] Pro: [1B](https://huggingface.co/deepseek-ai/Janus-Pro-1B/tree/960ab33191f61342a4c60ae74d8dc356a39fafcb), [7B](https://huggingface.co/deepseek-ai/Janus-Pro-7B/tree/5c3eb3fb2a3b61094328465ba61fcd4272090d67)
+
+    Note: Use `--set parallel-size N` to generate `N` images in a single run (default: 2); `--set gen-head-temperature T` to set temperature of `gen-head` (default: 1.0). Add prefix "/gen " to prompts to generate images.
+
 * Kimi (`KimiVLForConditionalGeneration`)
     * [x] VL: [A3B-Instruct](https://huggingface.co/moonshotai/Kimi-VL-A3B-Instruct/tree/7a3c132a7b0f1f1677f5a72f258bd3afded7d357), [A3B-Thinking](https://huggingface.co/moonshotai/Kimi-VL-A3B-Thinking/commit/16681d8ac24e505088698e4e34ea494dd6e24400), [A3B-Thinking-2506](https://huggingface.co/moonshotai/Kimi-VL-A3B-Thinking-2506/tree/f124f44fb6ab5778cfac5117e3902ef03e860ad4)
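
The Janus note added in this hunk names two `--set` options and a `/gen ` prompt prefix. The sketch below shows one way a wrapper script might assemble them. `build_gen_request` is a hypothetical helper, and how the options and prompt are finally passed to the program (binary name, model and prompt flags) is deliberately left out because the note does not specify it.

```python
from typing import List, Tuple


def build_gen_request(prompt: str,
                      parallel_size: int = 2,
                      gen_head_temperature: float = 1.0) -> Tuple[List[str], str]:
    """Return (`--set` options, prompt text) for one image-generation request.

    The defaults mirror the ones stated in the note: 2 images, temperature 1.0.
    """
    if parallel_size < 1:
        raise ValueError("parallel_size must be at least 1")

    set_options = [
        "--set", "parallel-size", str(parallel_size),
        "--set", "gen-head-temperature", str(gen_head_temperature),
    ]
    # Image generation is requested by prefixing the prompt with "/gen ".
    return set_options, "/gen " + prompt


options, text = build_gen_request("a red fox in the snow", parallel_size=4)
print(options)
print(text)
```

The options are returned as a list so they can be appended to an argument vector without shell quoting.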
