Hi, thank you for sharing the nice work!
I tried to merge the LoRA weights into the base model using your script BLOOM-LORA/export_state_dict_checkpoint.py, lines 31 to 33 in 040ba59:
for layer in lora_model.base_model.model.model.layers:
    layer.self_attn.q_proj.merge_weights = True
    layer.self_attn.v_proj.merge_weights = True
I wonder why you only set merge_weights = True for q_proj and v_proj, but leave it out for k_proj and o_proj? As far as I can tell, LoRA weights are set for q_proj, k_proj, v_proj and o_proj:
for n, p in base_model_llama.named_parameters():
if "lora_" not in n:
p.requires_grad = False
else:
print(n, p.requires_grad)
=========== output
model.layers.0.self_attn.q_proj.lora_A.weight True
model.layers.0.self_attn.q_proj.lora_B.weight True
model.layers.0.self_attn.k_proj.lora_A.weight True
model.layers.0.self_attn.k_proj.lora_B.weight True
model.layers.0.self_attn.v_proj.lora_A.weight True
model.layers.0.self_attn.v_proj.lora_B.weight True
model.layers.0.self_attn.o_proj.lora_A.weight True
model.layers.0.self_attn.o_proj.lora_B.weight True
....
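For reference, here is a minimal sketch of what merging all four projections might look like, assuming the same peft LoRA Linear implementation the export script relies on (a merge_weights flag, with the actual merge triggered by switching to eval mode); this is not code from the repo, just an illustration of the question:

# Sketch (not from the repo): mark every attention projection that carries
# LoRA weights for merging, assuming the peft LoRA layers honor merge_weights.
for layer in lora_model.base_model.model.model.layers:
    layer.self_attn.q_proj.merge_weights = True
    layer.self_attn.k_proj.merge_weights = True
    layer.self_attn.v_proj.merge_weights = True
    layer.self_attn.o_proj.merge_weights = True

# In that implementation, switching to eval mode folds the LoRA update
# (B @ A * scaling) into the frozen projection weights.
lora_model.train(False)

If I understand the merge logic correctly, leaving merge_weights unset on k_proj and o_proj would mean their LoRA updates are silently dropped from the exported state dict, since the output above shows lora_A/lora_B weights on all four projections.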