Error running generate_bloom.py #5

@Sekhar-jami


When running generate_bloom.py with BASE_MODEL = "bigscience/bloom-7b1" and LORA_WEIGHTS = "LinhDuong/bloom-7b1-alpaca", I'm getting the error shown further below.
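For context, the call that raises the error is the PeftModel.from_pretrained step at line 32 of generate_bloom.py. A minimal reconstruction of that loading path, assumed from the traceback rather than copied from the repository (the actual script may use a different model class or extra device/quantization arguments), is:

```python
# Reconstruction of the failing load path, assumed from the traceback
# (not copied verbatim from generate_bloom.py).
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

BASE_MODEL = "bigscience/bloom-7b1"
LORA_WEIGHTS = "LinhDuong/bloom-7b1-alpaca"

# Load the BLOOM base model in fp16.
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float16)

# This is the call that raises the size-mismatch RuntimeError below.
model = PeftModel.from_pretrained(model, LORA_WEIGHTS, torch_dtype=torch.float16)
```

The full traceback: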

Traceback (most recent call last):
  File "generate_bloom.py", line 32, in <module>
    model = PeftModel.from_pretrained(model, LORA_WEIGHTS, torch_dtype=torch.float16)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.7/site-packages/peft/peft_model.py", line 172, in from_pretrained
    model.load_adapter(model_id, adapter_name, **kwargs)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.7/site-packages/peft/peft_model.py", line 361, in load_adapter
    set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.7/site-packages/peft/utils/save_and_load.py", line 120, in set_peft_model_state_dict
    model.load_state_dict(peft_model_state_dict, strict=False)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1672, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.transformer.h.0.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.0.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.1.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.1.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.2.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.2.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.3.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.3.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.4.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.4.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.5.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.5.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.6.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.6.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.7.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.7.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.8.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.8.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.9.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.9.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.10.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.10.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.11.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.11.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.12.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.12.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.13.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.13.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.14.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.14.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.15.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.15.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.16.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.16.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.17.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.17.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.18.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.18.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.19.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.19.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.20.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.20.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.21.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.21.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.22.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.22.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.23.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.23.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.24.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.24.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.25.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.25.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.26.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.26.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.27.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.27.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.28.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.28.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.29.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.29.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
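Reading the messages above, the checkpoint's lora_A tensors have rank 16 ([16, 4096]) while the locally initialized adapter expects rank 8 ([8, 4096]), and the lora_B shapes disagree as well, so the published adapter seems to have been saved with a different LoRA configuration (or peft version) than the one PeftModel.from_pretrained rebuilds here. A quick way to see what the adapter actually declares is to read its adapter_config.json from the Hub; this is a hypothetical diagnostic, not part of generate_bloom.py, and it assumes the repo follows the standard PEFT file layout:

```python
# Hypothetical diagnostic (not from the repo): print the LoRA hyperparameters
# the published adapter was saved with.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download("LinhDuong/bloom-7b1-alpaca", "adapter_config.json")
with open(path) as f:
    cfg = json.load(f)

# Typical keys include "r", "lora_alpha", "target_modules",
# and "base_model_name_or_path".
print(json.dumps(cfg, indent=2))
```

If the rank and base model printed there differ from what the local peft build initializes, that would match the per-layer shape conflicts reported above.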
