Error running generate_bloom.py #5

@Sekhar-jami


When running generate_bloom.py with BASE_MODEL = "bigscience/bloom-7b1" and LORA_WEIGHTS = "LinhDuong/bloom-7b1-alpaca", I'm getting the error shown further below.
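For context, the call that raises the error is the PeftModel.from_pretrained step at line 32 of generate_bloom.py. A minimal reconstruction of that loading path, assumed from the traceback rather than copied from the repository (the actual script may use a different model class or extra device/quantization arguments), is:

```python
# Reconstruction of the failing load path, assumed from the traceback
# (not copied verbatim from generate_bloom.py).
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

BASE_MODEL = "bigscience/bloom-7b1"
LORA_WEIGHTS = "LinhDuong/bloom-7b1-alpaca"

# Load the BLOOM base model in fp16.
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float16)

# This is the call that raises the size-mismatch RuntimeError below.
model = PeftModel.from_pretrained(model, LORA_WEIGHTS, torch_dtype=torch.float16)
```

The full traceback: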

Traceback (most recent call last):
  File "generate_bloom.py", line 32, in <module>
    model = PeftModel.from_pretrained(model, LORA_WEIGHTS, torch_dtype=torch.float16)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.7/site-packages/peft/peft_model.py", line 172, in from_pretrained
    model.load_adapter(model_id, adapter_name, **kwargs)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.7/site-packages/peft/peft_model.py", line 361, in load_adapter
    set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.7/site-packages/peft/utils/save_and_load.py", line 120, in set_peft_model_state_dict
    model.load_state_dict(peft_model_state_dict, strict=False)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1672, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.transformer.h.0.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.0.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.1.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.1.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.2.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.2.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.3.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.3.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.4.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.4.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.5.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.5.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.6.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.6.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.7.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.7.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.8.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.8.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.9.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.9.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.10.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.10.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.11.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.11.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.12.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.12.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.13.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.13.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.14.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.14.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.15.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.15.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.16.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.16.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.17.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.17.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.18.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.18.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.19.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.19.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.20.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.20.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.21.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.21.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.22.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.22.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.23.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.23.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.24.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.24.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.25.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.25.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.26.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.26.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.27.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.27.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.28.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.28.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
size mismatch for base_model.model.transformer.h.29.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
size mismatch for base_model.model.transformer.h.29.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
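Reading the messages above, the checkpoint's lora_A tensors have rank 16 ([16, 4096]) while the locally initialized adapter expects rank 8 ([8, 4096]), and the lora_B shapes disagree as well, so the published adapter seems to have been saved with a different LoRA configuration (or peft version) than the one PeftModel.from_pretrained rebuilds here. A quick way to see what the adapter actually declares is to read its adapter_config.json from the Hub; this is a hypothetical diagnostic, not part of generate_bloom.py, and it assumes the repo follows the standard PEFT file layout:

```python
# Hypothetical diagnostic (not from the repo): print the LoRA hyperparameters
# the published adapter was saved with.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download("LinhDuong/bloom-7b1-alpaca", "adapter_config.json")
with open(path) as f:
    cfg = json.load(f)

# Typical keys include "r", "lora_alpha", "target_modules",
# and "base_model_name_or_path".
print(json.dumps(cfg, indent=2))
```

If the rank and base model printed there differ from what the local peft build initializes, that would match the per-layer shape conflicts reported above.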
