
load albert model error #147

Open
phc4valid opened this issue May 20, 2021 · 6 comments
@phc4valid

File "run_kbert_cls.py", line 261, in main
model.load_state_dict(torch.load(args.pretrained_model_path), strict=False)
File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1044, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Model:
size mismatch for embedding.segment_embedding.weight: copying a param with shape torch.Size([1, 128]) from checkpoint, the shape in current model is torch.Size([3, 128]).

Hello,
This error occurred when I loaded albert-base-chinese. The config.json I used is the one provided by uer:
{"emb_size": 128, "feedforward_size": 3072, "hidden_size": 768, "heads_num": 12, "layers_num": 12, "dropout": 0.0}
May I ask where segment_embedding can be modified? Thank you very much for your work, and stay safe.
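
For reference, the mismatch can be patched at load time. A minimal workaround sketch, not the project's fix: it zero-pads the checkpoint's 1-row segment embedding to the 3 rows the model expects (the key name is taken from the error above; the checkpoint path is a placeholder):

import torch

# Load the converted checkpoint (placeholder path).
state_dict = torch.load("models/albert_base_chinese_model.bin", map_location="cpu")

key = "embedding.segment_embedding.weight"
old = state_dict[key]                                  # [1, 128] per the error
if old.size(0) < 3:
    pad = torch.zeros(3 - old.size(0), old.size(1), dtype=old.dtype)
    state_dict[key] = torch.cat([old, pad], dim=0)     # now [3, 128]

# Then load as run_kbert_cls.py does, with "model" built there:
# model.load_state_dict(state_dict, strict=False)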

@hhou435
Collaborator

hhou435 commented May 21, 2021

Hello, this was a bug in the conversion script. The conversion script has been updated; please test again. Thank you very much for your interest in the project.
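
A quick way to sanity-check the re-converted checkpoint (placeholder path; the key name comes from the error above):

import torch

state_dict = torch.load("models/albert_base_chinese_model.bin", map_location="cpu")
# Should now print torch.Size([3, 128]) instead of torch.Size([1, 128]).
print(state_dict["embedding.segment_embedding.weight"].shape)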

@phc4valid
Author

Hello, I used the new conversion script, and a new error followed:
Traceback (most recent call last):
File "run_kbert_cls.py", line 625, in
main()
File "run_kbert_cls.py", line 588, in main
loss, _ = model(input_ids_batch, label_ids_batch, mask_ids_batch, pos=pos_ids_batch, vm=vms_batch)
File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "run_kbert_cls.py", line 53, in forward
output = self.encoder(emb, mask, vm)
File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/workplace/phchen/K-BERT-master/uer/encoders/bert_encoder.py", line 48, in forward
hidden = self.transformer[i](hidden, mask)
File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/workplace/phchen/K-BERT-master/uer/layers/transformer.py", line 38, in forward
inter = self.dropout_1(self.self_attn(hidden, hidden, hidden, mask))
File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/workplace/phchen/K-BERT-master/uer/layers/multi_headed_attn.py", line 51, in forward
query, key, value = [l(x).
File "/workplace/phchen/K-BERT-master/uer/layers/multi_headed_attn.py", line 51, in
query, key, value = [l(x).
File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 91, in forward
return F.linear(input, self.weight, self.bias)
File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/functional.py", line 1676, in linear
output = input.matmul(weight.t())
RuntimeError: size mismatch, m1: [4096 x 128], m2: [768 x 768] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:41

The config.json used is
{"emb_size": 128, "feedforward_size": 3072, "hidden_size": 768, "heads_num": 12, "layers_num": 12, "dropout": 0.0}
May I ask what m1 and m2 refer to, and how can this problem be solved? Thank you very much for your time. @hhou435
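
For context: in this PyTorch error, m1 is the input to F.linear flattened to 2-D (4096 rows = batch_size × seq_length, 128 columns = emb_size) and m2 is the attention layer's weight matrix (hidden_size 768 × 768). The embeddings reach the encoder still at ALBERT's factorized emb_size of 128, while the encoder expects hidden_size 768; ALBERT normally projects 128 → 768 before the transformer, and that projection is missing here. A minimal sketch of such a projection, with hypothetical names (K-BERT does not ship this layer):

import torch
import torch.nn as nn

class FactorizedEmbeddingProjection(nn.Module):
    """Hypothetical adapter: maps emb_size-dim ALBERT embeddings up to
    hidden_size before they enter a BERT-style encoder."""
    def __init__(self, emb_size=128, hidden_size=768):
        super().__init__()
        self.proj = nn.Linear(emb_size, hidden_size)

    def forward(self, emb):
        # emb: [batch, seq_len, emb_size] -> [batch, seq_len, hidden_size]
        return self.proj(emb)

# Shapes matching the traceback: 32 × 128 = 4096 rows.
emb = torch.randn(32, 128, 128)            # [batch, seq_len, emb_size]
hidden = FactorizedEmbeddingProjection()(emb)
print(hidden.shape)                        # torch.Size([32, 128, 768])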

@hhou435
Collaborator

hhou435 commented May 21, 2021

Hello, could you provide the command you ran?

@phc4valid
Author

https://github.com/autoliuweijie/K-BERT?utm_source=catalyzex.com
Hello, I am running run_kbert_cls.py from the linked repository.
What additional information should I provide to help resolve the problem?
Thank you.

@hhou435
Collaborator

hhou435 commented May 21, 2021

For fine-tuning ALBERT, you can refer to https://github.com/dbiir/UER-py/wiki/下游任务微调

@LSQii

LSQii commented May 30, 2021

(Quotes @phc4valid's earlier comment in full: the same traceback ending in "RuntimeError: size mismatch, m1: [4096 x 128], m2: [768 x 768]", the same config.json, and the question about m1 and m2.)

Hello, sorry to bother you. Has this problem been solved on your end? I ran into the same issue when using albert as the pretrained model.
