Finetuned BERT model does not seem to predict the right labels / work properly?
❓ Questions & Help
I am trying out a finetuned BERT model for token classification (https://huggingface.co/bert-base-cased-finetuned-conll03-english), but when I look at the model output (i.e. the logits after applying softmax) and compare it with the true label_ids, they are completely uncorrelated (see pictures below).
Screenshots: https://i.stack.imgur.com/gVyMn.png and https://i.stack.imgur.com/qS62L.png
Details
I assume that the finetuned model (bert-base-cased-finetuned-conll03-english) was trained correctly, but I don't understand why its predictions are off. I think one issue might be that the pretrained model uses a different labelling scheme than the one I made myself during data preparation (so that my tag2name dict is different), but I don't know how to find out which label-index map the model uses for its predictions. Even so, the model does not consistently make the same mistakes; it outputs things quite randomly.
Any idea what the issue could be?
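For illustration, here is a minimal sketch (not the original poster's code, and assuming a recent `transformers` version) of the setup described above. In particular, the label-index map that a checkpoint uses at prediction time can be read from `model.config.id2label`:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

# Checkpoint name taken from the question.
model_name = "bert-base-cased-finetuned-conll03-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# The label-index map the model uses is stored in its config,
# so it can be compared against a hand-made tag2name dict.
print(model.config.id2label)
print(model.config.label2id)

inputs = tokenizer("Hugging Face is based in New York City.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predictions = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, pred in zip(tokens, predictions):
    print(token, model.config.id2label[pred.item()])
```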
Hi! From my experience using the community-contributed `dbmdz/bert-large-cased-finetuned-conll03-english` (which is the same checkpoint as `bert-large-cased-finetuned-conll03-english`), using the `bert-base-cased` tokenizer instead of the tokenizer loaded from that checkpoint works better. You can see an example of this in the usage section; let me know if it helps.
I suspect the difference between the tokenizers is due to a lowercasing of all inputs. I’m looking into it now.
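One quick way to check that suspicion is to tokenize the same cased sentence with both tokenizers and compare the outputs; a small sketch under the same assumptions:

```python
from transformers import AutoTokenizer

tok_base = AutoTokenizer.from_pretrained("bert-base-cased")
tok_ckpt = AutoTokenizer.from_pretrained("dbmdz/bert-large-cased-finetuned-conll03-english")

# If one tokenizer lowercases its input, the difference shows up immediately.
sentence = "New York City"
print(tok_base.tokenize(sentence))  # expected to keep casing, e.g. ['New', 'York', 'City']
print(tok_ckpt.tokenize(sentence))  # lowercased output would indicate the mismatch
```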
PS: the file `bert-large-cased-finetuned-conll03-english` is deprecated in favor of the aforementioned `dbmdz/bert-large-cased-finetuned-conll03-english`, as they are duplicates. @julien-c is currently deleting it from S3, so please use the `dbmdz` file/folder.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.