No result difference using word2vec vectors for training NER in spacy 3 #9334

sagorbrur · 2021-09-30T03:48:15Z

Hello,
I have trained a spaCy NER model (pipeline: tok2vec, ner) on my dataset with custom word vectors.
I prepared two training setups:

with default settings and with pre-trained word2vec vectors
with default settings and without pre-trained word2vec vectors

I trained both models for 100 epochs on CPU with the same data, but both give a similar test F1 score of 92.35. The validation result is also the same: 91.50.
It looks like the word2vec vectors have no impact on my training. I tried a similar approach in spaCy 2.3.5, and there the results were clearly different, with improvements.
Has anything changed in how the NER pipeline trains with custom vectors?
Please let me know.
Thanks in advance.

NB: I prepared the vectors using the init vectors command.

How to reproduce the behaviour

Train-1

python -m spacy train config.cfg \
--output ./logs/train1 \
--paths.train ./data/traindata/train.spacy \
--paths.dev ./data/traindata/val.spacy \
--paths.vectors ./vector/mycustom_vector_md \
-g -1

Train-2

python -m spacy train config.cfg \
--output ./logs/train2 \
--paths.train ./data/traindata/train.spacy \
--paths.dev ./data/traindata/val.spacy \
-g -1

Your Environment

Operating System: Ubuntu 20.04
Python Version Used: 3.8
spaCy Version Used: 3.1.2
Environment Information:

polm · 2021-09-30T04:13:58Z

I suspect you are not enabling the use of static vectors. See the docs - is include_static_vectors true?
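
A quick way to check this before retraining is to scan the config for the flag. This is only a minimal sketch: it uses Python's standard configparser rather than spaCy's own config loader, the helper name find_static_vector_flags and the SAMPLE_CFG text are hypothetical, and the sample assumes the section layout that a generated config typically has.

```python
import configparser


def find_static_vector_flags(cfg_text: str) -> dict:
    """Return {section_name: bool} for every section that sets include_static_vectors."""
    # interpolation=None keeps ${...} references as plain strings instead of resolving them.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the case of keys
    parser.read_string(cfg_text)

    flags = {}
    for section in parser.sections():
        if parser.has_option(section, "include_static_vectors"):
            raw = parser.get(section, "include_static_vectors")
            flags[section] = raw.strip().lower() == "true"
    return flags


# Hypothetical excerpt of a config.cfg; a real file has many more sections.
SAMPLE_CFG = """
[paths]
vectors = "./vector/mycustom_vector_md"

[initialize]
vectors = ${paths.vectors}

[components.tok2vec.model.embed]
@architectures = "spacy.MultiHashEmbed.v2"
include_static_vectors = false
"""

for section, enabled in find_static_vector_flags(SAMPLE_CFG).items():
    print(section, "->", enabled)
```

If the flag is reported as False, the vectors path may be set while the embedding layer is still not expected to use the vectors, which would be consistent with both runs scoring the same.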

sagorbrur · 2021-09-30T04:38:33Z

Hi @polm ,
Thank you for your quick reply.
I will set include_static_vectors to true and train again.
I will let you know the results.
Thanks again.

sagorbrur · 2021-09-30T08:50:05Z

Hi @polm ,
After setting include_static_vectors=true, the results degraded:
test F1: 91.85, val F1: 91.34.
Do you have any suggestions for reproducing results similar to spaCy 2? Please let me know.
NB: I used the spaCy 2 default parameters, with no tuning other than adding word vectors.
thanks.

polm Sep 30, 2021

Uh, I'm not entirely sure what you mean by "spacy2 default parameters". Also you say performance degraded but it looks like it was half a point of F1? That suggests to me that your vectors just aren't making much of a difference, which can happen sometimes. What kind of results are you expecting?

polm Sep 30, 2021

Also just to be perfectly clear - is your "test" score for actual held-out test data, or is that your training score?

sagorbrur Sep 30, 2021
Author

By spaCy 2 default parameters I mean that I didn't change any parameters such as learning rate, hidden_width, or dropout in spaCy 2 training.
Similarly, I didn't change any default parameters in spaCy 3 training. I only added the vectors path and include_static_vectors=true.
My previous F1 score was 93.12; it is now 91.85 on actual held-out test data.
Degradation: 1.27. Is that normal?
Please let me know.
thanks and regards

polm Sep 30, 2021

Without knowing anything about how you trained your vectors it's hard to say, but 1.27 points of F1 is not a big change, so I don't think that's surprising or a problem. Maybe try fiddling with your word2vec hyperparameters? Once you're over 90 F1 it's very hard to get improvement without overfitting.

sagorbrur Sep 30, 2021
Author

Thank you so much @polm .

sagorbrur · 2021-10-01T09:05:59Z

I have managed to reproduce my spaCy 2 test results (not exactly, but the absolute values are now similar).
I found the solution in the spaCy documentation: https://spacy.io/api/architectures#parser
When base_config.cfg is generated, it automatically sets the NER component's tok2vec model to:

[components.ner.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "*"

Following the documentation, I replaced it with:

[components.ner.model.tok2vec]
@architectures = "spacy.HashEmbedCNN.v2"
pretrained_vectors = null
width = 96
depth = 4
embed_size = 2000
window_size = 1
maxout_pieces = 3
subword_features = true

Now it works just like spaCy 2.
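
To see at a glance which setup a given config uses, the tok2vec block of each component can be inspected. The following is only a rough sketch with Python's standard configparser, not spaCy's config loader; the helper name tok2vec_sources and the two sample strings are hypothetical and simply mirror the two blocks quoted above.

```python
import configparser


def tok2vec_sources(cfg_text: str) -> dict:
    """Map each component name to the architecture named in its
    [components.<name>.model.tok2vec] section."""
    # interpolation=None keeps ${...} references as plain strings.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the case of keys
    parser.read_string(cfg_text)

    sources = {}
    for section in parser.sections():
        parts = section.split(".")
        is_tok2vec_block = (
            len(parts) == 4
            and parts[0] == "components"
            and parts[2:] == ["model", "tok2vec"]
        )
        if is_tok2vec_block:
            arch = parser.get(section, "@architectures", fallback="")
            sources[parts[1]] = arch.strip().strip('"')
    return sources


# Hypothetical excerpts that mirror the two config blocks in this thread.
LISTENER_CFG = """
[components.ner.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "*"
"""

OWN_TOK2VEC_CFG = """
[components.ner.model.tok2vec]
@architectures = "spacy.HashEmbedCNN.v2"
pretrained_vectors = null
width = 96
"""

for name, cfg in [("generated", LISTENER_CFG), ("replaced", OWN_TOK2VEC_CFG)]:
    for component, arch in tok2vec_sources(cfg).items():
        shares = arch.startswith("spacy.Tok2VecListener")
        print(name, component, arch, "shared tok2vec" if shares else "own tok2vec")
```

With the generated config the ner component listens to the shared tok2vec component, while with the replacement it gets its own embedding layer.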
