Tuning scoring in a config file, and want to specify cats_f_per_type #9282
Hello all, I have a two-category classifier that I want to tune to care more about positive than negative examples.
Here are my results in meta.json for the model-best weights:
As you can imagine, this seems overly tuned and not ideal. Is there a way to specify scoring weights per type?
Replies: 1 comment 1 reply
The weights here are only used for the eval display and early stopping, not within the component itself while training.
On the component config side of things, you can try adjusting `threshold` in the config.
You can also provide a `positive_label` in your config (in the `[initialize.components.textcat]` block) to have the overall `cats_score` that's used for early stopping be the positive F rather than the average of positive and negative F. You would need to add a positive weight to `cats_score` in the score weights above. See: https://spacy.io/api/textcategorizer#pipe (the anchor in the link looks odd, but look for the example next to `TextCategorizer.initialize`).
And from these results I would guess that your dataset is imbalanced? There are a number of techniques you can try on the training data, like oversampling. There's nothing directly in the spaCy library to support this, but you can look for general advice about dealing with imbalanced datasets.
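For anyone wondering what the oversampling suggestion could look like in practice, here is a minimal sketch in plain Python. It is not part of spaCy: the `oversample` helper, the `POS`/`NEG` labels and the toy data are all hypothetical, and it assumes the training examples are available as `(text, cats)` pairs before they are converted to spaCy's training format. It simply duplicates randomly chosen examples from the smaller class until both classes have the same count.

```python
import random
from collections import defaultdict


def oversample(examples, positive_label, seed=0):
    """Duplicate random examples from the smaller class until both classes match in size.

    examples: list of (text, cats) pairs, cats being a dict of label -> score
    positive_label: label whose score decides which class an example belongs to
    """
    if not examples:
        return []
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for text, cats in examples:
        is_positive = cats.get(positive_label, 0.0) >= 0.5
        by_class[is_positive].append((text, cats))
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw with replacement to fill the gap up to the larger class size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 1 positive example against 4 negative ones.
train = [("loved it", {"POS": 1.0, "NEG": 0.0})]
train += [(f"meh {i}", {"POS": 0.0, "NEG": 1.0}) for i in range(4)]
balanced = oversample(train, "POS")
print(len(train), "->", len(balanced))
```

This should only be applied to the training split. The dev data needs to keep its original class balance, otherwise the scores used for early stopping no longer reflect the real distribution.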