How to use lemmatizer with span categorizer? #9201

Do you want to train the tagger, or just use the pre-trained tagger from an existing pipeline? I assume you want to do the latter, in which case you should have something like this:

[components.tagger]
source = "en_core_web_lg"

And you can repeat that for the attribute ruler and lemmatizer. That will load all those components from the existing pipeline without changes. See this section in the docs. (You probably also want to freeze these components, since they don't need to be updated while you train the spancat.)
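As a rough sketch of what the relevant parts of the config might look like with all three components sourced and frozen: the pipeline order, the sourced `tok2vec` entry, and the bare `spancat` block are assumptions for illustration, not taken from the original config. The sourced `tok2vec` is included on the assumption that the tagger in `en_core_web_lg` listens to that pipeline's shared tok2vec; if it does not, that block and its pipeline entry can be dropped.

```ini
[nlp]
lang = "en"
# Assumed order: sourced components first, then the spancat being trained
pipeline = ["tok2vec","tagger","attribute_ruler","lemmatizer","spancat"]

# Assumption: the sourced tagger needs the tok2vec it was trained with
[components.tok2vec]
source = "en_core_web_lg"

[components.tagger]
source = "en_core_web_lg"

[components.attribute_ruler]
source = "en_core_web_lg"

[components.lemmatizer]
source = "en_core_web_lg"

# Placeholder: keep your existing spancat settings (with its own internal tok2vec) here
[components.spancat]
factory = "spancat"

[training]
# Sourced components are loaded as-is and not updated during training
frozen_components = ["tok2vec","tagger","attribute_ruler","lemmatizer"]
```

With a setup along these lines, only the spancat would be updated during training, while the tagger, attribute ruler, and lemmatizer keep the weights and rules they had in the source pipeline.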

The error is happening because you are using a blank tagger and your pipeline has no tok2vec for it to get input from - your only tok2vec is the one included inside the spancat, which isn't a…

Answer selected by svlandeg
Labels
feat / lemmatizer Feature: Rule-based and lookup lemmatization