@huisman
Contributor

@huisman huisman commented Oct 29, 2025

Fix topic labels when all documents match zero-shot topics by updating the topic id mapping.

What does this PR do?

When every document matches a zero-shot topic, the topic labels are not set to the zero-shot topic names.
This PR tries to fix that by updating the topic ID mapping and the topic sizes when there are no non-zero-shot topics.
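
A minimal sketch of the intended behavior, written in plain Python and not taken from BERTopic's internals. The helper `relabel_zeroshot_only` and its size-ranked ordering are assumptions for illustration only; the point is that the ID mapping, the sizes, and the labels should all be derived even when no non-zero-shot topics exist.

```python
from collections import Counter


def relabel_zeroshot_only(assigned_topics, zeroshot_topic_list):
    """Hypothetical helper: remap zero-shot indices to size-ranked topic ids.

    assigned_topics[i] is the index into zeroshot_topic_list for document i.
    """
    sizes = Counter(assigned_topics)
    # Assumption: topic ids are ordered by descending size, ties by original index.
    ranked = sorted(sizes, key=lambda t: (-sizes[t], t))
    mapping = {old: new for new, old in enumerate(ranked)}

    # Apply the same mapping to predictions, sizes, and labels so they stay in sync.
    topics = [mapping[t] for t in assigned_topics]
    topic_sizes = {mapping[old]: count for old, count in sizes.items()}
    topic_labels = {mapping[old]: zeroshot_topic_list[old] for old in sizes}
    return topics, topic_sizes, topic_labels


zeroshot_topic_list = ["sports", "politics", "science"]
assigned_topics = [1, 1, 0, 1, 2, 2]  # every document matched a zero-shot topic
topics, topic_sizes, topic_labels = relabel_zeroshot_only(
    assigned_topics, zeroshot_topic_list
)
```

In this toy case the largest zero-shot topic ("politics") ends up as topic 0 and keeps its zero-shot name as its label.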

Fixes #2447

Before submitting

  • This PR fixes a typo or improves the docs (if yes, ignore all other checks!).
  • Did you read the contributor guideline?
  • Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes (if applicable)?
  • Did you write any new necessary tests?

Fix topic labels when all documents match zero-shot topics by updating the topic ID mapping
@MaartenGr
Owner

Thank you for the PR! Could you perhaps give an example of how it fixes the issue? That way, I can run some tests.

Also, did you test whether the predictions are still correct? I wonder whether the updated topics would properly propagate the predictions through the mapping.

Lastly, I think you duplicated this:

# All documents matches zero-shot topics
documents = assigned_documents
embeddings = assigned_embeddings

Which might break things depending on how it was ordered.
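
One way to check the prediction and ordering concerns above, sketched in plain Python with made-up data. `predictions_follow_mapping` is a hypothetical helper, not part of BERTopic; it only verifies that each remapped prediction equals the mapped value of the original assignment at the same position.

```python
def predictions_follow_mapping(original, remapped, mapping):
    """Return True if every remapped prediction equals mapping[original id]."""
    return len(original) == len(remapped) and all(
        mapping[old] == new for old, new in zip(original, remapped)
    )


# Hypothetical data: old zero-shot index -> new topic id.
mapping = {1: 0, 2: 1, 0: 2}
original = [1, 1, 0, 1, 2, 2]

remapped_ok = [mapping[t] for t in original]
# Simulates predictions whose order no longer matches the documents.
remapped_shuffled = sorted(remapped_ok)

ok = predictions_follow_mapping(original, remapped_ok, mapping)
broken = predictions_follow_mapping(original, remapped_shuffled, mapping)
```

A check like this would fail if documents and embeddings were reassigned in a different order than the predictions.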

Development

Successfully merging this pull request may close these issues.

Topic names not set to zero-shot topic when fit_transform results in only zero-shot topics