This repository has been archived by the owner on Feb 1, 2024. It is now read-only.

Decide on a path forward for thefuzz dependency #2010

Open
nanotubing opened this issue Jul 29, 2022 · 1 comment
Labels
task 43 infrastructure tech debt Identifies opportunities for code refactoring and cleanup

Comments

@nanotubing
Contributor

By default, the thefuzz library depends on SequenceMatcher from the difflib module. When it is used this way, thefuzz prints a warning that this is a slow implementation and recommends installing python-Levenshtein instead.

This warning has been suppressed in #2009.
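
For reference, a minimal sketch of one way such a warning can be silenced with the standard `warnings` module. This is not necessarily how #2009 does it. The message prefix is an assumption about the warning text and should be checked against the installed thefuzz version, and `load_matcher_stub` is a hypothetical stand-in for the real `from thefuzz import fuzz` so the snippet runs without thefuzz installed.

```python
import warnings

# Assumed prefix of the warning text; verify against the installed version.
SLOW_MATCHER_PREFIX = "Using slow pure-python SequenceMatcher"


def load_matcher_stub():
    """Hypothetical stand-in for `from thefuzz import fuzz`."""
    warnings.warn(
        SLOW_MATCHER_PREFIX
        + ". Install python-Levenshtein to remove this warning"
    )
    return "fuzz"


def load_quietly():
    """Run the import step with only the slow-matcher warning filtered."""
    with warnings.catch_warnings():
        # `message` is a regex matched against the start of the warning text.
        warnings.filterwarnings("ignore", message=SLOW_MATCHER_PREFIX)
        return load_matcher_stub()
```

Scoping the filter with `catch_warnings()` keeps other warnings visible, which matters if the suppression is meant to be temporary.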

We should decide whether a more permanent solution makes sense for this project. Two options seem possible so far:

  1. The recommended module, python-Levenshtein, is an older module written in C. Its author is looking for a new maintainer, so it is unclear whether the project is actively maintained and how long it will continue to build:

     > I am looking for a new maintainer to the project as it is apparent that I haven't had the need for this particular library for well over 7 years now, due to it being a C-only library and its somewhat restrictive original license.

  2. There is also an open Pull Request to replace python-Levenshtein with rapidfuzz, which appears to be an actively maintained, more modern implementation.
@nanotubing nanotubing added the tech debt Identifies opportunities for code refactoring and cleanup label Jul 29, 2022
@maxbachmann

> The recommended module python-Levenshtein is an older module written in C, and they are asking for maintainers, so there is doubt surrounding whether or not this project is actively maintained, and how long it will continue to build:

I actively maintain a fork of the python-Levenshtein project: https://github.com/maxbachmann/Levenshtein

> There is also an open seatgeek/thefuzz#10 to replace python-Levenshtein with rapidfuzz, which appears to be an actively maintained, more modern implementation

This is probably the simpler variant, since it just requires you to update the import.
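
A rough sketch of what the import change could look like, assuming the project only calls `fuzz.ratio`. The `similarity` helper is hypothetical and not part of either library. The `difflib` fallback is there only so the snippet runs when rapidfuzz is not installed. One difference to watch for: rapidfuzz returns floats, while thefuzz returns integer scores.

```python
from difflib import SequenceMatcher

try:
    # Before: from thefuzz import fuzz
    # After (assumes `pip install rapidfuzz`):
    from rapidfuzz import fuzz
except ImportError:
    fuzz = None  # fall back to the pure-Python matcher below


def similarity(a: str, b: str) -> float:
    """Hypothetical helper: return a 0-100 similarity score."""
    if fuzz is not None:
        return float(fuzz.ratio(a, b))
    # Pure-Python fallback, scaled to the same 0-100 range.
    return 100.0 * SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    print(similarity("new york mets", "new york meats"))
```

Call sites that compare scores against integer thresholds should keep working, but any code that expects an `int` return value would need a check.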

@jwalgran jwalgran self-assigned this Aug 18, 2022
@obrienad obrienad added the task 43 infrastructure label Sep 8, 2022
@obrienad obrienad added this to the Q3 2022 milestone Sep 29, 2022
4 participants