Search a set of strings/documents with fuzzy matching
import FuzzySet from 'fuzzyset';

const docs = ['afghanistan', 'albania', 'algeria', /* ... */];
const set = new FuzzySet(docs);
const results = set.query('al');
// Each result pairs a document index with its match score.
for (const [i, score] of results) {
  console.log(docs[i], score);
}
// albania 1.0
// algeria 0.5
// ...
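Because each result is an [index, score] pair, mapping results back to the original strings and dropping weak matches takes only a few lines. The sketch below is not part of the library: `topMatches` and `minScore` are hypothetical names, and `sampleResults` is hand-written to mirror the example output rather than produced by a real query.

```javascript
// Hypothetical helper, not part of the library's API.
// Maps [index, score] pairs back to their documents and keeps
// only matches at or above minScore, best match first.
function topMatches(docs, results, minScore = 0.5) {
  return [...results]
    .filter(([, score]) => score >= minScore)
    .sort((a, b) => b[1] - a[1])
    .map(([i, score]) => ({ doc: docs[i], score }));
}

// Hand-written sample data (assumed values, not real query output).
const sampleDocs = ['afghanistan', 'albania', 'algeria'];
const sampleResults = [[2, 0.5], [1, 1.0]];

console.log(topMatches(sampleDocs, sampleResults, 0.75));
```

Copying the results with the spread operator before sorting leaves the original result list untouched, which matters if it is reused elsewhere.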
Instantiate a set, given docs and options.
- docs - Array of strings, each of which could be a simple title, the entire text of a document, or a list of keywords.
- synonyms - Object with keys as alias terms pointing to canonical terms. Ex: { nyc: 'New York City', philly: 'Philadelphia' }
- stopwords - Array of terms that should be ignored for having no semantic meaning. Ex: ['a', 'an', 'of', 'the', '&']
- tokenMatchFactor - Weight to give to matching normalized tokens. Default: 0.5.
- tokenPrefixFactor - Weight to give to terms matching prefixes. Default: 0.25.
- prefixFactor - Weight to give to direct document prefix term matching. Default: 0.12.
- vaccuumFactor - Weight to give to consonant terms with vowels removed. Default: 0.8.
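To make the options concrete, here is a minimal sketch of an options object using the keys and defaults listed above, plus a toy illustration of what synonyms, stopwords, and vowel removal could do to a query. The helpers `normalizeTokens` and `stripVowels` are hypothetical and written only for this illustration; they are not the library's implementation. Passing options as a second constructor argument is an assumption inferred from "given docs and options", not a confirmed signature.

```javascript
// Options using the documented keys; numeric values are the listed defaults.
const options = {
  synonyms: { nyc: 'New York City', philly: 'Philadelphia' },
  stopwords: ['a', 'an', 'of', 'the', '&'],
  tokenMatchFactor: 0.5,
  tokenPrefixFactor: 0.25,
  prefixFactor: 0.12,
  vaccuumFactor: 0.8, // spelled exactly as the option is named
};

// Assumed usage (not verified): new FuzzySet(docs, options)

// Hypothetical: lowercase, split on whitespace, drop stopwords,
// and replace alias terms with their lowercased canonical form.
function normalizeTokens(text, { synonyms = {}, stopwords = [] } = {}) {
  const hasAlias = (t) => Object.prototype.hasOwnProperty.call(synonyms, t);
  return text
    .toLowerCase()
    .split(/\s+/)
    .filter((t) => t && !stopwords.includes(t))
    .map((t) => (hasAlias(t) ? synonyms[t].toLowerCase() : t));
}

// Hypothetical: remove vowels so a vowel-less query can be
// compared against a vowel-less form of each term.
function stripVowels(term) {
  return term.replace(/[aeiou]/gi, '');
}

console.log(normalizeTokens('The pizza of NYC', options)); // [ 'pizza', 'new york city' ]
console.log(stripVowels('philadelphia')); // phldlph
```

How the library actually combines the four weights into a final score is not described here, so the sketch stops at preprocessing and does not attempt to reproduce scoring.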