Scripts for testing lexical data in GiellaLT, plus some Python scripts for processing lexc files.
Uses pyhfst to load HFST automata. Run poetry install to install dependencies.
Spell-checker testing uses divvunspell binaries. You can install divvunspell with cargo.
You can install giellaltlextools with pipx:
$ pipx install git+https://github.com/divvun/giellaltlextools
This project uses Poetry's build system to ensure optimal pyhfst installation.
The project is configured to automatically optimize pyhfst
installation with Cython for better performance:
- Build System: Declares Cython as a build-time requirement
- Build Script: scripts/build.py automatically handles pyhfst optimization
- Dependencies: Cython is included as both a runtime and build dependency
The build script runs automatically during poetry install and poetry build, ensuring pyhfst is always installed with Cython support when available.
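To check by hand whether the relevant modules are importable in the current environment, a small Python sketch like the following can help. It is not taken from scripts/build.py; the helper name module_available is hypothetical, and the check only tells you that a module can be found, not whether pyhfst was actually compiled with Cython.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be found.

    Hypothetical helper for illustration; it does not inspect how
    the module was built.
    """
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Report on the two modules this project cares about.
    for name in ("Cython", "pyhfst"):
        status = "found" if module_available(name) else "not found"
        print(f"{name}: {status}")
```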
These tools are mainly run from make check in the GiellaLT infrastructure.
There are currently three programs installed:
- gtlemmatest for testing that a generator generates the lemmas found in a lexc file
- gtparadigmtest for testing that a generator generates full paradigms of the lemmas
- gtspelltest for testing that a spell checker accepts lemmas from lexc files
$ gtlemmatest -l src/fst/morphology/stems/nouns.lexc \
-a src/fst/analyser-gt-desc.hfstol \
-g src/fst/generator-gt-desc.hfstol \
-t +N+Sg+Nom -t +N+Pl+Nom
The lexc files should mainly contain lexc lines that contain full lemma forms.
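As a rough illustration of what a lemma test of this kind checks, the Python sketch below appends each tag string to a lemma, passes the result to a generator, and reports the lemmas that never come back as a surface form. The function name find_missing_lemmas, the dictionary standing in for a generator transducer, and the rule that one matching tag is enough are all assumptions made for illustration; they are not taken from gtlemmatest itself.

```python
def find_missing_lemmas(lemmas, tags, generate):
    """Return the lemmas that no tag string regenerates.

    `generate` is any callable that maps an input string such as
    "guolli+N+Sg+Nom" to a collection of surface forms.
    Assumption: a lemma passes if at least one tag yields it.
    """
    missing = []
    for lemma in lemmas:
        if not any(lemma in generate(lemma + tag) for tag in tags):
            missing.append(lemma)
    return missing


# Stand-in for a real generator automaton: a plain dictionary lookup.
FAKE_GENERATOR = {
    "guolli+N+Sg+Nom": {"guolli"},
}


def fake_generate(analysis):
    """Look up surface forms; unknown inputs generate nothing."""
    return FAKE_GENERATOR.get(analysis, set())


missing = find_missing_lemmas(
    ["guolli", "beana"],
    ["+N+Sg+Nom", "+N+Pl+Nom"],
    fake_generate,
)
print(missing)
```

Here the stand-in generator knows nothing about beana, so that lemma is the one reported as missing. With a real automaton, the generate callable would wrap a lookup in the loaded generator instead of a dictionary.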
$ gtparadigmtest -l src/fst/morphology/stems/nouns.lexc \
-p src/fst/morphology/test/testnounparadigm.txt \
-g src/fst/generator-gt-desc.hfstol
The lexc files should mainly contain lexc lines that contain full lemma forms.
$ gtspelltest -z tools/spellcheckers/se.zhfst -D divvunspell \
src/fst/morphology/stems/*.lexc
The lexc files should mainly contain lexc lines that contain full lemma forms.
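To make "lexc lines that contain full lemma forms" more concrete, here is a minimal Python sketch that pulls the lemma side out of simple lexc entry lines of the form lemma:stem CONTLEX ; or lemma CONTLEX ;. The regular expression and the function name extract_lemmas are assumptions for illustration, not the parser these tools use, and the sketch ignores lexc features such as % escapes, multichar symbols, and trailing comments.

```python
import re

# Simplified, hypothetical pattern for one lexc entry line:
#   upper[:lower] CONTLEX ["gloss"] ;
ENTRY = re.compile(
    r'^\s*(?P<upper>[^\s:!;]+)(?::\S*)?\s+(?P<contlex>\S+)\s*(?:"[^"]*")?\s*;'
)


def extract_lemmas(lexc_text):
    """Collect the upper (lemma) side of every line that looks like an entry."""
    lemmas = []
    for line in lexc_text.splitlines():
        match = ENTRY.match(line)
        if match:
            lemmas.append(match.group("upper"))
    return lemmas


SAMPLE = """\
LEXICON Nouns
guolli:guoll GUOLLI ;
beana BEANA "dog" ;
! a comment line
"""

print(extract_lemmas(SAMPLE))
```

Lines whose upper side is only a stem fragment or a tag would be collected just the same, which is why the test programs work best on files where the entries are full lemmas.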