AI model for resolving duplicate records in the Czech National Oncology Register. A bachelor's thesis about this project is available here.
- Python $\ge 3.10$
Install the package by either:
- Downloading the repository as a ZIP file by clicking the badge at the beginning of this README file.
- Cloning the repository.
Run the file install.ps1 either by right-clicking it and selecting Run with PowerShell or by running the following command in PowerShell:

    .\install.ps1

Create a virtual environment (optional but recommended):

    python -m venv venv

Activate the virtual environment:

    source venv/bin/activate

Install the requirements:

    pip install .

Edit the paths in the scripts/constants.py file, e.g. the paths to the data or the model.
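As a rough illustration of the kind of settings such a constants file might hold, here is a minimal sketch. Every name in it (`PROJECT_ROOT`, `DATA_DIR`, `RAW_DATA_PATH`, `PREPARED_DATA_PATH`, `MODEL_PATH`, `missing_inputs`) is hypothetical and chosen only for this example; the actual contents of scripts/constants.py may differ.

```python
"""Hypothetical sketch of a constants module; all names here are illustrative."""
from pathlib import Path

# Assumption: data and model files live under the current working directory.
PROJECT_ROOT = Path.cwd()
DATA_DIR = PROJECT_ROOT / "data"

# Assumed file names; replace them with the real locations of your files.
RAW_DATA_PATH = DATA_DIR / "raw_records.csv"
PREPARED_DATA_PATH = DATA_DIR / "prepared_records.csv"
MODEL_PATH = PROJECT_ROOT / "models" / "model.pkl"


def missing_inputs(paths):
    """Return the given paths that do not exist, so they can be reported before a run."""
    return [path for path in paths if not path.exists()]
```

Calling `missing_inputs([RAW_DATA_PATH])` before a run is one simple way to catch a mistyped path early.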
In a terminal, run the command:

    nor-cleaner [-h] {prepare,train,predict,evaluate} ...

- Prepare the data for training.

      nor-cleaner prepare

- Train the model.

      nor-cleaner train

- Predict whether to preserve or drop a record.

      nor-cleaner predict

- (Optional) Evaluate the model using cross-validation on the training data.

      nor-cleaner evaluate

NOTE: Use the constants file in scripts/constants.py to set paths.
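The command layout above (one program with four subcommands) resembles what Python's standard `argparse` module produces with subparsers. The following is a minimal sketch of such a command-line interface, not the project's actual code: the functions `build_parser` and `main` are hypothetical, and the real subcommands may accept additional options.

```python
import argparse

# Subcommand names and descriptions taken from the usage text above.
SUBCOMMANDS = [
    ("prepare", "Prepare the data for training."),
    ("train", "Train the model."),
    ("predict", "Predict whether to preserve or drop a record."),
    ("evaluate", "Evaluate the model using cross-validation on the training data."),
]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per pipeline step (illustrative only)."""
    parser = argparse.ArgumentParser(prog="nor-cleaner")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name, help_text in SUBCOMMANDS:
        subparsers.add_parser(name, help=help_text)
    return parser


def main(argv=None) -> str:
    """Parse the arguments and return the chosen subcommand name.

    A real implementation would dispatch to the matching pipeline step here;
    this sketch only reports which step was requested.
    """
    args = build_parser().parse_args(argv)
    return args.command
```

For example, `main(["train"])` returns `"train"`, while an unknown subcommand makes `argparse` print an error and exit.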