This project consists of the following tasks:
- Fine-tune German BERT on Legal Data.
- Create a minimal front-end that accepts a German sentence and shows its NER analysis.
 
- The entire process of fine-tuning German BERT on Legal Data is available in `german_bert_ner.ipynb`.
- This notebook also contains brief descriptions wherever necessary.
 
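For orientation, here is a minimal sketch of what a single token-classification fine-tuning step looks like with the Hugging Face `transformers` library. This is illustrative only: the tag set, the word-level labels, and the hyper-parameters below are placeholders, not the exact configuration used in the notebook.

```python
# Illustrative sketch only: tag set, labels, and hyper-parameters are
# placeholders, not the exact values used in german_bert_ner.ipynb.
import torch
from transformers import BertForTokenClassification, BertTokenizer

# Hypothetical tag set; the real one comes from the legal NER dataset.
tag_values = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG"]
tag2idx = {t: i for i, t in enumerate(tag_values)}

tokenizer = BertTokenizer.from_pretrained("bert-base-german-cased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-german-cased", num_labels=len(tag_values)
)

# One toy training step on a single (sentence, labels) pair.
sentence = "Das Bundesarbeitsgericht ist zuständig ."
word_labels = ["O", "B-ORG", "O", "O", "O"]  # one (assumed) label per whitespace token

# Tokenize word by word so each word-level label can be repeated
# over all of that word's WordPiece sub-tokens.
tokens, token_labels = [], []
for word, label in zip(sentence.split(), word_labels):
    pieces = tokenizer.tokenize(word)
    tokens.extend(pieces)
    token_labels.extend([label] * len(pieces))

input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
label_ids = torch.tensor([[tag2idx[l] for l in token_labels]])

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()
loss = model(input_ids, labels=label_ids).loss
loss.backward()
optimizer.step()
```

A full training loop would additionally handle padding, attention masks, batching, and multiple epochs.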
To run this project on localhost, follow these simple steps:

- Create a virtual environment:

  ```
  conda create -n german_bert_ner python=3.9
  ```

- Activate the virtual environment:

  ```
  conda activate german_bert_ner
  ```

- Clone this repo:

  ```
  git clone https://github.com/harshildarji/German-NER-BERT.git
  ```

- `cd` into the repo:

  ```
  cd German-NER-BERT
  ```

- Install the required packages:

  ```
  pip3 install -r requirements.txt
  ```

- Next, we need three important files: `model.pt`, `tag_values.pkl`, and `tokenizer.pkl`. You can either generate these files by running `german_bert_ner.ipynb`, which takes 45-60 minutes, or download the latest versions from my Dropbox (a minimal sketch of how these files can be used for inference follows the steps below):

  ```
  wget https://www.dropbox.com/s/vos8pqwmlbqe0wf/model.pt
  wget https://www.dropbox.com/s/u2oojgmmprt0a9d/tag_values.pkl
  wget https://www.dropbox.com/s/uj15pab78emefoq/tokenizer.pkl
  ```

- Once the above-mentioned files are generated or downloaded, run `app.py`:

  ```
  python3 app.py
  ```

- Once `app.py` is successfully executed, head over to http://localhost:5000/.
- In the provided text area, input a German (law) sentence, for example:
  1. Das Bundesarbeitsgericht ist gemäß § 9 Abs. 2 Satz 2 ArbGG iVm. § 201 Abs. 1 Satz 2 GVG für die beabsichtigte Klage gegen den Bund zuständig.
- The final output shows the NER analysis of the entered sentence.
 
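For reference, the sketch below shows one way the three downloaded files could be used for inference outside the front-end. It is an assumption-laden sketch, not the exact logic in `app.py`: it assumes `model.pt` holds the full fine-tuned model saved with `torch.save`, and that `tokenizer.pkl` and `tag_values.pkl` were pickled with Python's standard `pickle` module.

```python
# Sketch only; assumes model.pt stores the whole fine-tuned model and the
# .pkl files store the tokenizer and the list of tag names.
import pickle
import torch

# Newer PyTorch versions may require torch.load(..., weights_only=False)
# to unpickle a full model object.
model = torch.load("model.pt", map_location="cpu")
with open("tokenizer.pkl", "rb") as f:
    tokenizer = pickle.load(f)
with open("tag_values.pkl", "rb") as f:
    tag_values = pickle.load(f)

sentence = (
    "Das Bundesarbeitsgericht ist gemäß § 9 Abs. 2 Satz 2 ArbGG iVm. "
    "§ 201 Abs. 1 Satz 2 GVG für die beabsichtigte Klage gegen den Bund zuständig."
)

# Tokenize into WordPiece sub-tokens and predict one tag per sub-token.
tokens = tokenizer.tokenize(sentence)
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

model.eval()
with torch.no_grad():
    outputs = model(input_ids)
logits = outputs[0]  # works for both tuple and ModelOutput return types

predictions = logits.argmax(dim=-1).squeeze().tolist()
for token, tag_idx in zip(tokens, predictions):
    print(f"{token}\t{tag_values[tag_idx]}")
```

Tags are predicted per WordPiece sub-token; the front-end presumably merges sub-tokens back into whole words before displaying the analysis.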
- Leitner, Elena, Georg Rehm, and Julián Moreno-Schneider. "A Dataset of German Legal Documents for Named Entity Recognition." arXiv preprint arXiv:2003.13016 (2020).
 
