Commit 6b59efc

committed Dec 12, 2020
added models and readme in data folder
1 parent 877ae8e commit 6b59efc

15 files changed: +47291 -1 lines changed

‎.gitignore

-1
@@ -1,4 +1,3 @@
 venv
 .ipynb_checkpoints
-models/
 .idea

‎nlp/1-ner/data/README.md

+15
@@ -0,0 +1,15 @@
+# Data for training
+
+In this directory you can find the data for the [Annotated Corpus for Named Entity Recognition
+](https://www.kaggle.com/abhinavwalia95/entity-annotated-corpus?select=ner_dataset.csv) dataset from Kaggle.
+
+The dataset is provided in three different formats:
+
+* [iob scheme](iob_ner_dataset.csv): not used in the notebooks, but it can be used in place
+of the biluo dataset in the [bilstm ner notebook](../notebooks/bilstm%20ner.ipynb)
+* [biluo scheme](biluo_ner_dataset.csv): used in the [bilstm ner notebook](../notebooks/bilstm%20ner.ipynb)
+* [jsonl scheme](ner_jsonò_dataset.pickle): used in the [spacy ner notebook](../notebooks/spacy%20ner.ipynb)
+
+This directory also contains a notebook with code to convert the dataset from one format to another.
+In particular, you can convert the dataset from the iob to the biluo scheme, from the biluo scheme to the jsonl format, and
+vice versa.
File renamed without changes.
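
The README above mentions converting between the iob and biluo tagging schemes. The conversion notebook itself is not shown in this diff, so the following is only a minimal sketch of how such a conversion could look, not the code from the commit. The function names `iob_to_biluo` and `biluo_to_iob` are hypothetical, and the sketch assumes tags of the form `B-geo`, `I-geo`, `O` for a single sentence.

```python
from typing import List


def iob_to_biluo(tags: List[str]) -> List[str]:
    """Convert one sentence of IOB tags (B-x, I-x, O) to BILUO tags.

    Hypothetical helper: assumes well-formed input where every I-x
    follows a B-x or I-x with the same label.
    """
    biluo = []
    for i, tag in enumerate(tags):
        if tag == "O":
            biluo.append("O")
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            # A B- tag starts a multi-token entity, or is a single-token Unit.
            biluo.append(f"B-{label}" if continues else f"U-{label}")
        else:
            # An I- tag is either Inside the entity or its Last token.
            biluo.append(f"I-{label}" if continues else f"L-{label}")
    return biluo


def biluo_to_iob(tags: List[str]) -> List[str]:
    """Convert BILUO tags back to IOB: U- becomes B-, L- becomes I-."""
    mapping = {"U": "B", "L": "I"}
    iob = []
    for tag in tags:
        if tag == "O":
            iob.append("O")
            continue
        prefix, label = tag.split("-", 1)
        iob.append(f"{mapping.get(prefix, prefix)}-{label}")
    return iob


example = ["B-per", "I-per", "O", "B-geo"]
converted = iob_to_biluo(example)
print(converted)
print(biluo_to_iob(converted) == example)
```

Because BILUO only refines the IOB prefixes, converting to BILUO and back should return the original tag sequence for well-formed input.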

‎nlp/1-ner/models/bilstm.h5

42.2 MB
Binary file not shown.

‎nlp/1-ner/models/index_to_tag.pickle

358 Bytes
Binary file not shown.

‎nlp/1-ner/models/ner_model/meta.json

+36
@@ -0,0 +1,36 @@
+{
+  "lang":"en",
+  "name":"ner_model",
+  "version":"0.0.0",
+  "spacy_version":">=2.3.4",
+  "description":"",
+  "author":"",
+  "email":"",
+  "url":"",
+  "license":"",
+  "spacy_git_version":"6fb3e4796",
+  "vectors":{
+    "width":0,
+    "vectors":0,
+    "keys":0,
+    "name":"spacy_pretrained_vectors"
+  },
+  "pipeline":[
+    "ner"
+  ],
+  "factories":{
+    "ner":"ner"
+  },
+  "labels":{
+    "ner":[
+      "art",
+      "eve",
+      "geo",
+      "gpe",
+      "nat",
+      "org",
+      "per",
+      "tim"
+    ]
+  }
+}

‎nlp/1-ner/models/ner_model/ner/cfg

+18
@@ -0,0 +1,18 @@
+{
+  "beam_width":1,
+  "beam_density":0.0,
+  "beam_update_prob":1.0,
+  "cnn_maxout_pieces":3,
+  "nr_feature_tokens":6,
+  "nr_class":34,
+  "hidden_depth":1,
+  "token_vector_width":96,
+  "hidden_width":64,
+  "maxout_pieces":2,
+  "pretrained_vectors":null,
+  "bilstm_depth":0,
+  "self_attn_depth":0,
+  "conv_depth":4,
+  "conv_window":1,
+  "embed_size":2000
+}
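
The `nr_class` value of 34 in this cfg can be cross-checked against the transition table stored in the `ner/moves` file of the same commit. The sketch below copies that table's JSON payload and counts its entries per action. The action names in `ACTION_NAMES` are an assumption about how spaCy 2.x numbers its BILUO transitions and are not stated anywhere in this diff. That `nr_class` is derived from this count is also an inference from the numbers lining up.

```python
# Entity labels and their IDs as they appear in the ner/moves file of this commit.
LABELS = {"geo": -1, "tim": -2, "gpe": -3, "per": -4,
          "org": -5, "art": -6, "nat": -7, "eve": -8}

# Transition table mirroring the JSON payload of ner/moves.
MOVES = {
    "0": {},
    "1": dict(LABELS),
    "2": dict(LABELS),
    "3": dict(LABELS),
    "4": {"": 1, **LABELS},
    "5": {"": 1},
}

# Assumed mapping from action IDs to names (not taken from the diff).
ACTION_NAMES = {
    "0": "MISSING",
    "1": "BEGIN",
    "2": "IN",
    "3": "LAST",
    "4": "UNIT",
    "5": "OUT",
}


def count_transitions(moves):
    """Return the number of (action, label) entries for each action."""
    return {ACTION_NAMES[action]: len(entries) for action, entries in moves.items()}


per_action = count_transitions(MOVES)
total = sum(per_action.values())
print(per_action)
print(total)
```

With eight entity labels, the three actions 1 to 3 contribute 8 entries each, action 4 contributes 9 (the eight labels plus an empty label), and action 5 contributes 1, giving 24 + 9 + 1 = 34.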

‎nlp/1-ner/models/ner_model/ner/model

3.82 MB
Binary file not shown.

‎nlp/1-ner/models/ner_model/ner/moves

+1
@@ -0,0 +1 @@
+moves {"0":{},"1":{"geo":-1,"tim":-2,"gpe":-3,"per":-4,"org":-5,"art":-6,"nat":-7,"eve":-8},"2":{"geo":-1,"tim":-2,"gpe":-3,"per":-4,"org":-5,"art":-6,"nat":-7,"eve":-8},"3":{"geo":-1,"tim":-2,"gpe":-3,"per":-4,"org":-5,"art":-6,"nat":-7,"eve":-8},"4":{"":1,"geo":-1,"tim":-2,"gpe":-3,"per":-4,"org":-5,"art":-6,"nat":-7,"eve":-8},"5":{"":1}}

‎nlp/1-ner/models/ner_model/tokenizer

+4
Large diffs are not rendered by default.
+1
@@ -0,0 +1 @@
+
@@ -0,0 +1 @@
+lexeme_norm
