Update pretrained_models.md #13544

Open · wants to merge 1 commit into master
22 changes: 11 additions & 11 deletions official/nlp/docs/pretrained_models.md
@@ -16,8 +16,8 @@ models.

### How to Initialize from Checkpoint

-**Note:** TF-HUB/Savedmodel is the preferred way to distribute models as it is
-self-contained. Please consider using TF-HUB for finetuning tasks first.
+**Note:** TF-HUB/Kaggle SavedModel is the preferred way to distribute models, as it is
+self-contained. Please consider using TF-HUB/Kaggle for finetuning tasks first.

If you use the [NLP training library](train.md),
you can specify the checkpoint path link directly when launching your job. For
@@ -29,10 +29,10 @@ python3 train.py \
--params_override=task.init_checkpoint=PATH_TO_INIT_CKPT
```
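Before pointing a job at a checkpoint, it can be worth confirming what the downloaded archive actually contains. A minimal sketch in Python, assuming one of the tarballs below has been extracted locally (the file name inside the archive is an assumption):

```python
import tensorflow as tf

# Hypothetical path after extracting uncased_L-12_H-768_A-12.tar.gz; the
# exact layout inside the archive is an assumption, adjust as needed.
ckpt_path = "uncased_L-12_H-768_A-12/bert_model.ckpt"

# List every variable stored in the checkpoint with its shape; a quick way
# to verify the checkpoint before passing it as task.init_checkpoint.
for name, shape in tf.train.list_variables(ckpt_path):
    print(name, shape)
```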

-### How to load TF-HUB SavedModel
+### How to load TF-HUB/Kaggle SavedModel

Finetuning tasks such as question answering (SQuAD) and sentence
-prediction (GLUE) support loading a model from TF-HUB. These built-in tasks
+prediction (GLUE) support loading a model from TF-HUB/Kaggle. These built-in tasks
support a specific `task.hub_module_url` parameter. To set this parameter,
replace `--params_override=task.init_checkpoint=...` with
`--params_override=task.hub_module_url=TF_HUB_URL`, like below:
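A sketch of what that invocation presumably looks like, mirroring the earlier command:

```
python3 train.py \
  --params_override=task.hub_module_url=TF_HUB_URL
```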
@@ -54,7 +54,7 @@ in order to keep consistent with BERT paper.

### Checkpoints

-Model | Configuration | Training Data | Checkpoint & Vocabulary | TF-HUB SavedModels
+Model | Configuration | Training Data | Checkpoint & Vocabulary | Kaggle SavedModels
---------------------------------------- | :--------------------------: | ------------: | ----------------------: | ------:
BERT-base uncased English | uncased_L-12_H-768_A-12 | Wiki + Books | [uncased_L-12_H-768_A-12](https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/uncased_L-12_H-768_A-12.tar.gz) | [`BERT-Base, Uncased`](https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/)
BERT-base cased English | cased_L-12_H-768_A-12 | Wiki + Books | [cased_L-12_H-768_A-12](https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/cased_L-12_H-768_A-12.tar.gz) | [`BERT-Base, Cased`](https://tfhub.dev/tensorflow/bert_en_cased_L-12_H-768_A-12/)
@@ -74,7 +74,7 @@ We also have pretrained BERT models with variants in both network architecture
and training methodologies. These models achieve higher downstream accuracy
scores.

-Model | Configuration | Training Data | TF-HUB SavedModels | Comment
+Model | Configuration | Training Data | Kaggle SavedModels | Comment
-------------------------------- | :----------------------: | -----------------------: | ------------------------------------------------------------------------------------: | ------:
BERT-base talking heads + ggelu | uncased_L-12_H-768_A-12 | Wiki + Books | [talkheads_ggelu_base](https://tfhub.dev/tensorflow/talkheads_ggelu_bert_en_base/1) | BERT-base trained with [talking heads attention](https://arxiv.org/abs/2003.02436) and [gated GeLU](https://arxiv.org/abs/2002.05202).
BERT-large talking heads + ggelu | uncased_L-24_H-1024_A-16 | Wiki + Books | [talkheads_ggelu_large](https://tfhub.dev/tensorflow/talkheads_ggelu_bert_en_large/1) | BERT-large trained with [talking heads attention](https://arxiv.org/abs/2003.02436) and [gated GeLU](https://arxiv.org/abs/2002.05202).
@@ -96,12 +96,12 @@ ALBERT repository.

### Checkpoints

-Model | Training Data | Checkpoint & Vocabulary | TF-HUB SavedModels
+Model | Training Data | Checkpoint & Vocabulary | Kaggle SavedModels
---------------------------------------- | ------------: | ----------------------: | ------:
-ALBERT-base English | Wiki + Books | [`ALBERT Base`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_base.tar.gz) | https://tfhub.dev/tensorflow/albert_en_base/3
-ALBERT-large English | Wiki + Books | [`ALBERT Large`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_large.tar.gz) | https://tfhub.dev/tensorflow/albert_en_large/3
-ALBERT-xlarge English | Wiki + Books | [`ALBERT XLarge`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_xlarge.tar.gz) | https://tfhub.dev/tensorflow/albert_en_xlarge/3
-ALBERT-xxlarge English | Wiki + Books | [`ALBERT XXLarge`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_xxlarge.tar.gz) | https://tfhub.dev/tensorflow/albert_en_xxlarge/3
+ALBERT-base English | Wiki + Books | [`ALBERT Base`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_base.tar.gz) | [albert_en_base](https://tfhub.dev/tensorflow/albert_en_base/3)
+ALBERT-large English | Wiki + Books | [`ALBERT Large`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_large.tar.gz) | [albert_en_large](https://tfhub.dev/tensorflow/albert_en_large/3)
+ALBERT-xlarge English | Wiki + Books | [`ALBERT XLarge`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_xlarge.tar.gz) | [albert_en_xlarge](https://tfhub.dev/tensorflow/albert_en_xlarge/3)
+ALBERT-xxlarge English | Wiki + Books | [`ALBERT XXLarge`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_xxlarge.tar.gz) | [albert_en_xxlarge](https://tfhub.dev/tensorflow/albert_en_xxlarge/3)
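Any of the SavedModels above can be loaded directly as a Keras layer. A minimal sketch, assuming `tensorflow_hub` is installed; the dict-shaped inputs match the published documentation for this model family, and the zero tensors are placeholders for the output of the matching preprocessing model:

```python
import tensorflow as tf
import tensorflow_hub as hub

# Versioned handle taken from the table above.
encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/albert_en_base/3", trainable=True)

# Placeholder inputs; in practice these come from the matching
# preprocessing model (token ids, attention mask, segment ids).
seq = tf.zeros([1, 128], dtype=tf.int32)
inputs = dict(input_word_ids=seq, input_mask=seq, input_type_ids=seq)

outputs = encoder(inputs)
print(outputs["pooled_output"].shape)    # (1, 768) for the base model
print(outputs["sequence_output"].shape)  # (1, 128, 768)
```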


## ELECTRA