You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardexpand all lines: docs/config.qmd
+2-1
Original file line number
Diff line number
Diff line change
@@ -166,14 +166,15 @@ datasets:
166
166
# IMPORTANT: The following fields determine which parts of the conversation to train on.
167
167
# Priority order: message_field_training > message_field_training_detail > train_on_inputs or role in roles_to_train
168
168
# See examples at `docs/dataset-formats/conversation.qmd`
169
-
# Note: If the below 4 fields are empty, defaults to training only on the last message.
169
+
# Note: If the below 4 fields are set to empty, defaults to training only on the last message.
170
170
171
171
# Optional[List[str]]. Roles to train on. The tokens from these roles will be considered for the loss.
172
172
roles_to_train: ["assistant"] # default
173
173
# Optional[str]. Which EOS tokens to train on in the conversation. Possible values are:
174
174
# - all: train on all EOS tokens
175
175
# - turn (default): train on the EOS token at the end of each trainable turn
176
176
# - last: train on the last EOS token in the conversation
177
+
# TIP: Please make sure that your `tokenizer.eos_token` is same as EOS/EOT token in template. Otherwise, set `eos_token` under `special_tokens`.
177
178
train_on_eos: last
178
179
# The key in the message turn that indicates via boolean whether tokens of a turn should be considered for training. Useful to selectively train on certain turns besides the `roles_to_train`.
**Q: `Content end boundary is the same as start boundary for turn ___. This is likely an empty turn.`**
47
47
48
48
> A: This is likely an empty turn.
49
+
50
+
**Q: The EOS/EOT token is incorrectly being masked or not being masked.**
51
+
52
+
> A: This is because of the mismatch between `tokenizer.eos_token` and EOS/EOT token in template. Please make sure to set `eos_token` under `special_tokens` to the same EOS/EOT token as in template.
0 commit comments