Commit 2dcc435

fix on-policy data order
1 parent 54545e8 commit 2dcc435

1 file changed: +1 -1 lines changed

on_policy_data_gen/decode.py

@@ -30,7 +30,7 @@
 
 train_dataset= load_dataset(data_dir, split='train_prefs')
 
-prompts = list(set(train_dataset['prompt']))
+prompts = sorted(list(set(train_dataset['prompt'])))
 
 conversations = [tokenizer.apply_chat_template([{'role': 'user', 'content': prompt}], tokenize=False, add_generation_prompt=True) for prompt in prompts]
 
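
The commit message does not spell out the failure mode, so the following is a minimal sketch of the likely motivation rather than the author's own reproduction. Python randomizes `str` hashes per process by default, so the iteration order of `set(...)` over strings can differ from one run of `decode.py` to the next; `sorted(...)` removes that dependence. The prompt strings and the helper `order_in_fresh_interpreter` are hypothetical and exist only for illustration. That separate runs of the script need matching prompt order is an assumption about how the outputs are consumed.

```python
import os
import subprocess
import sys

# Hypothetical prompts standing in for train_dataset['prompt'] (note the duplicate).
PROMPTS = [
    "What is DPO?",
    "Explain RLHF.",
    "Summarize this text.",
    "Write a haiku.",
    "What is DPO?",
]


def order_in_fresh_interpreter(expr: str, seed: str) -> str:
    """Evaluate `expr` over PROMPTS in a new Python process with the given hash seed."""
    code = f"prompts = {PROMPTS!r}; print({expr})"
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Each seed mimics a separate launch of the script with a different hash randomization.
seeds = [str(s) for s in range(8)]
unsorted_orders = {order_in_fresh_interpreter("list(set(prompts))", s) for s in seeds}
sorted_orders = {order_in_fresh_interpreter("sorted(list(set(prompts)))", s) for s in seeds}

print("distinct orders without sorted():", len(unsorted_orders))
print("distinct orders with sorted():", len(sorted_orders))
```

With `sorted(...)`, every process produces the same prompt order, so index `i` refers to the same prompt across runs. Without it, the number of distinct orders depends on the hash seeds and is typically greater than one.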
