
Conversation

@qbc2016 (Collaborator) commented Apr 10, 2024

To assess the performance of our fine-tuned model, we use the ROUGE-L metric and run experiments with a large number of clients, with the Dolly-15K dataset as the training corpus.
The Dolly-15K dataset contains 15,015 data points spread across eight distinct tasks. For evaluation, we hold out the final task and use the remaining seven tasks for training.
Our experiments involve 200 clients, with the data partitioned via a Dirichlet distribution to emulate non-IID conditions across the client base.
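For reference, the sketch below shows one common way to build a Dirichlet non-IID split over task labels. It is illustrative only: the `dirichlet_partition` helper, the `alpha` value, and the seed are assumptions for this sketch, not the splitter actually shipped in FederatedScope.

```python
# Illustrative sketch of Dirichlet-based non-IID partitioning over tasks.
# All names and defaults here are hypothetical, not FederatedScope's own splitter.
import numpy as np

def dirichlet_partition(task_ids, num_clients=200, alpha=0.5, seed=0):
    """Assign sample indices to clients so that each task's samples are
    spread across clients according to a Dirichlet(alpha) distribution."""
    rng = np.random.default_rng(seed)
    task_ids = np.asarray(task_ids)
    client_indices = [[] for _ in range(num_clients)]
    for task in np.unique(task_ids):
        idx = rng.permutation(np.where(task_ids == task)[0])
        # Fraction of this task's samples that each client receives.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(proportions) * len(idx)).astype(int)[:-1]
        for client, shard in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(shard.tolist())
    return client_indices

# Example: partition a list of per-sample task labels across 200 clients.
# parts = dirichlet_partition(task_labels, num_clients=200, alpha=0.5)
```

Smaller `alpha` values produce more skewed (more non-IID) client datasets; larger values approach a uniform split.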

To run the evaluation:

python federatedscope/eval/eval_for_rougel/eval.py --cfg federatedscope/llm/baseline/xxx.yaml
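For readers unfamiliar with the metric, the snippet below is a minimal sketch of computing ROUGE-L F1 with the `rouge_score` package (`pip install rouge-score`). The `average_rouge_l` helper is hypothetical; the actual eval.py may aggregate scores differently.

```python
# Minimal ROUGE-L sketch using the rouge_score package; names are illustrative.
from rouge_score import rouge_scorer

def average_rouge_l(predictions, references):
    """Return the mean ROUGE-L F1 over paired predictions and references."""
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    scores = [
        scorer.score(ref, pred)["rougeL"].fmeasure
        for pred, ref in zip(predictions, references)
    ]
    return sum(scores) / len(scores)

# Example usage with dummy generations:
print(average_rouge_l(["the cat sat on the mat"], ["a cat sat on a mat"]))
```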

@qbc2016 qbc2016 requested a review from rayrayraykk April 10, 2024 02:39
@rayrayraykk rayrayraykk changed the title Add rougel for dolly Add rougel for cross-device evaluation Apr 10, 2024
@qbc2016 qbc2016 added the ready_for_review and FS-LLM (Federated learning in LLM) labels Apr 12, 2024