Conversation

yanghaojin and others added 30 commits April 5, 2024 23:13
Added ppl and few-shot evaluation scripts.
@darrenearl
Collaborator Author

About merging the vLLM eval, I need some help.

I have some questions about the code conventions, such as API names and some parameters, as well as which features need to be improved. Could you give me some suggestions?

Contributor

@Jopyth Jopyth left a comment

We probably need to rethink parts of the installation process as a whole, but in the meantime I think we should add some brief documentation on how to install when vLLM is to be used. (Install the corresponding requirements.txt files after installing green-bit-llm?) Maybe add a README to third_party/vllm, or add a section about this to the main README.

Please also always add a newline to the end of the file.

```
python -m green_bit_llm.evaluation.evaluate --model GreenBitAI/Qwen-1.5-4B-layer-mix-bpw-3.0 --trust-remote-code --eval-ppl --ppl-tasks wikitext2,c4,ptb
python -m green_bit_llm.evaluation.evaluate --model GreenBitAI/Qwen-1.5-4B-layer-mix-bpw-3.0 --backend greenbit-engine --trust-remote-code --eval-ppl --ppl-tasks wikitext2,c4,ptb
```
or
Contributor
I think some explanation is needed. We could add some information here on the significance of this choice, e.g. what vLLM is, why/when to use it, or a link. Or all of those.


```python
from lm_eval import evaluator

from vllm.model_executor.layers.logits_processor import _apply_logits_processors
```
Contributor
I assume there is no "safe" alternative to using this internal method, right? If so, I think we can use it.

```python
lm_head, hidden_states, sampling_metadata, *embedding_bias = input
embedding_bias = embedding_bias[0] if embedding_bias else None
logits = module._get_logits(hidden_states, lm_head, embedding_bias)
if logits is not None:
```
Contributor
I think we should use a guard clause here.
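
For illustration, a minimal sketch of what the guard clause might look like, assuming the quoted lines sit inside a hook-style function that can return early. The function name and signature are assumptions; the body reuses the names from the diff above.

```python
# Sketch only: the surrounding function is assumed, not the actual patch.
def logits_hook(module, input):
    lm_head, hidden_states, sampling_metadata, *embedding_bias = input
    embedding_bias = embedding_bias[0] if embedding_bias else None
    logits = module._get_logits(hidden_states, lm_head, embedding_bias)
    if logits is None:
        return None  # guard clause: bail out early instead of nesting the rest
    # ... continue working with the non-None logits here ...
    return logits
```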

Comment on lines +314 to +316
```python
shift_labels = testenc[:, (i * seqlen): ((i + 1) * seqlen)][
:, 1:
].to(device)
```
Contributor
Please adjust the styling (see https://peps.python.org/pep-0008/#indentation or follow black style).
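
For illustration, one way to write the slice that satisfies PEP 8 indentation, using the names from the quoted lines; exact black output may differ slightly (e.g. in slice spacing).

```python
# The whole expression fits on one line, so no continuation indent is needed.
shift_labels = testenc[:, (i * seqlen):((i + 1) * seqlen)][:, 1:].to(device)
```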

```python
DEFAULT_SEQLEN = 2048
DEFAULT_RANDOM_SEED = 0
DTYPE = torch.half
DEFAULT_MODEL_BCKEND = ["vllm", "greenbit-engine"]
```
Contributor
The name seems like it should be AVAILABLE_MODEL_BACKENDS. It could be expected that the first option is the default, or we could also declare the default model backend separately.
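
For illustration, a minimal sketch of the suggested rename with a separately declared default; the names here are the reviewer's proposal, not the merged code.

```python
AVAILABLE_MODEL_BACKENDS = ["vllm", "greenbit-engine"]
DEFAULT_MODEL_BACKEND = AVAILABLE_MODEL_BACKENDS[0]  # or declare it explicitly
```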

```python
class GBLLMInferenceBackend(BaseInferenceBackend):
    def __init__(self, model_path, **kwargs):
        # Building configs
        tokenizer_config = {"trust_remote_code": True if kwargs.get("trust_remote_code") else None}
```
Contributor
I think we should use kwargs.get("trust_remote_code", None) to set the default explicitly. The same applies below for this and other arguments.
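
For illustration, a minimal sketch of the suggestion applied to the quoted line, keeping the original True/None mapping and only making the default explicit.

```python
# Sketch only: same logic as the quoted diff, with an explicit default.
tokenizer_config = {
    "trust_remote_code": True if kwargs.get("trust_remote_code", None) else None
}
```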
