Vllm2 #27
base: main
Conversation
… to be optimized.
Feature/chat cli
Added ppl and few-shot evaluation scripts.
Feature/evaluation
…eenBitAI/green-bit-llm into feature/langchain_integration
…eenBitAI/green-bit-llm into feature/langchain_integration
…eenBitAI/green-bit-llm into feature/langchain_integration
Feature/langchain integration
I have some questions about the code conventions, such as API names and some parameters, and about which feature points need to be improved. Can you give me some suggestions?
We probably need to rethink some parts of the installation process as a whole, but in the meantime I think we should add some brief documentation about the installation process if vLLM is to be used. (Install the corresponding requirements.txt files after installation of green-bit-llm?) Maybe add a README to third_party/vllm, or add a section about this to the main README.
Please also always add a newline to the end of the file.
```
python -m green_bit_llm.evaluation.evaluate --model GreenBitAI/Qwen-1.5-4B-layer-mix-bpw-3.0 --trust-remote-code --eval-ppl --ppl-tasks wikitext2,c4,ptb
python -m green_bit_llm.evaluation.evaluate --model GreenBitAI/Qwen-1.5-4B-layer-mix-bpw-3.0 --backend greenbit-engine --trust-remote-code --eval-ppl --ppl-tasks wikitext2,c4,ptb
```
or
I think some explanation is needed. We could add some information here on the significance of this choice, i.e. what vLLM is, why/when to use it, or some link. Or all of those.
```python
from lm_eval import evaluator
```
```python
from vllm.model_executor.layers.logits_processor import _apply_logits_processors
```
I assume there is no "safe" alternative to using this internal method, right? If so, I think we can use it.
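If it helps, a minimal defensive sketch (purely illustrative, not part of the PR): guard the private import so a vLLM version bump that removes or renames the helper fails loudly with a clear message instead of an obscure error at call time.

```python
# Sketch only: fail fast if the installed vLLM no longer exposes the private helper.
try:
    from vllm.model_executor.layers.logits_processor import _apply_logits_processors
except ImportError as err:
    raise ImportError(
        "This backend depends on vLLM's private _apply_logits_processors helper, "
        "which the installed vLLM version does not expose."
    ) from err
```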
```python
lm_head, hidden_states, sampling_metadata, *embedding_bias = input
embedding_bias = embedding_bias[0] if embedding_bias else None
logits = module._get_logits(hidden_states, lm_head, embedding_bias)
if logits is not None:
```
I think we should use a guard clause here.
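A minimal sketch of the guard-clause shape, assuming the snippet sits in a helper roughly like the one below (the wrapper name and the trailing processing are placeholders, not the PR's actual code):

```python
def compute_logits(module, lm_head, hidden_states, embedding_bias):
    # Hypothetical wrapper; only the control flow is the point here.
    logits = module._get_logits(hidden_states, lm_head, embedding_bias)
    if logits is None:
        # Guard clause: return early so the main path stays unindented.
        return None
    # ... continue with logits post-processing at the top indentation level ...
    return logits
```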
```python
shift_labels = testenc[:, (i * seqlen): ((i + 1) * seqlen)][
    :, 1:
].to(device)
```
Please adjust the styling (see https://peps.python.org/pep-0008/#indentation or follow black style).
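For reference, roughly the layout black/PEP 8 would produce for that slice (wrapped in a hypothetical helper so the snippet is self-contained; behavior unchanged):

```python
def get_shift_labels(testenc, i, seqlen, device):
    # Same expression as above, reflowed onto one line with slice-colon spacing.
    return testenc[:, (i * seqlen) : ((i + 1) * seqlen)][:, 1:].to(device)
```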
```python
DEFAULT_SEQLEN = 2048
DEFAULT_RANDOM_SEED = 0
DTYPE = torch.half
DEFAULT_MODEL_BCKEND = ["vllm", "greenbit-engine"]
```
The name suggests it should be AVAILABLE_MODEL_BACKENDS. It could be expected that the first option is the default, or we could declare the default model backend separately.
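For example, something along these lines (a sketch of the suggestion, keeping the existing values):

```python
# Suggested naming: the list holds the options, the default is declared explicitly.
AVAILABLE_MODEL_BACKENDS = ["vllm", "greenbit-engine"]
DEFAULT_MODEL_BACKEND = AVAILABLE_MODEL_BACKENDS[0]  # or spell it out: "vllm"
```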
```python
class GBLLMInferenceBackend(BaseInferenceBackend):
    def __init__(self, model_path, **kwargs):
        # Building configs
        tokenizer_config = {"trust_remote_code": True if kwargs.get("trust_remote_code") else None}
```
I think we should use kwargs.get("trust_remote_code", None) to set a default. Similarly below for the same and other arguments.
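A sketch of the suggested pattern (hypothetical helper; the names are taken from the diff above):

```python
def build_tokenizer_config(**kwargs):
    # Explicit default via kwargs.get, as suggested above.
    return {"trust_remote_code": kwargs.get("trust_remote_code", None)}
```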
About merging the vLLM eval, I need some help.