config the vLLM engineArgs in config.pbtxt #63
Conversation
* Initial Commit
* Mount model repo so changes reflect, parameter tweaking, README file
* Image name error
* Incorporating review comments. Separate docker and model repo builds, add README, restructure repo
* Tutorial restructuring. Using static model configurations
* Bump triton container and update README
* Remove client script
* Incorporating review comments
* Modify WIP line in vLLM tutorial
* Remove trust_remote_code parameter from falcon model
* Removing Mistral
* Incorporating Feedback
* Change input/output names
* Pre-commit format
* Different perf_analyzer example, config file format fixes
* Deep dive changes to Triton tools section
* Remove unused variable
Added Llama2 tutorial for TensorRT-LLM backend
Updated vLLM tutorial's README to use vllm container (…ference-server#65)
Co-authored-by: dyastremsky <[email protected]>
Hi @activezhao, may I kindly ask you to rebase your PR on top of the main branch and send us a CLA: https://github.com/triton-inference-server/server/blob/main/CONTRIBUTING.md#contributor-license-agreement-cla
Hi @oandreeva-nv OK, I will do it. But I find that the structure of Quick_Deploy/vLLM has changed a lot; will this PR still be OK?
# Conflicts:
#	Quick_Deploy/vLLM/config.pbtxt
Hi @oandreeva-nv Because the structure has changed a lot, I have created a new PR. Could you please close this PR and do the CR in the new PR? Thanks.
Closing this PR per @activezhao's request
Previously, we got the vLLM EngineArgs from vllm_engine_args.json; now we can get them from config.pbtxt.
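For reference, a minimal sketch of what this could look like, with engine arguments supplied through the parameters field of config.pbtxt. The specific keys shown here (model, disable_log_requests, gpu_memory_utilization) are illustrative assumptions, not necessarily the exact set used in this PR:

    # Illustrative sketch: vLLM EngineArgs passed as config.pbtxt parameters
    # (keys and values below are example assumptions)
    parameters: {
      key: "model"
      value: { string_value: "facebook/opt-125m" }
    }
    parameters: {
      key: "disable_log_requests"
      value: { string_value: "true" }
    }
    parameters: {
      key: "gpu_memory_utilization"
      value: { string_value: "0.5" }
    }

The Python backend model can then read these parameters from the model configuration passed to initialize() (for example, via json.loads(args["model_config"])["parameters"]) and build vLLM's EngineArgs from them, removing the need for a separate vllm_engine_args.json file.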