Pre-check
I have searched the existing issues and none cover this bug.
Description
I built llama-cpp as described for GPU support and also ran the (slightly modified) command: "MAKE_ARGS='-DGGML_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.89 numpy==1.26.0 MarkupSafe==2.0.1".
The server starts, but it still shows BLAS=0, and nvidia-smi shows no GPU usage.
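For reference, the llama-cpp-python documentation passes build flags through CMAKE_ARGS rather than MAKE_ARGS, and llama.cpp builds of this vintage use GGML_CUDA rather than GGML_CUBLAS, so a variant worth trying (a sketch based on the upstream README, not verified against this project's setup) is:

    CMAKE_ARGS="-DGGML_CUDA=on" poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.89 numpy==1.26.0 MarkupSafe==2.0.1

With MAKE_ARGS, the flag is likely never reaching the build at all, which would explain a CPU-only wheel.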
I have a freshly installed CUDA 12.6 with the correct driver, and nvcc reports no problems.
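For completeness, the standard verification commands are nvcc --version (toolkit version) and nvidia-smi (driver version and visible GPUs):

    nvcc --version
    nvidia-smi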
I installed all the packages in a dedicated Python 3.11 venv, but it still does not work.
I am starting the server with "PGPT_PROFILES=local make run", but I'm not sure whether this is correct, since the documentation seems a bit ambiguous.
Steps to Reproduce
Build llama-cpp with CUDA support
Run MAKE_ARGS='-DGGML_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.89 numpy==1.26.0 MarkupSafe==2.0.1 in a Python 3.11 virtual environment
Expected Behavior
After starting the server, it should report BLAS=1 and the GPU should be used
Actual Behavior
The server prints BLAS=0 and there is no GPU usage
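As an additional data point, whether the installed wheel was actually compiled with GPU offload can be queried from Python (assuming llama_supports_gpu_offload is exposed in this release, as it is in recent llama-cpp-python versions):

    python -c "import llama_cpp; print(llama_cpp.llama_supports_gpu_offload())"

If this prints False, the wheel was built without CUDA support, which would match the BLAS=0 output.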
Environment
Ubuntu 20.04 with a Python 3.11 venv
Additional Information
No response
Version
0.6.2
Setup Checklist
Confirm that you have followed the installation instructions in the project’s documentation.
Check that you are using the latest version of the project.
Verify disk space availability for model storage and data processing.
Ensure that you have the necessary permissions to run the project.
NVIDIA GPU Setup Checklist
Check that all CUDA dependencies are installed and are compatible with your GPU (refer to CUDA's documentation)
Ensure an NVIDIA GPU is installed and recognized by the system (run nvidia-smi to verify).
Ensure proper permissions are set for accessing GPU resources (see the quick check after this list).
Docker users - Verify that the NVIDIA Container Toolkit is configured correctly (e.g. run sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi)
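A quick sketch for the permissions item above (generic Linux checks, not project-specific): the NVIDIA device nodes should be readable and writable by the current user, and on some setups access is granted through the video group:

    ls -l /dev/nvidia*
    groups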