Pre-check
I have searched the existing issues and none cover this bug.
Description
I built llama-cpp as described for GPU support and also ran the (slightly modified) command: "MAKE_ARGS='-DGGML_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.89 numpy==1.26.0 MarkupSafe==2.0.1".
The server starts, but it still shows BLAS=0, and nvidia-smi shows no GPU usage.
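For reference, the llama-cpp-python documentation passes build flags through CMAKE_ARGS rather than MAKE_ARGS, and llama.cpp builds of this vintage use GGML_CUDA rather than GGML_CUBLAS, so a variant worth trying (a sketch based on the upstream README, not verified against this project's setup) is:

    CMAKE_ARGS="-DGGML_CUDA=on" poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.89 numpy==1.26.0 MarkupSafe==2.0.1

With MAKE_ARGS, the flag is likely never reaching the build at all, which would explain a CPU-only wheel.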
I have a freshly installed CUDA 12.6 with the correct driver, and nvcc reports no problems.
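For completeness, the standard verification commands are nvcc --version (toolkit version) and nvidia-smi (driver version and visible GPUs):

    nvcc --version
    nvidia-smi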
I installed all the packages in a dedicated Python 3.11 venv, but it still does not work.
I am starting the server with "PGPT_PROFILES=local make run", but I'm not sure whether this is correct, since the documentation seems a bit ambiguous.
Steps to Reproduce
Build llama-cpp with CUDA support
Run MAKE_ARGS='-DGGML_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.89 numpy==1.26.0 MarkupSafe==2.0.1 in a Python 3.11 virtual environment
Expected Behavior
After starting the server, it should report BLAS=1 and the GPU should be used
Actual Behavior
The server prints BLAS=0 and there is no GPU usage
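As an additional data point, whether the installed wheel was actually compiled with GPU offload can be queried from Python (assuming llama_supports_gpu_offload is exposed in this release, as it is in recent llama-cpp-python versions):

    python -c "import llama_cpp; print(llama_cpp.llama_supports_gpu_offload())"

If this prints False, the wheel was built without CUDA support, which would match the BLAS=0 output.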
Environment
Ubuntu 20.04 with a Python 3.11 venv
Additional Information
No response
Version
0.6.2
Setup Checklist
Confirm that you have followed the installation instructions in the project’s documentation.
Check that you are using the latest version of the project.
Verify disk space availability for model storage and data processing.
Ensure that you have the necessary permissions to run the project.
NVIDIA GPU Setup Checklist
Check that all CUDA dependencies are installed and are compatible with your GPU (refer to CUDA's documentation)
Ensure an NVIDIA GPU is installed and recognized by the system (run nvidia-smi to verify).
Ensure proper permissions are set for accessing GPU resources (see the quick check after this list).
Docker users - Verify that the NVIDIA Container Toolkit is configured correctly (e.g. run sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi)
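A quick sketch for the permissions item above (generic Linux checks, not project-specific): the NVIDIA device nodes should be readable and writable by the current user, and on some setups access is granted through the video group:

    ls -l /dev/nvidia*
    groups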