CV-CUDA is an open-source library of GPU-accelerated computer vision algorithms designed for speed and scalability. It delivers high-throughput, low-latency image and video processing for AI pipelines across NVIDIA cloud, desktop, and edge platforms, and it integrates seamlessly with C/C++ and Python image and AI frameworks.
For more information on available operators, API documentation, and getting started guides, refer to our online documentation.
The following example shows fully GPU-accelerated image decoding and resizing with nvImageCodec and CV-CUDA:
```python
import cvcuda
from nvidia import nvimgcodec

# Decode image directly to GPU
decoder = nvimgcodec.Decoder()
image = decoder.read("input.jpg")

# Convert to CV-CUDA tensor and process
cvcuda_tensor = cvcuda.as_tensor(image, "HWC")
resized = cvcuda.resize(cvcuda_tensor, (224, 224, 3), cvcuda.Interp.LINEAR)
```

CV-CUDA can be installed from pre-built packages (Python wheels, Debian packages, or tar archives) or built from source. We provide pre-built Python wheels on pypi.org for a variety of Python versions (3.9 to 3.14) and Linux-based platforms (x86_64 and aarch64). See cvcuda-cu12 and cvcuda-cu13 for CUDA 12 and CUDA 13, respectively.
| CUDA Version | Installation Command |
|---|---|
| CUDA 12 | pip install cvcuda-cu12 |
| CUDA 13 | pip install cvcuda-cu13 |
See Installation for complete installation instructions including building from source, installing Debian packages, and tar archives.
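After installing one of the wheels, a short smoke test can confirm that CV-CUDA sees the GPU and interoperates with frameworks that expose `__cuda_array_interface__`. The sketch below assumes PyTorch is available purely as a convenient source of GPU memory; it is not a CV-CUDA dependency, and only `cvcuda.as_tensor`, `cvcuda.resize`, and `cvcuda.Interp.LINEAR` are taken from the quick-start example above.

```python
# Minimal post-install smoke test (sketch); PyTorch is assumed only as a way
# to allocate GPU memory and is not required by CV-CUDA itself.
import cvcuda
import torch

# A synthetic 720p RGB frame already resident on the GPU
frame = torch.randint(0, 256, (720, 1280, 3), dtype=torch.uint8, device="cuda")

# Wrap the GPU buffer as a CV-CUDA tensor (zero-copy) and resize it on the GPU
tensor = cvcuda.as_tensor(frame, "HWC")
thumb = cvcuda.resize(tensor, (224, 224, 3), cvcuda.Interp.LINEAR)

print(thumb.shape)  # expected: (224, 224, 3)
```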
| CV-CUDA Build | Platform | CUDA Version | CUDA Compute Capability | Hardware Architectures | NVIDIA Driver | Python Versions | Supported Compilers (build from source and API compatibility) | API compatibility with prebuilt binaries | OS/Linux distributions tested with prebuilt packages |
|---|---|---|---|---|---|---|---|---|---|
| x86_64_cu12 | x86_64 | ≥12.2 | ≥SM7.5 | Turing, Ampere, Ada Lovelace, Hopper, Blackwell | ≥r525** | 3.9 - 3.14 | gcc≥10* | gcc≥10, clang≥11 | ManyLinux2014-compliant, Ubuntu≥22.04, WSL2/Ubuntu≥22.04 |
| x86_64_cu13 | x86_64 | ≥13.0 | ≥SM7.5 | Turing, Ampere, Ada Lovelace, Hopper, Blackwell | ≥r580** | 3.9 - 3.14 | gcc≥10* | gcc≥10, clang≥11 | ManyLinux2014-compliant, Ubuntu≥22.04, WSL2/Ubuntu≥22.04 |
| aarch64_cu12 | aarch64 SBSA*** | ≥12.2 | ≥SM7.5 | ARM SBSA (incl. Grace): Volta, Turing, Ampere, Ada Lovelace, Hopper, Blackwell | ≥r525** | 3.9 - 3.14 | gcc≥10* | gcc≥10, clang≥11 | ManyLinux2014-compliant, Ubuntu≥22.04 |
| aarch64_cu12 | aarch64 Jetson*** | 12.2 | ≥SM7.5 | Jetson AGX Orin, IGX Orin + Ampere RTX6000, IGX Orin + Ada RTX6000 | JetPack 6.0 DP, r535 (IGX OS v0.6) | 3.10 | gcc≥10* | gcc≥10, clang≥11 | Jetson Linux 36.2, IGX OS v0.6 |
| aarch64_cu13 | aarch64 SBSA and Jetson Thor*** | ≥13.0 | ≥SM7.5 | ARM SBSA (incl. Grace): Volta, Turing, Ampere, Ada Lovelace, Hopper, Blackwell; Jetson Thor | ≥r580** | 3.9 - 3.14 | gcc≥10* | gcc≥10, clang≥11 | ManyLinux2014-compliant, Ubuntu≥22.04 |
* The test module builds with partial coverage; gcc≥11 is needed for full coverage (see Known Limitations).
** Samples require driver ≥r535 to run. CUDA 13 requires driver ≥r580.
*** Starting with v0.14, aarch64 packages (deb, tar.xz, or wheels) distributed on GitHub (release "assets") or PyPI are SBSA-compatible unless noted otherwise. Jetson 6 builds (deb, tar.xz, whl) can be found in explicitly named "Jetson" archives in the GitHub release assets. Packages marked 'aarch64_cu13' are built with the unified CUDA toolkit and are compatible with both server-class and embedded platforms (Jetson Thor).
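If you are unsure whether a GPU meets the SM7.5 floor in the table above, its compute capability can be queried quickly. The snippet below uses PyTorch only because it is commonly installed alongside CV-CUDA; the deviceQuery CUDA sample reports the same value.

```python
# Check the local GPU against the compatibility matrix above.
# PyTorch is an assumption of this sketch, not a CV-CUDA requirement.
import torch

major, minor = torch.cuda.get_device_capability(0)
print(f"Compute capability: SM{major}.{minor}")  # prebuilt packages require >= SM7.5
```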
- CV-CUDA does not currently support native Windows; only WSL2 is supported.
- Starting with v0.16, CV-CUDA drops official support for CUDA 11, SM7 (Volta), Ubuntu 20.04, and Python 3.8.
- Starting with v0.14, aarch64 packages (deb, tar.xz, or wheels) distributed on GitHub (release "assets") and PyPI are the SBSA-compatible ones. Jetson builds (deb, tar.xz, whl) can be found in explicitly named "Jetson" archives in the GitHub release assets.
- The C++ test module builds with gcc≥10, but only with partial coverage. Full coverage requires gcc≥11 with full C++20 support (NTTP).
- CV-CUDA Samples require driver ≥r535 to run and are only officially supported with CUDA 12.
- Only one CUDA version (CUDA 12.x or CUDA 13.x) of CV-CUDA packages (Debian packages, tarballs, Python Wheels) can be installed at a time. Please uninstall all packages from a given CUDA version before installing packages from a different version.
- The Resize and RandomResizedCrop operators incorrectly interpolate pixel values near the boundary of an image or tensor when using cubic interpolation. This will be fixed in an upcoming release.
- The OSD operator's text rendering functionality has known issues on Jetson/aarch64 platforms; this will be fixed in an upcoming release.
CV-CUDA is an open source project. As part of the Open Source Community, we are committed to the cycle of learning, improving, and updating that makes this community thrive. However, CV-CUDA is not yet ready for external contributions.
To report a bug, request a new feature, or ask a general question, please file a GitHub issue.
To understand our commitment to the Open Source Community and to providing an environment that both supports and respects the efforts of all contributors, please read our Code of Conduct.
CV-CUDA operates under the Apache-2.0 license.
CV-CUDA, as an NVIDIA program, is committed to secure development practices. Please read our Security page to learn more.
CV-CUDA originated as a collaboration between NVIDIA and ByteDance.
- CV-CUDA Online Documentation
- Optimizing Microsoft Bing Visual Search with NVIDIA Accelerated Libraries
- Accelerating AI Pipelines: Boosting Visual Search Efficiency, GTC 2025
- Optimize Short-Form Video Processing Toward the Speed of Light, GTC 2025
- Increasing Throughput and Reducing Costs for AI-Based Computer Vision with CV-CUDA
- NVIDIA Announces Microsoft, Tencent, Baidu Adopting CV-CUDA for Computer Vision AI
- CV-CUDA helps Tencent Cloud audio and video PaaS platform achieve full-process GPU acceleration for video enhancement AI