
NVIDIA Corporation

Pinned

  1. cuopt Public

    NVIDIA cuOpt is an open-source GPU-accelerated optimization engine delivering near real-time solutions for complex decision-making challenges.

    Cuda · 315 stars · 49 forks

  2. cuopt-examples Public

    NVIDIA cuOpt examples for decision optimization

    Jupyter Notebook · 343 stars · 47 forks

  3. open-gpu-kernel-modules Public

    NVIDIA Linux open GPU kernel module source

    C · 16k stars · 1.4k forks

  4. aistore Public

    AIStore: scalable storage for AI applications

    Go · 1.6k stars · 214 forks

  5. nvidia-container-toolkit Public

    Build and run containers leveraging NVIDIA GPUs

    Go · 3.5k stars · 379 forks

  6. GenerativeAIExamples Public

    Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.

    Jupyter Notebook · 3.3k stars · 790 forks

Repositories

Showing 10 of 584 repositories
  • aistore Public

    AIStore: scalable storage for AI applications

    Go · 1,563 stars · MIT license · 214 forks · 0 open issues · 1 open PR · Updated Jul 24, 2025
  • cuda-quantum Public

    C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows

    C++ · 750 stars · 264 forks · 372 open issues (18 need help) · 70 open PRs · Updated Jul 24, 2025
  • Fuser Public

    A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

    C++ · 341 stars · 61 forks · 198 open issues (11 need help) · 171 open PRs · Updated Jul 24, 2025
  • stdexec Public

    `std::execution`, the proposed C++ framework for asynchronous and parallel programming.

    C++ · 1,961 stars · Apache-2.0 license · 193 forks · 102 open issues · 8 open PRs · Updated Jul 24, 2025
  • NeMo Public

    A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal AI, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

    Python · 15,184 stars · Apache-2.0 license · 3,008 forks · 55 open issues · 110 open PRs · Updated Jul 24, 2025
  • cuCollections Public
    C++ · 555 stars · Apache-2.0 license · 97 forks · 58 open issues (1 needs help) · 23 open PRs · Updated Jul 24, 2025
  • cccl Public

    CUDA Core Compute Libraries

    C++ · 1,801 stars · 242 forks · 1,008 open issues (6 need help) · 160 open PRs · Updated Jul 24, 2025
  • TensorRT-LLM Public

    TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

    C++ · 11,105 stars · Apache-2.0 license · 1,612 forks · 702 open issues · 348 open PRs · Updated Jul 24, 2025
  • warp Public

    A Python framework for accelerated simulation, data generation, and spatial computing.

    Python · 5,332 stars · Apache-2.0 license · 337 forks · 223 open issues · 8 open PRs · Updated Jul 24, 2025
  • TransformerEngine Public

    A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit floating point (FP8) precision on Hopper, Ada, and Blackwell GPUs, providing better performance with lower memory utilization in both training and inference.

    Python · 2,571 stars · Apache-2.0 license · 460 forks · 204 open issues · 70 open PRs · Updated Jul 24, 2025