    Repositories list

    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      17k17038 Updated May 30, 2026
    • Tool to scrape benchmarks used most commonly in recent popular open source models
      Python
      MIT License
      1000 Updated May 29, 2026
    • A framework for efficient model inference with omni-modality models
      Python
      Apache License 2.0
      1k203 Updated May 28, 2026
    • TokenSpeed is a speed-of-light LLM inference engine.
      Python
      MIT License
      133001 Updated May 28, 2026
    • A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
      Python
      Apache License 2.0
      94000 Updated May 26, 2026
    • 1214 Updated May 25, 2026
    • axolotl

      Public
      Go ahead and axolotl questions
      Python
      Apache License 2.0
      1.4k006 Updated May 24, 2026
    • Go
      Apache License 2.0
      0001 Updated May 21, 2026
    • Every Eval Ever is a shared schema and crowdsourced eval database. It defines a standardized metadata format for storing AI evaluation results — from leaderboar…
      Python
      MIT License
      350014 Updated May 19, 2026
    • A framework for few-shot evaluation of language models.
      Python
      MIT License
      3.3k601 Updated May 15, 2026
    • A Python library for guardrail models evaluation with vLLM support.
      Python
      European Union Public License 1.2
      110016 Updated May 11, 2026
    • gorilla

      Public
      Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
      Python
      Apache License 2.0
      1.4k004 Updated May 9, 2026
    • research

      Public
      Repository to enable research flows
      Python
      0303 Updated May 6, 2026
    • Neural Magic GHA
      Python
      Apache License 2.0
      0005 Updated Apr 20, 2026
    • lighteval

      Public
      Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
      Python
      MIT License
      475001 Updated Apr 18, 2026
    • Fast and memory-efficient exact attention
      C++
      BSD 3-Clause "New" or "Revised" License
      2.8k000 Updated Apr 16, 2026
    • 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference an…
      Python
      Apache License 2.0
      33k000 Updated Apr 15, 2026
    • Beam search scheduler plugin for vLLM v1 with CoW block table forking
      Python
      0000 Updated Apr 9, 2026
    • SWE-bench

      Public
      SWE-bench: Can Language Models Resolve Real-world Github Issues?
      Python
      MIT License
      874000 Updated Apr 8, 2026
    • sglang

      Public
      SGLang is a fast serving framework for large language models and vision language models.
      Python
      Apache License 2.0
      6.2k103 Updated Apr 8, 2026
    • lmms-eval

      Public
      Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
      Python
      Other
      5940012 Updated Mar 12, 2026
    • DeepEP

      Public
      DeepEP: an efficient expert-parallel communication library
      Cuda
      MIT License
      1.3k100 Updated Mar 11, 2026
    • The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-b…
      Python
      MIT License
      649000 Updated Mar 10, 2026
    • vllm-fork

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      17k001 Updated Mar 5, 2026
    • TPU inference for vLLM, with unified JAX and PyTorch support.
      Python
      Apache License 2.0
      201000 Updated Mar 5, 2026
    • Arena-Hard-Auto: An automatic LLM benchmark.
      Python
      Apache License 2.0
      152003 Updated Mar 3, 2026
    • pytorch

      Public
      Tensors and Dynamic neural networks in Python with strong GPU acceleration
      Python
      Other
      28k106 Updated Feb 11, 2026
    • nm-vllm

      Public archive
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Other
      17k26600 Updated Dec 4, 2025
    • Python
      1001 Updated Nov 13, 2025
    • Open Data Hub operator to manage ODH component integrations
      Go
      Apache License 2.0
      271000 Updated Nov 12, 2025