    Repositories list

    • data-juicer

      Public
      Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
      Python
      3135.7k4222 Updated Jan 8, 2026
    • data-juicer-agents

      Public
      🤖 Your Intelligent Copilot for Data Exploration and Processing Pipeline
      Python
      3500 Updated Jan 8, 2026
    • data-juicer-sphinx

      Public
      1000 Updated Jan 6, 2026
    • data-juicer-sandbox

      Public
      A Feedback-Driven Suite for Multimodal Data-Model Co-development.
      Python
      2400 Updated Dec 16, 2025
    • data-juicer-hub

      Public
      Community-driven data-juicer recipes and best practices for various pre-training/fine-tuning tasks.
      2300 Updated Dec 15, 2025
    • transformers-stream-generator

      Public
      A text generation method that returns a generator, streaming out each token in real time during inference, based on Huggingface/Transformers. Self-modified version.
      Python
      19000 Updated Nov 15, 2025
    • recognize-anything

      Public
      Open-source and strong foundation image recognition models. Self-modified version.
      Jupyter Notebook
      316000 Updated Nov 15, 2025
    • .github

      Public
      0000 Updated Nov 5, 2025