    Repositories list

    • salmon

      Public
      The official code for the SALMon🍣 benchmark (ICASSP 2025 - Oral)
      Python
      14810 Updated Aug 15, 2025
    • WhiStress

      Public
      The official repo of "WhiStress: Enriching Transcriptions with Sentence Stress Detection" (Interspeech 2025)
      Python
      113200 Updated Jul 24, 2025
    • The official repo of the paper "StressTest: Can YOUR Speech LM Handle the Stress?"
      Python
      02000 Updated Jul 9, 2025
    • PAST

      Public
      Python
      64510 Updated Jul 7, 2025
    • HebTTS

      Public
      The official implementation of "A Language Modeling Approach to Diacritic-Free Hebrew TTS"
      Python
      1410360 Updated Jun 12, 2025
    • slamkit

      Public
      SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on One GPU in a Day"
      Python
      1322831 Updated May 18, 2025
    • The official repo of the COLM 2024 paper: The Larger the Better? Improved LLM Code-Generation via Budget Reallocation
      0800 Updated May 7, 2025
    • aero

      Public
      This repo contains the official PyTorch implementation of "Audio Super Resolution in the Spectral Domain" (ICASSP 2023)
      Python
      34235142 Updated May 1, 2025
    • This repo is a fork, containing the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
      Python
      15100 Updated Sep 28, 2023
    • A spoken version of the textual story cloze benchmark
      12010 Updated Aug 6, 2023
    • This repo is a fork from the official PyTorch implementation of "AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation" (Interspeech 2023)
      Python
      6500 Updated Jun 25, 2023
    • DISSC

      Public
      This is a fork from the official repository of "Speaking Style Conversion With Discrete Self-Supervised Units"
      Python
      9100 Updated Jan 20, 2023
    • im2wav

      Public
      This is a fork from the official implementation of the pipeline presented in "I hear your true colors: Image Guided Audio Generation" (ICASSP 2023)
      Python
      15000 Updated Jan 18, 2023
    • This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language Modeling" (ICASSP 2023)
      Python
      22000 Updated Jan 3, 2023
    • SC-PhASE

      Public
      This repo contains the official PyTorch implementation of "A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement" (Interspeech 2022)
      Python
      22800 Updated Aug 8, 2022
    • DSVAE-NES

      Public
      This is a fork from the official PyTorch implementation of the paper: "Learning Discrete Structured VAE using NES" (ICLR 2022)
      Python
      4000 Updated May 3, 2022