Pr-Whisper - Speech-to-Text for Emacs

A simple Emacs package that provides speech-to-text functionality using Whisper.cpp.

Record audio directly from Emacs and have it transcribed and inserted in current buffer at point. The package provides pr-whisper-mode, a global minor mode that starts audio recording and transcribes text when stopped. It also provides the pr-whisper-transcribe-file command to transcribe an already recorded WAV audio file.

I will eventually change this page in my fork. This is under construction.

More information on how to use this fork of the project is available in my PEL project inside the Writing Tools PDF. PEL can install the package automatically and creates bindings for the global commands.

Demo Video

Perfect Speech to Text in Emacs - See the package in action!

Features

Simple toggle interface: Single key (C-c .) to start/stop recording
Model selection: Choose from multiple Whisper models via customization
Vocabulary hints: Provide a custom vocabulary file to improve recognition of proper nouns and specialized terms (e.g., Greek names like Socrates, Alcibiades, Diotima)
Transcription history: Browse and re-insert previous transcriptions with M-x pr-whisper-insert-from-history
Automatic transcription using Whisper.cpp
Text insertion at cursor position
Async processing - Emacs remains responsive during transcription

Prerequisites

Before setting up this package, you need to install the following system dependencies:

1. Sox (for audio recording)

Ubuntu/Debian:

sudo apt install sox

macOS:

brew install sox

Arch Linux:

sudo pacman -S sox

2. Whisper.cpp

Clone and build Whisper.cpp:

# Clone the repository
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp

# Build the project
make

# Download models
# For fast mode (required)
bash ./models/download-ggml-model.sh base.en

# For accurate mode (required)
bash ./models/download-ggml-model.sh medium.en

Usage

This provides the pr-whisper-mode, a global minor mode that provides the ability to record audio and then process it with whisper.cpp and insert transcribed text in the current buffer at point.

⚠️ 🚧 🚧 🚧 Most of the text below is obsolete and needs to be updated.

Key Bindings

M-x pr-whisper-mode: Enable the global minor mode (does not start recording).
C-c .: Toggle recording on/off. When stopped, transcribes and inserts text at point.
M-x pr-whisper-mode: Disable the mode (stops any active recording).

Basic Workflow

Enable mode: M-x pr-whisper-mode
Start recording: Press C-c .
Stop and transcribe: Press C-c . again
Get results: The transcribed text is automatically inserted at your cursor position

Example

Open any text buffer in Emacs
Enable the mode: M-x pr-whisper-mode
Position your cursor where you want the transcribed text
Press C-c . to start recording
Speak into your microphone
Press C-c . when finished speaking
Wait a moment for transcription to complete
The text appears at your cursor position

Configuration

You can customize pr-whisper through Emacs' built-in customization interface or directly in your init.el.

Using Emacs Customize Interface

Run M-x customize-group RET pr-whisper RET to access all customization options:

pr-whisper-homedir: Directory where Whisper.cpp is installed (default: ~/whisper.cpp/)
pr-whisper-model: Which model to use by default
- ggml-base.en.bin - Fast mode (quick, good accuracy)
- ggml-medium.en.bin - Accurate mode (slower, better accuracy)
- Custom model filename
pr-whisper-vocabulary-file: Path to vocabulary hints file (default: ~/.emacs.d/whisper-vocabulary.txt)
pr-whisper-backend: Transcription backend - cli (default) or server
pr-whisper-server-port: Port for whisper-server (default: 8178)

Custom Configuration in init.el

;; Set custom Whisper.cpp installation directory
(setq pr-whisper-homedir "/usr/local/whisper.cpp/")

;; Choose default model (base.en or medium.en)
(setq pr-whisper-model "ggml-medium.en.bin")

;; Set custom vocabulary file location
(setq pr-whisper-vocabulary-file "~/Documents/vocabulary.txt")

;; Use server backend for faster transcription (~29% speedup)
;; Server starts during recording and warms up while you speak
(setq pr-whisper-backend 'server)

Custom Key Bindings

To change the toggle key binding, customize pr-whisper-key-for-toggle or set it before loading:

;; Use a different key binding for toggle
(setq pr-whisper-key-for-toggle (kbd "C-c s"))

Custom Vocabulary for Proper Nouns

To improve transcription accuracy for proper nouns, technical terms, or specialized vocabulary, create a vocabulary file at ~/.emacs.d/whisper-vocabulary.txt.

Example ~/.emacs.d/whisper-vocabulary.txt:

This transcription discusses classical Greek philosophy, including scholars and figures such as Thrasymachus, Socrates, Plato, Diotima, Alcibiades, and Phaedrus.

Custom vocabulary location:

(setq pr-whisper-vocabulary-file "~/Documents/vocabulary.txt")

For detailed guidance on vocabulary formats, tips, domain-specific examples, and managing multiple vocabularies, see VOCABULARY-GUIDE.md.

Troubleshooting

Common Issues

"sox: command not found"
- Install sox using your system package manager
"whisper-cli: command not found" or path errors
- Ensure Whisper.cpp is built and the path is correct
- Check that ~/whisper.cpp/build/bin/whisper-cli exists
- For server backend, check that ~/whisper.cpp/build/bin/whisper-server exists
- If installed elsewhere, customize pr-whisper-homedir to match your installation
- Run M-x customize-group RET pr-whisper RET to verify paths
No audio recorded
- Check your microphone permissions
- Test sox manually: sox -d -r 16000 -c 1 -b 16 test.wav
Transcription not working
- The package now validates paths on startup and will show clear error messages
- Verify the model files exist in ~/whisper.cpp/models/
- Run M-x pr-whisper-transcribe-fast to see validation errors
- Test whisper-cli manually with a wav file

Testing the Setup

Test each component individually:

# Test sox recording (record 5 seconds)
sox -d -r 16000 -c 1 -b 16 test.wav trim 0 5

# Test whisper transcription (fast mode)
~/whisper.cpp/build/bin/whisper-cli -m ~/whisper.cpp/models/ggml-base.en.bin -f test.wav

# Test whisper transcription (accurate mode)
~/whisper.cpp/build/bin/whisper-cli -m ~/whisper.cpp/models/ggml-medium.en.bin -f test.wav

How It Works

Recording: Uses sox to record audio at 16kHz, mono, 16-bit
Processing: Calls whisper-cli with the recorded audio file
Integration: Captures the output and inserts it into your Emacs buffer
Cleanup: Automatically cleans up temporary files and buffers

License

This project is released under the MIT License.

Development

Running Tests

emacs -Q --batch -L . -l pr-whisper-test.el -f ert-run-tests-batch-and-exit

Byte Compilation

./build

Contributing

Feel free to submit issues and pull requests to improve this package.

Name		Name	Last commit message	Last commit date
Latest commit History 89 Commits
.history		.history
research		research
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
VOCABULARY-GUIDE.md		VOCABULARY-GUIDE.md
build		build
pr-whisper-reflow.el		pr-whisper-reflow.el
pr-whisper-server.el		pr-whisper-server.el
pr-whisper-test.el		pr-whisper-test.el
pr-whisper.el		pr-whisper.el
sample-whisper-vocabulary.txt		sample-whisper-vocabulary.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Pr-Whisper - Speech-to-Text for Emacs

Demo Video

Features

Prerequisites

1. Sox (for audio recording)

2. Whisper.cpp

Usage

Key Bindings

Basic Workflow

Example

Configuration

Using Emacs Customize Interface

Custom Configuration in init.el

Custom Key Bindings

Custom Vocabulary for Proper Nouns

Troubleshooting

Common Issues

Testing the Setup

How It Works

License

Development

Running Tests

Byte Compilation

Contributing

About

Uh oh!

Releases

Packages

Languages

License

pierre-rouleau/pr-whisper

Folders and files

Latest commit

History

Repository files navigation

Pr-Whisper - Speech-to-Text for Emacs

Demo Video

Features

Prerequisites

1. Sox (for audio recording)

2. Whisper.cpp

Usage

Key Bindings

Basic Workflow

Example

Configuration

Using Emacs Customize Interface

Custom Configuration in init.el

Custom Key Bindings

Custom Vocabulary for Proper Nouns

Troubleshooting

Common Issues

Testing the Setup

How It Works

License

Development

Running Tests

Byte Compilation

Contributing

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages