Dictype

Real-time voice-to-text input on Linux.

Features

Fcitx integration: customizable trigger keys for your profiles.
Real-time dictation: start speaking right away without waiting for the connection, and watch a live preview update as the model revises its transcription.
Model options: paraformer-realtime-v2 (Alibaba Cloud), qwen3-asr-flash-realtime (Alibaba Cloud).

Setup

Install packages for your distro.
Arch Linux

Install packages from AUR:
- dictype-fcitx
- dictype

Configure Dictype.

```
# This is the configuration file for Dictype.
# Put it at `~/.config/dictype.toml`.

[PulseAudio]
# Use the following command to get a list of available `source_name`.
# $ pactl --format json list sources \
#   | jq '[
#     .[]
#     | select((.monitor_of_sink == null) and (.name | endswith(".monitor") | not))
#     | {
#         source_name: .["name"],
#         properties: {
#            device_name: .properties["device.name"],
#            device_alias: .properties["device.alias"],
#            device_description: .properties["device.description"]
#         }
#     }
#   ]'
preferred_source_name = "..." # optional

# You can have up to 5 profiles at the same time, starting with Profile1.
# Each profile may have different formats depending on the model (Backend).
[Profiles.Profile1]
Backend = "ParaformerV2"
Config = {
    dashscope_api_key = "...",                   # required
    dashscope_websocket_url = "wss://dashscope.aliyuncs.com/api-ws/v1/inference", # optional
    disfluency_removal_enabled = true,           # optional
    language_hints = ["zh"],                     # optional
    semantic_punctuation_enabled = false,        # optional
    max_sentence_silence = 800,                  # optional
    multi_threshold_mode_enabled = false,        # optional
    punctuation_prediction_enabled = true,       # optional
    inverse_text_normalization_enabled = true,   # optional
}

[Profiles.Profile2]
Backend = "QwenV3"
Config = {
    dashscope_api_key = "...",                                       # required
    dashscope_websocket_url = "wss://dashscope.aliyuncs.com/api-ws/v1/realtime?model=qwen3-asr-flash-realtime", # optional
    language = "en",                                                 # optional
    turn_detection = { threshold = 0.2, silence_duration_ms = 900 }, # optional
}
```
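
If `jq` is not installed, the source-listing command from the `[PulseAudio]` comments can be approximated in Python. This is a minimal sketch and not part of Dictype: the function names `filter_input_sources` and `list_input_sources` are made up for illustration, and it assumes a `pactl` that supports `--format json`, as the command in the comments does.

```python
import json
import subprocess


def filter_input_sources(sources):
    """Keep non-monitor sources, mirroring the jq filter in the config comments."""
    result = []
    for src in sources:
        name = src.get("name", "")
        # Same two conditions as the jq `select(...)`: no `monitor_of_sink`
        # value, and a name that does not end with ".monitor".
        if src.get("monitor_of_sink") is not None or name.endswith(".monitor"):
            continue
        props = src.get("properties", {})
        result.append({
            "source_name": name,
            "properties": {
                "device_name": props.get("device.name"),
                "device_alias": props.get("device.alias"),
                "device_description": props.get("device.description"),
            },
        })
    return result


def list_input_sources():
    """Run pactl and return the filtered list (requires pactl on PATH)."""
    out = subprocess.run(
        ["pactl", "--format", "json", "list", "sources"],
        check=True, capture_output=True, text=True,
    ).stdout
    return filter_input_sources(json.loads(out))
```

Calling `print(json.dumps(list_input_sources(), indent=2))` prints the candidates; copy one `source_name` value into `preferred_source_name`.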

Run the daemon:
```
systemctl --user enable dictyped --now
```
Restart Fcitx.

Restarting Fcitx can be complex depending on your setup. The easiest way is to restart your computer.
Configure dictype-fcitx trigger keys using the official Fcitx configuration, under Configuration addons...
Focus on your text input, then press the profile trigger key to start. Press it again to stop. You may lose focus while transcribing.

Requirements

PulseAudio, or PipeWire with PulseAudio compatibility support.
Fcitx 5.
Cloud accounts for the respective models (two models on Alibaba Cloud are currently supported).
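
The local requirements above can be checked with a short script before starting the daemon. This is a hedged sketch, not a tool shipped with Dictype: the function name `missing_requirements` is hypothetical, the binary names `pactl` and `fcitx5` are assumed to be how your distro installs these tools, and finding `pactl` on `PATH` does not prove that a PulseAudio or PipeWire server is actually running.

```python
import shutil
from pathlib import Path

# Binary names and config path follow this README; adjust if your distro differs.
REQUIRED_COMMANDS = ("pactl", "fcitx5")
DEFAULT_CONFIG = Path.home() / ".config" / "dictype.toml"


def missing_requirements(config_path=DEFAULT_CONFIG, which=shutil.which):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    for cmd in REQUIRED_COMMANDS:
        # `which` returns None when the command is not on PATH.
        if which(cmd) is None:
            problems.append(f"command not found on PATH: {cmd}")
    if not Path(config_path).is_file():
        problems.append(f"config file not found: {config_path}")
    return problems
```

For example, `for p in missing_requirements(): print(p)` lists anything that still needs attention. The cloud account and API key cannot be checked locally this way.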

TODOs

GUI configuration tool
local inference

Disclaimer

This is a personal project and is not affiliated with any cloud providers or model providers.
Use your own discretion regarding model fees and privacy when using cloud models.

License

MIT License.
