Commit 86f1d22

Merge pull request #15 from BuildWithAIs/feat/low-volume-enhancement-rebase
feat(audio): add low-volume enhancement mode
2 parents 463272d + 2aae04d commit 86f1d22

20 files changed

Lines changed: 276 additions & 66 deletions

electron/main/README.md

Lines changed: 1 addition & 1 deletion
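@@ -4,7 +4,7 @@ Electron main process directory: window management, IPC, recording flow, ASR/LLM…

 ## File list

-- `main.ts` - Application entry point; creates the background/settings/overlay windows, tray menu, and IPC handlers; coordinates the PTT recording → transcription → LLM refinement → injection flow, session cancellation, and FFmpeg initialization (system language syncs on window focus).
+- `main.ts` - Application entry point; creates the background/settings/overlay windows, tray menu, and IPC handlers; coordinates the PTT recording → transcription → LLM refinement → injection flow, session cancellation, and FFmpeg initialization, and injects the ASR/low-volume mode configuration into the audio processor (system language syncs on window focus).
 - `i18n.ts` - Main-process i18next initialization and language switching; broadcasts language snapshots to every window.
 - `env.ts` - Runtime environment and resource path resolution (development/production).
 - `config-manager.ts` - Persists application preferences, ASR configuration, the LLM refinement toggle, and hotkey configuration using `electron-store`.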

electron/main/__tests__/config-manager.test.ts

Lines changed: 19 additions & 0 deletions
@@ -111,4 +111,23 @@ describe('ConfigManager', () => {
       enabled: false,
     })
   })
+
+  it('enables low volume mode by default for new installs', async () => {
+    const configManager = await createManager()
+    expect(configManager.getASRConfig().lowVolumeMode).toBe(true)
+  })
+
+  it('migrates existing users to low volume mode disabled', async () => {
+    seedData = {
+      asr: {
+        provider: 'glm',
+        region: 'cn',
+        apiKeys: { cn: 'legacy', intl: '' },
+        endpoint: '',
+        language: 'auto',
+      },
+    }
+    const configManager = await createManager()
+    expect(configManager.getASRConfig().lowVolumeMode).toBe(false)
+  })
 })

electron/main/__tests__/main.test.ts

Lines changed: 12 additions & 2 deletions
@@ -42,7 +42,9 @@ const mockSendLanguageSnapshotToWindow = vi.hoisted(() => vi.fn())
 const mockT = vi.hoisted(() => vi.fn((key: string) => `t:${key}`))

 const mockGetAppConfig = vi.hoisted(() => vi.fn(() => ({ language: 'en', autoLaunch: true })))
-const mockGetASRConfig = vi.hoisted(() => vi.fn(() => ({ region: 'cn', apiKeys: {} })))
+const mockGetASRConfig = vi.hoisted(() =>
+  vi.fn(() => ({ provider: 'glm', region: 'cn', apiKeys: {}, lowVolumeMode: true })),
+)
 const mockGetLLMRefineConfig = vi.hoisted(() =>
   vi.fn(() => ({
     enabled: true,
@@ -231,7 +233,14 @@ describe('main startup', () => {
       openAtLogin: true,
       openAsHidden: true,
     })
-    expect(mockASRProviderCtor).toHaveBeenCalledWith({ region: 'cn', apiKeys: {} })
+    expect(mockGetASRConfig).toHaveBeenCalled()
+    expect(mockASRProviderCtor).toHaveBeenCalledWith(
+      expect.objectContaining({
+        region: 'cn',
+        apiKeys: {},
+        lowVolumeMode: true,
+      }),
+    )
     expect(mockLLMProviderCtor).toHaveBeenCalledWith(
       {
         enabled: true,
@@ -245,6 +254,7 @@ describe('main startup', () => {
     expect(mockInitProcessor).toHaveBeenCalledWith(
       expect.objectContaining({
         getAsrProvider: expect.any(Function),
+        getASRConfig: expect.any(Function),
         initializeASRProvider: expect.any(Function),
         getLlmProvider: expect.any(Function),
         initializeLLMProvider: expect.any(Function),

electron/main/audio/README.md

Lines changed: 2 additions & 2 deletions
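@@ -6,6 +6,6 @@

 - `index.ts` - Unified exports for the audio module.
 - `session-manager.ts` - Session lifecycle management (start/stop/cancel) and HUD status updates.
-- `processor.ts` - Audio processing pipeline (save, transcode, ASR, LLM refinement, write history, inject, clean up; falls back to the raw text when refinement fails).
-- `converter.ts` - FFmpeg initialization and WebM → MP3 conversion.
+- `processor.ts` - Audio processing pipeline (save, transcode with gain selected by low-volume mode, ASR, LLM refinement, write history, inject, clean up; falls back to the raw text when refinement fails).
+- `converter.ts` - FFmpeg initialization and WebM → MP3 conversion (supports an optional `gainDb` volume boost)
 - `__tests__/` - Tests for audio sessions and the processing pipeline.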

electron/main/audio/__tests__/README.md

Lines changed: 2 additions & 2 deletions
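@@ -3,5 +3,5 @@
 Tests for the main-process audio module.

 - `session-manager.test.ts` - Recording session start/stop/cancel and HUD interaction.
-- `processor.test.ts` - Audio pipeline (save/transcode/ASR/LLM refinement/inject/cleanup) and error branches.
-- `converter.test.ts` - FFmpeg initialization and audio format conversion success/failure
+- `processor.test.ts` - Audio pipeline (save/transcode/low-volume mode gain/ASR/LLM refinement/inject/cleanup) and error branches.
+- `converter.test.ts` - FFmpeg initialization, audio format conversion, and the optional gain filter branch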

electron/main/audio/__tests__/converter.test.ts

Lines changed: 12 additions & 0 deletions
@@ -9,6 +9,7 @@ let lastCommand: {
     toFormat: ReturnType<typeof vi.fn>
     audioCodec: ReturnType<typeof vi.fn>
     audioBitrate: ReturnType<typeof vi.fn>
+    audioFilters: ReturnType<typeof vi.fn>
     on: ReturnType<typeof vi.fn>
     save: ReturnType<typeof vi.fn>
   }
@@ -21,6 +22,7 @@ const createCommand = () => {
     toFormat: vi.fn(() => chain),
     audioCodec: vi.fn(() => chain),
     audioBitrate: vi.fn(() => chain),
+    audioFilters: vi.fn(() => chain),
     on: vi.fn((event: string, handler: (err?: Error) => void) => {
       handlers[event] = handler
       return chain
@@ -97,6 +99,16 @@ describe('audio converter', () => {

     await expect(convertToMP3('/input.webm', '/output.mp3')).resolves.toBeUndefined()
     expect(lastCommand?.chain.save).toHaveBeenCalledWith('/output.mp3')
+    expect(lastCommand?.chain.audioFilters).not.toHaveBeenCalled()
+  })
+
+  it('applies gain filter when gainDb is provided', async () => {
+    const { convertToMP3 } = await loadConverter()
+
+    await expect(
+      convertToMP3('/input.webm', '/output.mp3', { gainDb: 10 }),
+    ).resolves.toBeUndefined()
+    expect(lastCommand?.chain.audioFilters).toHaveBeenCalledWith('volume=10dB')
   })

   it('rejects when ffmpeg conversion fails', async () => {

electron/main/audio/__tests__/processor.test.ts

Lines changed: 81 additions & 1 deletion
@@ -77,6 +77,7 @@ describe('audio processor', () => {
     const { initProcessor, handleAudioData } = await loadProcessor()
     initProcessor({
       getAsrProvider: () => null,
+      getASRConfig: () => ({ provider: 'glm', region: 'cn', apiKeys: { cn: '', intl: '' } }),
       initializeASRProvider: vi.fn(),
     })

@@ -105,6 +106,12 @@ describe('audio processor', () => {
     const getAsrProvider = vi.fn(() => ({ transcribe }) as any)
     initProcessor({
       getAsrProvider,
+      getASRConfig: () => ({
+        provider: 'glm',
+        region: 'cn',
+        apiKeys: { cn: '', intl: '' },
+        lowVolumeMode: true,
+      }),
       initializeASRProvider: vi.fn(),
     })

@@ -115,7 +122,9 @@ describe('audio processor', () => {
     await vi.runAllTimersAsync()
     vi.useRealTimers()

-    expect(mockConvertToMP3).toHaveBeenCalled()
+    expect(mockConvertToMP3).toHaveBeenCalledWith(expect.any(String), expect.any(String), {
+      gainDb: 10,
+    })
     expect(transcribe).toHaveBeenCalled()
     expect(mockUpdateSession).toHaveBeenCalledWith({
       transcription: 'hello world',
@@ -149,6 +158,7 @@ describe('audio processor', () => {

     initProcessor({
       getAsrProvider: () => ({ transcribe }) as any,
+      getASRConfig: () => ({ provider: 'glm', region: 'cn', apiKeys: { cn: '', intl: '' } }),
       initializeASRProvider: vi.fn(),
     })

@@ -171,6 +181,7 @@ describe('audio processor', () => {

     initProcessor({
       getAsrProvider: () => null,
+      getASRConfig: () => ({ provider: 'glm', region: 'cn', apiKeys: { cn: '', intl: '' } }),
       initializeASRProvider: vi.fn(),
     })

@@ -208,6 +219,7 @@ describe('audio processor', () => {

     initProcessor({
       getAsrProvider: () => ({ transcribe }) as any,
+      getASRConfig: () => ({ provider: 'glm', region: 'cn', apiKeys: { cn: '', intl: '' } }),
       initializeASRProvider: vi.fn(),
       getLlmProvider: () =>
         ({
@@ -248,6 +260,7 @@ describe('audio processor', () => {

     initProcessor({
       getAsrProvider: () => ({ transcribe }) as any,
+      getASRConfig: () => ({ provider: 'glm', region: 'cn', apiKeys: { cn: '', intl: '' } }),
       initializeASRProvider: vi.fn(),
       getLlmProvider: () =>
         ({
@@ -267,4 +280,71 @@ describe('audio processor', () => {
     })
     expect(mockInjectText).toHaveBeenCalledWith('raw text')
   })
+
+  it('does not apply gain when low volume mode is disabled', async () => {
+    const { initProcessor, handleAudioData } = await loadProcessor()
+    const session = {
+      id: 'session-1',
+      startTime: new Date(),
+      status: 'recording',
+      duration: 1200,
+    }
+    mockGetCurrentSession.mockReturnValue(session)
+
+    const transcribe = vi.fn().mockResolvedValue({
+      text: 'hello world',
+      id: 't-5',
+      created: Date.now(),
+      model: 'glm',
+    })
+    initProcessor({
+      getAsrProvider: () => ({ transcribe }) as any,
+      getASRConfig: () => ({
+        provider: 'glm',
+        region: 'cn',
+        apiKeys: { cn: '', intl: '' },
+        lowVolumeMode: false,
+      }),
+      initializeASRProvider: vi.fn(),
+    })
+
+    await handleAudioData(Buffer.from('audio'))
+
+    expect(mockConvertToMP3).toHaveBeenCalledWith(expect.any(String), expect.any(String), {
+      gainDb: undefined,
+    })
+  })
+
+  it('applies default gain when low volume mode is undefined', async () => {
+    const { initProcessor, handleAudioData } = await loadProcessor()
+    const session = {
+      id: 'session-1',
+      startTime: new Date(),
+      status: 'recording',
+      duration: 1200,
+    }
+    mockGetCurrentSession.mockReturnValue(session)
+
+    const transcribe = vi.fn().mockResolvedValue({
+      text: 'hello world',
+      id: 't-6',
+      created: Date.now(),
+      model: 'glm',
+    })
+    initProcessor({
+      getAsrProvider: () => ({ transcribe }) as any,
+      getASRConfig: () => ({
+        provider: 'glm',
+        region: 'cn',
+        apiKeys: { cn: '', intl: '' },
+      }),
+      initializeASRProvider: vi.fn(),
+    })
+
+    await handleAudioData(Buffer.from('audio'))
+
+    expect(mockConvertToMP3).toHaveBeenCalledWith(expect.any(String), expect.any(String), {
+      gainDb: 10,
+    })
+  })
 })

electron/main/audio/converter.ts

Lines changed: 19 additions & 5 deletions
@@ -17,6 +17,10 @@ import { t } from '../i18n'
 let ffmpeg: any
 let ffmpegInitialized = false

+export interface ConvertToMP3Options {
+  gainDb?: number
+}
+
 /**
  * Initialize FFmpeg
  *
@@ -61,7 +65,11 @@ export function initializeFfmpeg(): void {
  * @returns Promise<void> - resolves when the conversion completes
  * @throws {Error} rejects when the conversion fails
  */
-export function convertToMP3(inputPath: string, outputPath: string): Promise<void> {
+export function convertToMP3(
+  inputPath: string,
+  outputPath: string,
+  options?: ConvertToMP3Options,
+): Promise<void> {
   const conversionStartTime = Date.now()
   return new Promise((resolve, reject) => {
     // Ensure ffmpeg has been initialized
@@ -70,11 +78,17 @@ export function convertToMP3(inputPath: string, outputPath: string): Promise<voi
     console.log(`[Audio:Converter] Converting audio to MP3...`)
     console.log(`[Audio:Converter] Input: ${inputPath}`)
     console.log(`[Audio:Converter] Output: ${outputPath}`)
+    if (typeof options?.gainDb === 'number') {
+      console.log(`[Audio:Converter] Gain: +${options.gainDb}dB`)
+    }
+
+    const command = ffmpeg(inputPath).toFormat('mp3').audioCodec('libmp3lame').audioBitrate('128k')
+
+    if (typeof options?.gainDb === 'number') {
+      command.audioFilters(`volume=${options.gainDb}dB`)
+    }

-    ffmpeg(inputPath)
-      .toFormat('mp3')
-      .audioCodec('libmp3lame')
-      .audioBitrate('128k')
+    command
       .on('end', () => {
         const duration = Date.now() - conversionStartTime
         console.log(`[Audio:Converter] ⏱️ Conversion completed in ${duration}ms`)
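
The converter change above adds the `volume=<gainDb>dB` filter only when `gainDb` is a number. A minimal sketch of that decision, isolated from FFmpeg, might look like the following. `buildVolumeFilter` and `dbToLinear` are hypothetical helper names that do not appear in the commit, and the amplitude conversion assumes the usual 20·log10 definition of decibels for amplitude.

```typescript
// Hypothetical helpers for illustration only; they are not part of the commit.

/** Returns the FFmpeg `volume` filter string, or undefined when no gain is requested. */
function buildVolumeFilter(gainDb?: number): string | undefined {
  // Mirror the guard in the diff: only a numeric gainDb enables the filter.
  if (typeof gainDb !== 'number') return undefined
  return `volume=${gainDb}dB`
}

/** Converts a gain in decibels to a linear amplitude factor (10^(dB/20)). */
function dbToLinear(gainDb: number): number {
  return Math.pow(10, gainDb / 20)
}

console.log(buildVolumeFilter(10)) // volume=10dB
console.log(buildVolumeFilter()) // undefined
console.log(dbToLinear(10).toFixed(3)) // 3.162
```

Under these assumptions, a 10 dB gain multiplies sample amplitude by roughly 3.16. Since `typeof NaN === 'number'`, a `NaN` gain would still pass this guard, as it would in the diff.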

electron/main/audio/processor.ts

Lines changed: 10 additions & 1 deletion
@@ -15,6 +15,7 @@
 import { app } from 'electron'
 import fs from 'fs'
 import path from 'node:path'
+import { LOW_VOLUME_GAIN_DB } from '../../shared/constants'
 import { updateOverlay, hideOverlay } from '../window/overlay'
 import { t } from '../i18n'
 import { historyManager } from '../history-manager'
@@ -23,6 +24,7 @@ import { convertToMP3 } from './converter'
 import { getCurrentSession, updateSession, clearSession } from './session-manager'
 import type { ASRProvider } from '../asr-provider'
 import type { LLMProvider } from '../llm-provider'
+import type { ASRConfig } from '../../shared/types'

 /**
  * External dependencies of the processor
@@ -31,6 +33,8 @@ import type { LLMProvider } from '../llm-provider'
 type ProcessorDeps = {
   /** Get the ASR provider instance */
   getAsrProvider: () => ASRProvider | null
+  /** Get the current ASR configuration */
+  getASRConfig: () => ASRConfig
   /** Initialize the ASR provider */
   initializeASRProvider: () => void
   /** Get the LLM provider instance */
@@ -87,8 +91,13 @@ export async function handleAudioData(buffer: Buffer): Promise<void> {
     console.log(`[Audio:Processor] ⏱️ File save: ${saveDuration}ms`)

     // Step 2: Convert to MP3
+    const asrConfig = deps.getASRConfig()
+    const lowVolumeModeEnabled = asrConfig.lowVolumeMode ?? true
+    console.log(`[Audio:Processor] Low volume mode enabled: ${lowVolumeModeEnabled}`)
     const conversionStartTime = Date.now()
-    await convertToMP3(tempWebmPath, tempMp3Path)
+    await convertToMP3(tempWebmPath, tempMp3Path, {
+      gainDb: lowVolumeModeEnabled ? LOW_VOLUME_GAIN_DB : undefined,
+    })
     const conversionDuration = Date.now() - conversionStartTime

     // Check for cancellation
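
The gain selection in `handleAudioData` comes down to a nullish-coalescing default followed by a ternary. The sketch below isolates that logic under stated assumptions: `resolveGainDb` and the trimmed `AsrConfigLike` type are hypothetical, and the value 10 for `LOW_VOLUME_GAIN_DB` is inferred from the test expectations in this commit, because `shared/constants` is not shown here.

```typescript
// Assumed value: the tests in this commit expect gainDb: 10; shared/constants is not shown.
const LOW_VOLUME_GAIN_DB = 10

// Hypothetical, trimmed-down stand-in for the real ASRConfig type.
interface AsrConfigLike {
  lowVolumeMode?: boolean
}

function resolveGainDb(config: AsrConfigLike): number | undefined {
  // `??` falls back to true only for undefined or null, so an explicit false is respected.
  const enabled = config.lowVolumeMode ?? true
  return enabled ? LOW_VOLUME_GAIN_DB : undefined
}

console.log(resolveGainDb({ lowVolumeMode: true })) // 10
console.log(resolveGainDb({ lowVolumeMode: false })) // undefined
console.log(resolveGainDb({})) // 10
```

The choice of `??` over `||` matters here: with `||`, an explicit `false` would be treated as missing and the gain would always be applied.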

electron/main/config-manager.ts

Lines changed: 16 additions & 0 deletions
@@ -29,6 +29,7 @@ const defaultConfig: AppConfig = {
       cn: '',
       intl: '',
     },
+    lowVolumeMode: true,
     // apiKey: '', // Deprecated, removed from default
     endpoint: '',
     language: 'auto',
@@ -67,6 +68,17 @@ export class ConfigManager {
         this.store.delete('asr.apiKey' as any) // Delete the legacy field after migration
       }
     }
+
+    // Low-volume mode migration strategy:
+    // - New installs: defaultConfig already includes lowVolumeMode=true, nothing to do
+    // - Existing users upgrading: if asr exists but lacks this field, explicitly write false
+    if (
+      asrConfig &&
+      typeof asrConfig === 'object' &&
+      !Object.prototype.hasOwnProperty.call(asrConfig, 'lowVolumeMode')
+    ) {
+      this.store.set('asr.lowVolumeMode', false)
+    }
   }

   // Get the full configuration
@@ -101,6 +113,10 @@ export class ConfigManager {
     if (!config.region) {
       config.region = 'cn'
     }
+    // Ensure lowVolumeMode exists
+    if (typeof config.lowVolumeMode !== 'boolean') {
+      config.lowVolumeMode = defaultConfig.asr.lowVolumeMode
+    }
     return config
   }
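
The migration rule above (new installs default to enabled, upgraded installs are pinned to disabled) can be modelled without `electron-store`. This is a sketch under assumptions: `migrateLowVolumeMode`, `StoredConfig`, and `DEFAULT_LOW_VOLUME_MODE` are hypothetical names, and the persisted store is represented as a plain object with an empty object standing in for a fresh install.

```typescript
// Hypothetical model of the migration; the real code reads and writes through electron-store.
interface StoredAsr {
  provider?: string
  region?: string
  lowVolumeMode?: boolean
}

interface StoredConfig {
  asr?: StoredAsr
}

// Mirrors defaultConfig.asr.lowVolumeMode from the diff.
const DEFAULT_LOW_VOLUME_MODE = true

/** Applies the migration rule in place and returns the effective lowVolumeMode. */
function migrateLowVolumeMode(stored: StoredConfig): boolean {
  const asr = stored.asr
  // Existing users: an asr object persisted before this field existed gets an explicit false.
  if (asr && typeof asr === 'object' && !Object.prototype.hasOwnProperty.call(asr, 'lowVolumeMode')) {
    asr.lowVolumeMode = false
  }
  // New installs: nothing persisted yet, so the default applies.
  return typeof asr?.lowVolumeMode === 'boolean' ? asr.lowVolumeMode : DEFAULT_LOW_VOLUME_MODE
}

console.log(migrateLowVolumeMode({})) // true
console.log(migrateLowVolumeMode({ asr: { provider: 'glm', region: 'cn' } })) // false
console.log(migrateLowVolumeMode({ asr: { lowVolumeMode: true } })) // true
```

Writing an explicit `false` for upgraded users keeps their existing audio behavior unchanged, while the default of `true` only reaches installs that have never persisted an `asr` section.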
