Commit 8b77a1b

feat(refine): add remote glossary refresh
1 parent 3520e01 commit 8b77a1b

14 files changed

Lines changed: 10243 additions & 8645 deletions

electron/main/README.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
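@@ -13,7 +13,7 @@ Electron main-process directory, responsible for window management, IPC, recording orchestration, ASR/refinement
1313
- `hotkey-manager.ts` - Electron `globalShortcut` management.
1414
- `iohook-manager.ts` - `uiohook-napi` keyboard listening.
1515
- `asr-provider.ts` - GLM ASR API wrapper; supports `prompt` and `request_id`
16-
- `refine/` - Text refinement module; uses OpenAI-compatible Chat Completions for post-processing, dynamic prompt assembly, and connection validation
16+
- `refine/` - Text refinement module; uses OpenAI-compatible Chat Completions for post-processing, dynamic prompt assembly, remote glossary cache refresh, and connection validation
1717
- `text-injector.ts` - Text injection based on `@nut-tree-fork/nut-js`; prioritizes preserving line breaks in multi-line text.
1818
- `updater-manager.ts` - GitHub Releases update checks.
1919
- `audio/` - Recording sessions and the segmented transcription pipeline.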

electron/main/ipc/README.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
55
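## Files
66

77
- `index.ts` - IPC handler registration entry point and dependency initialization.
8-
- `config-handlers.ts` - Config reads and writes, ASR/refinement connection validation, language snapshot queries and broadcasts.
8+
- `config-handlers.ts` - Config reads and writes, ASR/refinement connection validation, language snapshot queries and broadcasts; triggers a single remote glossary refresh when text refinement switches from off to on
99
- `session-handlers.ts` - Recording session handlers, including start, stop, status, audio chunk reception, and cancellation.
1010
- `history-handlers.ts` - History retrieval, deletion, and clearing.
1111
- `log-handlers.ts` - Log reading, writing, and opening the log directory.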

electron/main/ipc/config-handlers.ts

Lines changed: 6 additions & 0 deletions
@@ -96,7 +96,13 @@ export function registerConfigHandlers(): void {
9696
deps.initializeASRProvider()
9797
}
9898
if (config.llmRefine) {
99+
const previousRefineConfig = configManager.getLLMRefineConfig()
99100
configManager.setLLMRefineConfig(config.llmRefine)
101+
const nextRefineConfig = configManager.getLLMRefineConfig()
102+
if (!previousRefineConfig.enabled && nextRefineConfig.enabled) {
103+
const refineService = deps.getRefineService()
104+
void refineService?.refreshRemoteGlossary()
105+
}
100106
}
101107
if (config.hotkey) {
102108
configManager.setHotkeyConfig(config.hotkey)
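
The hunk above refreshes the glossary only on an off-to-on edge, not on every config save. A minimal sketch of that edge check in isolation follows; `RefineToggle` and `isOffToOnTransition` are hypothetical names introduced here for illustration, and the real `LLMRefineConfig` is assumed to carry more fields than `enabled`.

```typescript
// Minimal shape assumed for illustration; the real LLMRefineConfig has more fields.
interface RefineToggle {
  enabled: boolean
}

// True only when refinement was disabled before the save and is enabled after it.
function isOffToOnTransition(previous: RefineToggle, next: RefineToggle): boolean {
  return !previous.enabled && next.enabled
}

// Walk all four combinations to show that only one of them triggers a refresh.
const transitions: Array<[boolean, boolean]> = [
  [false, true],
  [true, true],
  [true, false],
  [false, false],
]

for (const [before, after] of transitions) {
  const refresh = isOffToOnTransition({ enabled: before }, { enabled: after })
  console.log(`${before} -> ${after}: refresh=${refresh}`)
}
```

Comparing the snapshot taken before `setLLMRefineConfig` with the one read back afterwards means a save that leaves refinement enabled does not issue another network request.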

electron/main/main.ts

Lines changed: 9 additions & 0 deletions
@@ -111,6 +111,14 @@ function initializeRefineService() {
111111
})
112112
}
113113

114+
function refreshRemoteGlossaryIfEnabled(): void {
115+
if (!refineService?.isEnabled()) {
116+
return
117+
}
118+
119+
void refineService.refreshRemoteGlossary()
120+
}
121+
114122
function willRunRefine(): boolean {
115123
return Boolean(refineService?.isEnabled() && refineService.hasValidConfig())
116124
}
@@ -158,6 +166,7 @@ app.whenReady().then(async () => {
158166
// Initialize the ASR provider
159167
initializeASRProvider()
160168
initializeRefineService()
169+
refreshRemoteGlossaryIfEnabled()
161170
// Create the background window
162171
createBackgroundWindow()
163172
// Create the tray

electron/main/refine/README.md

Lines changed: 3 additions & 2 deletions
@@ -5,6 +5,7 @@
55
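## Files
66

77
- `index.ts` - Single export point for the refinement service, the OpenAI-compatible client, and config resolution utilities.
8-
- `service.ts` - Stateless `RefineService`; reads the latest `llmRefine` config on every call and runs refinement and connection validation with a fixed transcript wrapper, static glossary awareness, and a structured prompt.
9-
- `config-resolver.ts` - Normalizes the manually entered refinement Base URL and completes it into `/chat/completions` request parameters, and appends the system prompt override section for all-English output mode according to the refinement config.
8+
- `service.ts` - `RefineService` maintains an in-memory glossary cache; it reads the latest `llmRefine` config on every call, runs refinement and connection validation with a fixed transcript wrapper, awareness of the current glossary, and a structured prompt, and supports explicitly refreshing the remote glossary.
9+
- `glossary-cache.ts` - Initializes the in-memory cache from the built-in glossary, fetches the remote plain-text glossary on demand, and handles UTF-8 decoding, blank-line/comment filtering, deduplication, and fallback on failure.
10+
- `config-resolver.ts` - Normalizes the manually entered refinement Base URL and completes it into `/chat/completions` request parameters, and generates the final system prompt from the refinement config and the supplied glossary.
1011
- `openai-client.ts` - OpenAI Chat Completions HTTP client; handles sending requests and parsing errors and message content.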

electron/main/refine/config-resolver.ts

Lines changed: 6 additions & 0 deletions
@@ -12,8 +12,13 @@ export interface ResolvedRefineRequestConfig {
1212
systemPrompt: string
1313
}
1414

15+
export interface ResolveRefineRequestConfigOptions {
16+
glossaryTerms?: readonly string[]
17+
}
18+
1519
export function resolveRefineRequestConfig(
1620
refineConfig: LLMRefineConfig,
21+
options: ResolveRefineRequestConfigOptions = {},
1722
): ResolvedRefineRequestConfig | null {
1823
const baseUrl = normalizeRefineBaseUrl(refineConfig.endpoint)
1924
const endpoint = buildRefineChatEndpoint(baseUrl)
@@ -32,6 +37,7 @@ export function resolveRefineRequestConfig(
3237
maxTokens: OPENAI_CHAT.MAX_TOKENS,
3338
temperature: OPENAI_CHAT.TEMPERATURE,
3439
systemPrompt: buildRefineSystemPrompt({
40+
glossaryTerms: options.glossaryTerms,
3541
translateToEnglish: refineConfig.translateToEnglish,
3642
}),
3743
}

electron/main/refine/glossary-cache.ts

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
1+
import axios from 'axios'
2+
import { REFINE_GLOSSARY_REMOTE, REFINE_GLOSSARY_TERMS } from '../../shared/constants'
3+
4+
const COMMENT_PREFIX = '#'
5+
const FALLBACK_GLOSSARY_TERMS = [...REFINE_GLOSSARY_TERMS]
6+
const SUPPORTED_CONTENT_TYPES = ['text/plain', 'application/octet-stream'] as const
7+
8+
function normalizeGlossaryTerms(terms: readonly string[]): string[] {
9+
return Array.from(new Set(terms.map((term) => term.trim()).filter((term) => term.length > 0)))
10+
}
11+
12+
function getContentType(headers: unknown): string | null {
13+
if (!headers || typeof headers !== 'object') {
14+
return null
15+
}
16+
17+
const contentType = (headers as Record<string, unknown>)['content-type']
18+
if (typeof contentType === 'string') {
19+
return contentType
20+
}
21+
22+
if (Array.isArray(contentType)) {
23+
const [firstContentType] = contentType
24+
return typeof firstContentType === 'string' ? firstContentType : null
25+
}
26+
27+
return null
28+
}
29+
30+
function isSupportedContentType(contentType: string | null): boolean {
31+
if (!contentType) {
32+
return true
33+
}
34+
35+
const normalizedContentType = contentType.toLowerCase()
36+
return SUPPORTED_CONTENT_TYPES.some((supportedType) =>
37+
normalizedContentType.includes(supportedType),
38+
)
39+
}
40+
41+
function parseRemoteGlossaryText(rawText: string): string[] {
42+
const normalizedText = rawText.replace(/^\uFEFF/, '')
43+
const glossaryTerms = normalizeGlossaryTerms(
44+
normalizedText
45+
.split(/\r?\n/u)
46+
.map((line) => line.trim())
47+
.filter((line) => line.length > 0 && !line.startsWith(COMMENT_PREFIX)),
48+
)
49+
50+
if (glossaryTerms.length === 0) {
51+
throw new Error('Remote glossary is empty')
52+
}
53+
54+
return glossaryTerms
55+
}
56+
57+
export class RefineGlossaryCache {
58+
private glossaryTerms: string[]
59+
60+
constructor() {
61+
this.glossaryTerms = normalizeGlossaryTerms(FALLBACK_GLOSSARY_TERMS)
62+
}
63+
64+
getTerms(): readonly string[] {
65+
return this.glossaryTerms
66+
}
67+
68+
resetToFallback(): void {
69+
this.glossaryTerms = normalizeGlossaryTerms(FALLBACK_GLOSSARY_TERMS)
70+
}
71+
72+
async refreshFromRemote(): Promise<readonly string[]> {
73+
const response = await axios.get<string>(REFINE_GLOSSARY_REMOTE.URL, {
74+
headers: {
75+
Accept: 'text/plain',
76+
},
77+
responseEncoding: 'utf8',
78+
responseType: 'text',
79+
timeout: REFINE_GLOSSARY_REMOTE.TIMEOUT_MS,
80+
})
81+
82+
const contentType = getContentType(response.headers)
83+
if (!isSupportedContentType(contentType)) {
84+
throw new Error(`Unexpected glossary content type: ${contentType}`)
85+
}
86+
87+
if (typeof response.data !== 'string') {
88+
throw new Error('Remote glossary response is not plain text')
89+
}
90+
91+
const glossaryTerms = parseRemoteGlossaryText(response.data)
92+
this.glossaryTerms = glossaryTerms
93+
94+
return this.glossaryTerms
95+
}
96+
}
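
To make the parsing rules concrete, here is a small standalone sketch that applies the same steps as `parseRemoteGlossaryText` to a made-up sample. `parseGlossaryText` and the sample string are illustrative only; unlike the function in the diff, this sketch returns an empty array instead of throwing when nothing survives filtering.

```typescript
const COMMENT_PREFIX = '#'

// Same steps as in the diff: strip a leading BOM, split on LF or CRLF, trim,
// drop blank lines and comment lines, then dedupe while keeping first-seen order.
function parseGlossaryText(rawText: string): string[] {
  const lines = rawText
    .replace(/^\uFEFF/, '')
    .split(/\r?\n/u)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith(COMMENT_PREFIX))

  return Array.from(new Set(lines))
}

// Made-up sample content, not the real glossary file: a BOM, a comment,
// mixed line endings, padded whitespace, and a duplicate term.
const sample = '\uFEFF# team glossary\r\nElectron\r\n\r\n  uiohook-napi  \nElectron\n'

console.log(parseGlossaryText(sample)) // [ 'Electron', 'uiohook-napi' ]
```

One consequence of the `startsWith('#')` filter is that a term beginning with `#` cannot be expressed in the file, while a term that merely contains `#` (for example `C#`) passes through unchanged.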

electron/main/refine/index.ts

Lines changed: 5 additions & 1 deletion
@@ -5,4 +5,8 @@ export {
55
requestChatCompletion,
66
type OpenAIResponse,
77
} from './openai-client'
8-
export { resolveRefineRequestConfig, type ResolvedRefineRequestConfig } from './config-resolver'
8+
export {
9+
resolveRefineRequestConfig,
10+
type ResolvedRefineRequestConfig,
11+
type ResolveRefineRequestConfigOptions,
12+
} from './config-resolver'

electron/main/refine/service.ts

Lines changed: 20 additions & 1 deletion
@@ -6,12 +6,14 @@ import {
66
requestChatCompletion,
77
} from './openai-client'
88
import { resolveRefineRequestConfig } from './config-resolver'
9+
import { RefineGlossaryCache } from './glossary-cache'
910

1011
export interface TextRefiner {
1112
isEnabled: () => boolean
1213
hasValidConfig: (configOverride?: LLMRefineConfig) => boolean
1314
refineText: (input: string) => Promise<string>
1415
testConnection: (configOverride: LLMRefineConfig) => Promise<RefineConnectionResult>
16+
refreshRemoteGlossary: () => Promise<void>
1517
}
1618

1719
export interface RefineServiceDeps {
@@ -33,9 +35,11 @@ function buildTranscriptUserMessage(input: string): string {
3335

3436
export class RefineService implements TextRefiner {
3537
private deps: RefineServiceDeps
38+
private glossaryCache: RefineGlossaryCache
3639

3740
constructor(deps: RefineServiceDeps) {
3841
this.deps = deps
42+
this.glossaryCache = new RefineGlossaryCache()
3943
}
4044

4145
isEnabled(): boolean {
@@ -142,7 +146,22 @@ export class RefineService implements TextRefiner {
142146
}
143147
}
144148

149+
async refreshRemoteGlossary(): Promise<void> {
150+
try {
151+
const glossaryTerms = await this.glossaryCache.refreshFromRemote()
152+
console.info(
153+
`[RefineService] Remote glossary refreshed successfully with ${glossaryTerms.length} terms`,
154+
)
155+
} catch (error: unknown) {
156+
this.glossaryCache.resetToFallback()
157+
const message = error instanceof Error ? error.message : 'Unknown error'
158+
console.warn(`[RefineService] Failed to refresh remote glossary, using fallback: ${message}`)
159+
}
160+
}
161+
145162
private resolveConfig(configOverride?: LLMRefineConfig) {
146-
return resolveRefineRequestConfig(configOverride ?? this.deps.getRefineConfig())
163+
return resolveRefineRequestConfig(configOverride ?? this.deps.getRefineConfig(), {
164+
glossaryTerms: this.glossaryCache.getTerms(),
165+
})
147166
}
148167
}
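
The refresh-then-fallback behavior in `refreshRemoteGlossary` can be sketched without any network access by injecting the fetch step. Everything named below (`GlossaryFetcher`, `GlossaryCacheSketch`, the sample fallback terms) is hypothetical and only stands in for the axios call and the constants from `shared/constants`, which this diff does not show.

```typescript
// Hypothetical stand-in for the remote fetch; the real code calls axios.get.
type GlossaryFetcher = () => Promise<string[]>

// Assumed sample values; the real fallback comes from REFINE_GLOSSARY_TERMS.
const FALLBACK_TERMS: readonly string[] = ['Electron', 'TypeScript']

class GlossaryCacheSketch {
  private terms: string[] = [...FALLBACK_TERMS]
  private readonly fetchTerms: GlossaryFetcher

  constructor(fetchTerms: GlossaryFetcher) {
    this.fetchTerms = fetchTerms
  }

  getTerms(): readonly string[] {
    return this.terms
  }

  // Resolves to true when the remote list was applied,
  // and to false when the built-in fallback was restored instead.
  async refresh(): Promise<boolean> {
    try {
      const remoteTerms = await this.fetchTerms()
      if (remoteTerms.length === 0) {
        throw new Error('Remote glossary is empty')
      }
      this.terms = remoteTerms
      return true
    } catch {
      this.terms = [...FALLBACK_TERMS]
      return false
    }
  }
}

// Usage: a fetcher that always fails leaves the cache on the fallback list.
const offlineCache = new GlossaryCacheSketch(async () => {
  throw new Error('network down')
})

offlineCache.refresh().then((applied) => {
  console.log(applied, offlineCache.getTerms())
})
```

As in the diff, a failed refresh resets the cache to the built-in list rather than keeping the last successful remote result, so a transient network error after a good refresh still drops the remote terms until the next successful refresh.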

electron/shared/README.md

Lines changed: 2 additions & 1 deletion
@@ -5,7 +5,8 @@
55
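## File list
66

77
- `types.ts` - Cross-process type definitions and IPC channel constants; includes language snapshots, config, the Overlay that supports the two-step (recognition/refinement) HUD processing stages, history, logs, the English-output toggle for refinement, `RecordingStartPayload`, and `AudioChunkPayload`
8-
- `constants.ts` - GLM ASR / text refinement defaults, the static-glossary-driven refine system prompt (keeps the original structured refinement rules and can append an all-English output mode per config; when translating, prioritizes conveying the meaning accurately over word-for-word translation), spacing rules for mixed Chinese/English text, the 29-second per-request limit, the 3-minute session limit, default hotkeys, recording parameters, and log limits.
8+
- `constants.ts` - GLM ASR / text refinement defaults, built-in glossary fallback values and remote glossary source config, refine system prompt construction (can append an all-English output mode per config; when translating, prioritizes conveying the meaning accurately over word-for-word translation), spacing rules for mixed Chinese/English text, the 29-second per-request limit, the 3-minute session limit, default hotkeys, recording parameters, and log limits.
9+
- `refine-glossary.txt` - Locally maintained source file for the remote glossary, organized as one term per line; supports `#` comment lines and is uploaded to R2 as UTF-8 text.
910
- `refine-url.ts` - Utilities for normalizing the text refinement Base URL and assembling the `/chat/completions` request URL.
1011
- `i18n.ts` - Shared i18n resources and language resolution utilities.
1112
- `locales/en.json` - English UI strings.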
