Commit e674113

Merge pull request #317 from mason5052/codex/issue-314-deepseek-v4-models
fix(deepseek): update default model names to DeepSeek V4
2 parents ea7b415 + 24176c2 commit e674113

8 files changed

Lines changed: 166 additions & 131 deletions

README.md

Lines changed: 16 additions & 11 deletions
@@ -1968,23 +1968,28 @@ DEEPSEEK_SERVER_URL=https://api.deepseek.com
 # With LiteLLM proxy
 DEEPSEEK_API_KEY=your_litellm_key
 DEEPSEEK_SERVER_URL=http://litellm-proxy:4000
-DEEPSEEK_PROVIDER=deepseek # Adds prefix to model names (deepseek/deepseek-chat) for LiteLLM
+DEEPSEEK_PROVIDER=deepseek # Adds prefix to model names (deepseek/deepseek-v4-flash) for LiteLLM
 ```

 #### Supported Models

-PentAGI supports 2 DeepSeek-V3.2 models with tool calling, streaming, thinking modes, and context caching. Both models are used in default configuration.
+PentAGI supports 2 DeepSeek V4 models with tool calling, streaming, thinking modes, and context caching. Models marked with `*` are used in default configuration.

-| Model ID | Thinking | Context | Max Output | Price (Input/Output/Cache) | Use Case |
-| --------------------- | -------- | ------- | ---------- | -------------------------- | ----------------------------------------------- |
-| `deepseek-chat`* | ❌ | 128K | 8K | $0.28/$0.42/$0.03 | General dialogue, code generation, tool calling |
-| `deepseek-reasoner`* | ✅ | 128K | 64K | $0.28/$0.42/$0.03 | Advanced reasoning, complex logic, security analysis |
+| Model ID | Thinking | Context | Price (Input/Output/Cache) | Use Case |
+| --------------------- | -------- | ------- | -------------------------- | ---------------------------------------------------- |
+| `deepseek-v4-flash`* | ❌ | 1M | $0.14/$0.28/$0.0028 | General dialogue, code generation, tool calling |
+| `deepseek-v4-pro`* | ✅ | 1M | $0.435/$0.87/$0.003625 | Advanced reasoning, complex logic, security analysis |

-**Prices**: Per 1M tokens. Cache pricing is for prompt caching (10% of input cost). Models with thinking support include reinforcement learning chain-of-thought reasoning.
+**Prices**: Per 1M tokens. Cache pricing applies to prompt tokens served from cache and is heavily discounted versus input price. Models with thinking support include reinforcement learning chain-of-thought reasoning.
+
+> The legacy model names `deepseek-chat` and `deepseek-reasoner` are scheduled
+> for deprecation by DeepSeek on 2026-07-24. Existing user configurations
+> referencing the legacy names continue to work until then; the defaults above
+> use the current V4 names.

 **Key Features**:
-- **Automatic Prompt Caching**: 40-60% cost reduction on repeated context (10% of input price)
-- **Extended Thinking**: Reinforcement learning CoT for complex security analysis (deepseek-reasoner)
+- **Automatic Prompt Caching**: Significant cost reduction on repeated context via cache-hit pricing far below input price
+- **Extended Thinking**: Reinforcement learning CoT for complex security analysis (deepseek-v4-pro)
 - **Strong Coding**: Optimized for code generation and exploit development
 - **Tool Calling**: Seamless integration with 20+ pentesting tools via function calling
 - **Streaming**: Real-time response streaming for interactive workflows
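The pricing rows in this hunk are quoted per 1M tokens, with cached prompt tokens billed at the separate cache rate. As a rough illustration of that arithmetic, here is a minimal sketch; the `Price`, `Usage`, and `EstimateCost` names are hypothetical and not part of PentAGI, and the token counts are made up for the example.

```go
package main

import "fmt"

// Price holds USD prices per 1M tokens, mirroring the README table columns.
// This type is an illustrative assumption, not a PentAGI type.
type Price struct {
	Input, Output, CacheRead float64
}

// Usage is a hypothetical token breakdown for a single request.
type Usage struct {
	InputTokens  int // prompt tokens billed at the input price (cache misses)
	CachedTokens int // prompt tokens served from cache
	OutputTokens int // generated tokens
}

// EstimateCost returns the estimated USD cost of one request.
func EstimateCost(p Price, u Usage) float64 {
	const perMillion = 1_000_000.0
	total := float64(u.InputTokens)*p.Input +
		float64(u.CachedTokens)*p.CacheRead +
		float64(u.OutputTokens)*p.Output
	return total / perMillion
}

func main() {
	// Prices taken from the deepseek-v4-flash row of the table above.
	flash := Price{Input: 0.14, Output: 0.28, CacheRead: 0.0028}
	u := Usage{InputTokens: 200_000, CachedTokens: 800_000, OutputTokens: 50_000}
	fmt.Printf("$%.5f\n", EstimateCost(flash, u)) // prints $0.04424
}
```

With these example numbers the 800K cached tokens contribute about $0.00224, versus $0.112 if they had been billed at the input price, which is the saving the "Automatic Prompt Caching" bullet refers to.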
@@ -2967,7 +2972,7 @@ With `LLM_SERVER_PROVIDER=moonshot`, the system automatically prefixes all model

 When using LiteLLM proxy, set the corresponding `*_PROVIDER` variable to enable model prefixing:

-- `deepseek` - for DeepSeek models (`DEEPSEEK_PROVIDER=deepseek` → `deepseek/deepseek-chat`)
+- `deepseek` - for DeepSeek models (`DEEPSEEK_PROVIDER=deepseek` → `deepseek/deepseek-v4-flash`)
 - `zai` - for GLM models (`GLM_PROVIDER=zai` → `zai/glm-4`)
 - `moonshot` - for Kimi models (`KIMI_PROVIDER=moonshot` → `moonshot/kimi-k2.5`)
 - `dashscope` - for Qwen models (`QWEN_PROVIDER=dashscope` → `dashscope/qwen-plus`)
@@ -2982,7 +2987,7 @@ When using LiteLLM proxy, set the corresponding `*_PROVIDER` variable to enable
 # Use DeepSeek models via LiteLLM proxy with model prefixing
 DEEPSEEK_API_KEY=your_litellm_proxy_key
 DEEPSEEK_SERVER_URL=http://litellm-proxy:4000
-DEEPSEEK_PROVIDER=deepseek # Models become deepseek/deepseek-chat, deepseek/deepseek-reasoner for LiteLLM
+DEEPSEEK_PROVIDER=deepseek # Models become deepseek/deepseek-v4-flash, deepseek/deepseek-v4-pro for LiteLLM

 # Direct DeepSeek API usage (no prefix needed)
 DEEPSEEK_API_KEY=your_deepseek_api_key
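The README hunks above describe `DEEPSEEK_PROVIDER` as adding a `provider/` prefix to model names for LiteLLM. The sketch below shows one way such prefixing could behave; `PrefixModel` is a hypothetical helper written for this note, not the function PentAGI uses, and skipping names that already carry the prefix is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// PrefixModel is a hypothetical helper: it prepends "<provider>/" to a model
// name when a provider is configured. An empty provider (direct API usage)
// leaves the name unchanged. Skipping already-prefixed names is an assumption.
func PrefixModel(provider, model string) string {
	provider = strings.TrimSpace(provider)
	if provider == "" {
		return model
	}
	prefix := provider + "/"
	if strings.HasPrefix(model, prefix) {
		return model
	}
	return prefix + model
}

func main() {
	fmt.Println(PrefixModel("deepseek", "deepseek-v4-flash")) // deepseek/deepseek-v4-flash
	fmt.Println(PrefixModel("", "deepseek-v4-pro"))           // deepseek-v4-pro
}
```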

backend/cmd/installer/wizard/locale/locale.go

Lines changed: 3 additions & 3 deletions
@@ -495,8 +495,8 @@ Setup options: Local installation from https://10.10.10.10:11434 or cloud regist
 LLMFormDeepSeekHelp = `DeepSeek provides advanced AI models with strong reasoning capabilities and multilingual support.

 Default PentAGI Models:
-DeepSeek-Chat: Flagship model for general-purpose tasks with strong coding and reasoning capabilities
-DeepSeek-Reasoner: Advanced reasoning model for complex security analysis
+deepseek-v4-flash: Cost-efficient general-purpose model for dialogue, code generation, and tool calling
+deepseek-v4-pro: Higher-tier reasoning model for complex logic, mathematical reasoning, and security analysis
 • Cost-effective pricing with competitive performance compared to leading models

 Key Advantages:
@@ -507,7 +507,7 @@ Key Advantages:

 LiteLLM Integration:
 • Set Provider Name to 'deepseek' when using LiteLLM proxy
-• Enables model prefix (e.g., deepseek/deepseek-chat) without modifying config.yml
+• Enables model prefix (e.g., deepseek/deepseek-v4-flash) without modifying config.yml
 • Optional for direct DeepSeek API usage

 Best for: Teams requiring multilingual support, cost-conscious deployments, Chinese language security testing

backend/docs/config.md

Lines changed: 1 addition & 1 deletion
@@ -610,7 +610,7 @@ These settings control the integration with various Large Language Model (LLM) p
 | DeepSeekServerURL | `DEEPSEEK_SERVER_URL` | `https://api.deepseek.com` | DeepSeek API endpoint URL |
 | DeepSeekProvider | `DEEPSEEK_PROVIDER` | *(none)* | Provider name prefix for LiteLLM integration (optional) |

-**LiteLLM Integration**: Set `DEEPSEEK_PROVIDER=deepseek` to enable model prefixing (e.g., `deepseek/deepseek-chat`) when using LiteLLM proxy with default PentAGI configs.
+**LiteLLM Integration**: Set `DEEPSEEK_PROVIDER=deepseek` to enable model prefixing (e.g., `deepseek/deepseek-v4-flash`) when using LiteLLM proxy with default PentAGI configs.

 ### GLM LLM Provider

backend/docs/llms_how_to.md

Lines changed: 1 addition & 1 deletion
@@ -1196,7 +1196,7 @@ llm, _ := openai.New(
 )

 resp, _ := llm.GenerateContent(ctx, messages,
-    llms.WithModel("deepseek-reasoner"),
+    llms.WithModel("deepseek-v4-pro"),
 )

 // Reasoning extracted from <think>...</think> tags automatically
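The context comment in this hunk says reasoning is extracted from `<think>...</think>` tags automatically. The library's own extraction code is not part of this diff, so the following is only a sketch of what such a split could look like; `SplitThinking` is a hypothetical name, and handling only the first tag pair is a simplifying assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// SplitThinking is a hypothetical helper that separates the first
// <think>...</think> block from the rest of a model response.
// It returns the reasoning text and the remaining answer text.
func SplitThinking(content string) (reasoning, answer string) {
	const openTag, closeTag = "<think>", "</think>"

	start := strings.Index(content, openTag)
	if start < 0 {
		// No reasoning block: the whole message is the answer.
		return "", strings.TrimSpace(content)
	}
	rest := content[start+len(openTag):]
	end := strings.Index(rest, closeTag)
	if end < 0 {
		// Unterminated tag: treat the whole message as the answer.
		return "", strings.TrimSpace(content)
	}
	reasoning = strings.TrimSpace(rest[:end])
	answer = strings.TrimSpace(content[:start] + rest[end+len(closeTag):])
	return reasoning, answer
}

func main() {
	r, a := SplitThinking("<think>check open ports first</think>Port 22 is open.")
	fmt.Printf("reasoning=%q answer=%q\n", r, a)
}
```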
Lines changed: 67 additions & 52 deletions
@@ -1,127 +1,142 @@
 simple:
-  model: deepseek-chat
+  model: deepseek-v4-flash
   temperature: 0.5
   top_p: 0.5
   n: 1
   max_tokens: 8192
+  extra_body:
+    thinking:
+      type: disabled
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.14
+    output: 0.28
+    cache_read: 0.0028

 simple_json:
-  model: deepseek-chat
+  model: deepseek-v4-flash
   temperature: 0.5
   top_p: 0.5
   n: 1
   max_tokens: 4096
   json: true
+  extra_body:
+    thinking:
+      type: disabled
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.14
+    output: 0.28
+    cache_read: 0.0028

 primary_agent:
-  model: deepseek-reasoner
+  model: deepseek-v4-pro
   n: 1
   max_tokens: 16384
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.435
+    output: 0.87
+    cache_read: 0.003625

 assistant:
-  model: deepseek-reasoner
+  model: deepseek-v4-pro
   n: 1
   max_tokens: 16384
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.435
+    output: 0.87
+    cache_read: 0.003625

 generator:
-  model: deepseek-reasoner
+  model: deepseek-v4-pro
   n: 1
   max_tokens: 32768
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.435
+    output: 0.87
+    cache_read: 0.003625

 refiner:
-  model: deepseek-reasoner
+  model: deepseek-v4-pro
   n: 1
   max_tokens: 20480
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.435
+    output: 0.87
+    cache_read: 0.003625

 adviser:
-  model: deepseek-chat
+  model: deepseek-v4-flash
   temperature: 0.7
   top_p: 0.8
   n: 1
   max_tokens: 8192
+  extra_body:
+    thinking:
+      type: disabled
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.14
+    output: 0.28
+    cache_read: 0.0028

 reflector:
-  model: deepseek-reasoner
+  model: deepseek-v4-pro
   n: 1
   max_tokens: 4096
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.435
+    output: 0.87
+    cache_read: 0.003625

 searcher:
-  model: deepseek-chat
+  model: deepseek-v4-flash
   temperature: 0.7
   top_p: 0.8
   n: 1
   max_tokens: 4096
+  extra_body:
+    thinking:
+      type: disabled
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.14
+    output: 0.28
+    cache_read: 0.0028

 enricher:
-  model: deepseek-chat
+  model: deepseek-v4-flash
   temperature: 0.7
   top_p: 0.8
   n: 1
   max_tokens: 4096
+  extra_body:
+    thinking:
+      type: disabled
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.14
+    output: 0.28
+    cache_read: 0.0028

 coder:
-  model: deepseek-reasoner
+  model: deepseek-v4-pro
   n: 1
   max_tokens: 20480
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.435
+    output: 0.87
+    cache_read: 0.003625

 installer:
-  model: deepseek-reasoner
+  model: deepseek-v4-pro
   n: 1
   max_tokens: 16384
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.435
+    output: 0.87
+    cache_read: 0.003625

 pentester:
-  model: deepseek-reasoner
+  model: deepseek-v4-pro
   n: 1
   max_tokens: 16384
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.435
+    output: 0.87
+    cache_read: 0.003625
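The flash-based agents in this config gain an `extra_body` entry that sets `thinking.type` to `disabled`. How PentAGI forwards `extra_body` is not shown in this diff; the sketch below assumes the common convention that extra fields are merged into the top level of the JSON request body. `BuildRequestBody` is a hypothetical helper, and the request contains only the fields needed for the illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// BuildRequestBody is a hypothetical helper that merges extra_body entries
// into the top level of a chat request payload. Keys already set by the
// base request are not overwritten (an assumption of this sketch).
func BuildRequestBody(model string, maxTokens int, extraBody map[string]any) ([]byte, error) {
	body := map[string]any{
		"model":      model,
		"max_tokens": maxTokens,
	}
	for k, v := range extraBody {
		if _, exists := body[k]; !exists {
			body[k] = v
		}
	}
	// json.Marshal emits map keys in sorted order, so output is deterministic.
	return json.Marshal(body)
}

func main() {
	// Mirrors the extra_body block added for the "simple" agent above.
	extra := map[string]any{
		"thinking": map[string]any{"type": "disabled"},
	}
	b, err := BuildRequestBody("deepseek-v4-flash", 8192, extra)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
	// {"max_tokens":8192,"model":"deepseek-v4-flash","thinking":{"type":"disabled"}}
}
```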

backend/pkg/providers/deepseek/deepseek.go

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ import (
 //go:embed config.yml models.yml
 var configFS embed.FS

-const DeepSeekAgentModel = "deepseek-chat"
+const DeepSeekAgentModel = "deepseek-v4-flash"

 const DeepSeekToolCallIDTemplate = "call_{r:2:d}_{r:24:b}"

Lines changed: 10 additions & 10 deletions
@@ -1,15 +1,15 @@
-- name: deepseek-chat
-  description: DeepSeek-V3.2 (Non-thinking Mode) - Suitable for general dialogue, code generation, and tool calling tasks. Supports JSON Output, Tool Calls, Chat Prefix Completion, and FIM Completion. 128K context, max output 8K
+- name: deepseek-v4-flash
+  description: DeepSeek V4 Flash - Cost-efficient general-purpose model suitable for dialogue, code generation, and tool calling. Supports JSON output and tool calls. 1M context, up to 384K output tokens.
   thinking: false
   price:
-    input: 0.28
-    output: 0.42
-    cache_read: 0.028
+    input: 0.14
+    output: 0.28
+    cache_read: 0.0028

-- name: deepseek-reasoner
-  description: DeepSeek-V3.2 (Thinking Mode) - Advanced reasoning model with reinforcement learning chain-of-thought capabilities, suitable for complex logic, mathematical reasoning, and security analysis tasks. 128K context, max output 64K
+- name: deepseek-v4-pro
+  description: DeepSeek V4 Pro - Higher-tier reasoning model suitable for complex logic, mathematical reasoning, and security analysis. 1M context, up to 384K output tokens.
   thinking: true
   price:
-    input: 0.435
-    output: 0.87
-    cache_read: 0.003625
+    input: 0.435
+    output: 0.87
+    cache_read: 0.003625
