4 changes: 2 additions & 2 deletions .env.example
@@ -178,10 +178,10 @@ GOOGLE_KEY=user_provided
# GOOGLE_AUTH_HEADER=true

# Gemini API (AI Studio)
# GOOGLE_MODELS=gemini-2.5-pro,gemini-2.5-flash,gemini-2.5-flash-lite,gemini-2.0-flash,gemini-2.0-flash-lite
# GOOGLE_MODELS=gemini-3-flash-preview,gemini-3-pro-preview,gemini-3-pro-image-preview,gemini-2.5-pro,gemini-2.5-flash,gemini-2.5-flash-lite,gemini-2.0-flash,gemini-2.0-flash-lite

# Vertex AI
# GOOGLE_MODELS=gemini-2.5-pro,gemini-2.5-flash,gemini-2.5-flash-lite,gemini-2.0-flash-001,gemini-2.0-flash-lite-001
# GOOGLE_MODELS=gemini-3-flash-preview,gemini-3-pro-preview,gemini-3-pro-image-preview,gemini-2.5-pro,gemini-2.5-flash,gemini-2.5-flash-lite,gemini-2.0-flash-001,gemini-2.0-flash-lite-001

# GOOGLE_TITLE_MODEL=gemini-2.0-flash-lite-001

197 changes: 197 additions & 0 deletions add-kimi-bedrock.md
@@ -0,0 +1,197 @@
# Add AWS Bedrock Kimi K2 Thinking (Moonshot) Support

This document describes the code changes required to add **Moonshot Kimi K2 Thinking** (`moonshot.kimi-k2-thinking`) model support to LibreChat's Bedrock endpoint.

## Model Information

- **Provider**: Moonshot AI
- **Model ID**: `moonshot.kimi-k2-thinking`
- **Context Window**: 256K tokens
- **Recommended Max Output Tokens**: 16,384 or higher (to fit the reasoning chain plus the final output)
- **Recommended Temperature**: 1.0
- **AWS Regions**: `ap-northeast-1`, `ap-south-1`, and more

## Files to Modify

### 1. `packages/data-provider/src/schemas.ts`

**Location**: Around line 98-100, inside the `BedrockProviders` enum.

**Change**: Add `Moonshot = 'moonshot'` to the enum.

**Before**:
```typescript
export enum BedrockProviders {
  AI21 = 'ai21',
  Amazon = 'amazon',
  Anthropic = 'anthropic',
  Cohere = 'cohere',
  Meta = 'meta',
  MistralAI = 'mistral',
  StabilityAI = 'stability',
  DeepSeek = 'deepseek',
}
```

**After**:
```typescript
export enum BedrockProviders {
  AI21 = 'ai21',
  Amazon = 'amazon',
  Anthropic = 'anthropic',
  Cohere = 'cohere',
  Meta = 'meta',
  MistralAI = 'mistral',
  Moonshot = 'moonshot',
  StabilityAI = 'stability',
  DeepSeek = 'deepseek',
}
```
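The enum value matters because LibreChat keys Bedrock behavior off the provider segment of the model ID. As a rough illustration (the helper name below is hypothetical, not LibreChat's actual function), the segment before the first dot in `moonshot.kimi-k2-thinking` is what must match the new enum entry:

```typescript
// Hypothetical helper: derive the Bedrock provider key from a model ID.
// Note: cross-region inference IDs (e.g. 'us.anthropic.claude-...') carry a
// region prefix, which this simple sketch does not handle.
function getProviderFromModelId(modelId: string): string {
  // 'moonshot.kimi-k2-thinking' -> 'moonshot'
  return modelId.split('.')[0];
}

console.log(getProviderFromModelId('moonshot.kimi-k2-thinking')); // 'moonshot'
console.log(getProviderFromModelId('anthropic.claude-3-5-sonnet-20240620-v1:0')); // 'anthropic'
```

With `Moonshot = 'moonshot'` in the enum, this derived key lines up with the `paramSettings` and `presetSettings` registrations added below in section 2.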

---

### 2. `packages/data-provider/src/parameterSettings.ts`

**Change 1**: Add Moonshot configuration arrays after `bedrockGeneralCol2` (around line 880).

**Insert this code block after the `bedrockGeneralCol2` definition**:

```typescript
const bedrockMoonshot: SettingsConfiguration = [
  librechat.modelLabel,
  bedrock.system,
  librechat.maxContextTokens,
  createDefinition(bedrock.maxTokens, {
    default: 16384,
  }),
  bedrock.temperature,
  bedrock.topP,
  baseDefinitions.stop,
  librechat.resendFiles,
  bedrock.region,
  librechat.fileTokenLimit,
];

const bedrockMoonshotCol1: SettingsConfiguration = [
  baseDefinitions.model as SettingDefinition,
  librechat.modelLabel,
  bedrock.system,
  baseDefinitions.stop,
];

const bedrockMoonshotCol2: SettingsConfiguration = [
  librechat.maxContextTokens,
  createDefinition(bedrock.maxTokens, {
    default: 16384,
  }),
  bedrock.temperature,
  bedrock.topP,
  librechat.resendFiles,
  bedrock.region,
  librechat.fileTokenLimit,
];
```

---

**Change 2**: Register in `paramSettings` object.

**Location**: Inside `export const paramSettings = { ... }`, after the `DeepSeek` entry.

**Add this line**:
```typescript
[`${EModelEndpoint.bedrock}-${BedrockProviders.Moonshot}`]: bedrockMoonshot,
```

**Context** (showing surrounding lines):
```typescript
[`${EModelEndpoint.bedrock}-${BedrockProviders.AI21}`]: bedrockGeneral,
[`${EModelEndpoint.bedrock}-${BedrockProviders.Amazon}`]: bedrockGeneral,
[`${EModelEndpoint.bedrock}-${BedrockProviders.DeepSeek}`]: bedrockGeneral,
[`${EModelEndpoint.bedrock}-${BedrockProviders.Moonshot}`]: bedrockMoonshot, // <-- ADD THIS
[EModelEndpoint.google]: googleConfig,
```
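For intuition, the computed key in the registration above resolves to a plain string, so the lookup is just string concatenation. A minimal sketch (the two constants below stand in for the real `EModelEndpoint` and `BedrockProviders` imports):

```typescript
// Stand-ins for the real enums, for illustration only.
const EModelEndpoint = { bedrock: 'bedrock' } as const;
const BedrockProviders = { Moonshot: 'moonshot' } as const;

// The computed property key used in paramSettings / presetSettings.
const key = `${EModelEndpoint.bedrock}-${BedrockProviders.Moonshot}`;
console.log(key); // 'bedrock-moonshot'
```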

---

**Change 3**: Register in `presetSettings` object.

**Location**: Inside `export const presetSettings = { ... }`, after the `DeepSeek` entry.

**Add this block**:
```typescript
[`${EModelEndpoint.bedrock}-${BedrockProviders.Moonshot}`]: {
  col1: bedrockMoonshotCol1,
  col2: bedrockMoonshotCol2,
},
```

**Context** (showing surrounding lines):
```typescript
[`${EModelEndpoint.bedrock}-${BedrockProviders.AI21}`]: bedrockGeneralColumns,
[`${EModelEndpoint.bedrock}-${BedrockProviders.Amazon}`]: bedrockGeneralColumns,
[`${EModelEndpoint.bedrock}-${BedrockProviders.DeepSeek}`]: bedrockGeneralColumns,
[`${EModelEndpoint.bedrock}-${BedrockProviders.Moonshot}`]: { // <-- ADD THIS BLOCK
  col1: bedrockMoonshotCol1,
  col2: bedrockMoonshotCol2,
},
[EModelEndpoint.google]: {
  col1: googleCol1,
  col2: googleCol2,
},
```

---

### 3. `packages/data-provider/src/bedrock.ts`

**Location**: Around line 137-140, inside the `bedrockInputParser` transform function.

**Change**: Wrap the `anthropic_beta` assignment in a conditional to only apply to Anthropic models.

**Before**:
```typescript
  if (additionalFields.thinking === true && additionalFields.thinkingBudget === undefined) {
    additionalFields.thinkingBudget = 2000;
  }
  additionalFields.anthropic_beta = ['output-128k-2025-02-19'];
} else if (additionalFields.thinking != null || additionalFields.thinkingBudget != null) {
```

**After**:
```typescript
  if (additionalFields.thinking === true && additionalFields.thinkingBudget === undefined) {
    additionalFields.thinkingBudget = 2000;
  }
  // Only add anthropic_beta for Anthropic models
  if (typedData.model.includes('anthropic.')) {
    additionalFields.anthropic_beta = ['output-128k-2025-02-19'];
  }
} else if (additionalFields.thinking != null || additionalFields.thinkingBudget != null) {
```
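The guard can be exercised in isolation to confirm the intended behavior: Anthropic models still receive the beta flag, while Moonshot (and other providers) do not. The standalone function and field names below are a sketch for illustration; the real logic lives inside `bedrockInputParser`:

```typescript
// Sketch of the guarded logic above, extracted into a testable function.
interface AdditionalFields {
  thinking?: boolean;
  thinkingBudget?: number;
  anthropic_beta?: string[];
}

function applyThinkingDefaults(model: string, fields: AdditionalFields): AdditionalFields {
  if (fields.thinking === true && fields.thinkingBudget === undefined) {
    fields.thinkingBudget = 2000;
  }
  // Only Anthropic models understand the anthropic_beta flag; sending it to
  // Moonshot (or any other provider) would be an invalid request field.
  if (model.includes('anthropic.')) {
    fields.anthropic_beta = ['output-128k-2025-02-19'];
  }
  return fields;
}

const claude = applyThinkingDefaults('anthropic.claude-sonnet-4', { thinking: true });
console.log(claude.thinkingBudget); // 2000
console.log(claude.anthropic_beta); // ['output-128k-2025-02-19']

const kimi = applyThinkingDefaults('moonshot.kimi-k2-thinking', { thinking: true });
console.log(kimi.anthropic_beta); // undefined
```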

---

## Build Commands

After making changes, rebuild the packages:

```bash
npm run build:packages
```

---

## Notes

- **Kimi K2 Thinking** handles reasoning internally via `reasoning_content`; there is no `thinking` toggle as with Anthropic Claude models.
- **Prompt caching** is not yet supported on Bedrock for Kimi (currently only Claude and Nova models support it).
- **topK** is not supported by Kimi's API.
- The `max_tokens` default is set to 16,384 to accommodate the model's reasoning chain plus its final output.
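
Since there is no `thinking` toggle, a consumer simply separates the reasoning trace from the answer when a response includes `reasoning_content`. The response shape below is an assumption based on Moonshot's API field name; the exact Bedrock envelope may differ:

```typescript
// Hedged sketch: splitting Kimi's reasoning trace from its final answer.
interface KimiMessage {
  content: string;
  reasoning_content?: string; // present when the model emitted reasoning
}

function splitReasoning(msg: KimiMessage): { reasoning: string | null; answer: string } {
  return {
    reasoning: msg.reasoning_content ?? null,
    answer: msg.content,
  };
}

const parsed = splitReasoning({
  content: 'The answer is 4.',
  reasoning_content: 'The user asked 2 + 2; adding gives 4.',
});
console.log(parsed.answer); // 'The answer is 4.'
```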

---

## References

- [AWS Bedrock Supported Models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html)
- [Moonshot Kimi K2 API Documentation](https://platform.moonshot.ai/docs/api/chat)
7 changes: 3 additions & 4 deletions api/models/tx.js
@@ -150,9 +150,6 @@ const tokenValues = Object.assign(
'gemma-2': { prompt: 0.01, completion: 0.03 }, // Base pattern (using gemma-2-9b pricing)
'gemma-3': { prompt: 0.02, completion: 0.04 }, // Base pattern (using gemma-3n-e4b pricing)
'gemma-3-27b': { prompt: 0.09, completion: 0.16 },
'gemini-1.5': { prompt: 2.5, completion: 10 },
'gemini-1.5-flash': { prompt: 0.15, completion: 0.6 },
'gemini-1.5-flash-8b': { prompt: 0.075, completion: 0.3 },
'gemini-2.0': { prompt: 0.1, completion: 0.4 }, // Base pattern (using 2.0-flash pricing)
'gemini-2.0-flash': { prompt: 0.1, completion: 0.4 },
'gemini-2.0-flash-lite': { prompt: 0.075, completion: 0.3 },
@@ -162,8 +159,10 @@ const tokenValues = Object.assign(
'gemini-2.5-pro': { prompt: 1.25, completion: 10 },
'gemini-2.5-flash-image': { prompt: 0.15, completion: 30 },
'gemini-3': { prompt: 2, completion: 12 },
'gemini-3-flash-preview': { prompt: 0.5, completion: 3 },
'gemini-3-pro-preview': { prompt: 2, completion: 12 },
'gemini-3-pro-image-preview': { prompt: 2, completion: 120 },
'gemini-3-pro-image': { prompt: 2, completion: 120 },
'gemini-pro-vision': { prompt: 0.5, completion: 1.5 },
grok: { prompt: 2.0, completion: 10.0 }, // Base pattern defaults to grok-2
'grok-beta': { prompt: 5.0, completion: 15.0 },
'grok-vision-beta': { prompt: 5.0, completion: 15.0 },
14 changes: 2 additions & 12 deletions api/models/tx.spec.js
@@ -1164,11 +1164,6 @@ describe('Google Model Tests', () => {
'gemini-2.0-flash-001',
'gemini-2.0-flash-exp',
'gemini-2.0-pro-exp-02-05',
'gemini-1.5-flash-8b',
'gemini-1.5-flash-thinking',
'gemini-1.5-pro-latest',
'gemini-1.5-pro-preview-0409',
'gemini-pro-vision',
'gemini-1.0',
'gemini-pro',
];
@@ -1208,11 +1203,6 @@ describe('Google Model Tests', () => {
'gemini-2.0-flash-001': 'gemini-2.0-flash',
'gemini-2.0-flash-exp': 'gemini-2.0-flash',
'gemini-2.0-pro-exp-02-05': 'gemini-2.0',
'gemini-1.5-flash-8b': 'gemini-1.5-flash-8b',
'gemini-1.5-flash-thinking': 'gemini-1.5-flash',
'gemini-1.5-pro-latest': 'gemini-1.5',
'gemini-1.5-pro-preview-0409': 'gemini-1.5',
'gemini-pro-vision': 'gemini-pro-vision',
'gemini-1.0': 'gemini',
'gemini-pro': 'gemini',
};
@@ -1225,8 +1215,8 @@

it('should handle model names with different formats', () => {
const testCases = [
{ input: 'google/gemini-pro', expected: 'gemini' },
{ input: 'gemini-pro/google', expected: 'gemini' },
{ input: 'google/gemini-2.5-flash', expected: 'gemini-2.5-flash' },
{ input: 'gemini-2.5-flash/google', expected: 'gemini-2.5-flash' },
{ input: 'google/gemini-2.0-flash-lite', expected: 'gemini-2.0-flash-lite' },
];

16 changes: 1 addition & 15 deletions api/utils/tokens.spec.js
@@ -263,18 +263,6 @@ describe('getModelMaxTokens', () => {
expect(getModelMaxTokens('gemini-2.0-pro-exp-02-05', EModelEndpoint.google)).toBe(
maxTokensMap[EModelEndpoint.google]['gemini-2.0'],
);
expect(getModelMaxTokens('gemini-1.5-flash-8b', EModelEndpoint.google)).toBe(
maxTokensMap[EModelEndpoint.google]['gemini-1.5-flash-8b'],
);
expect(getModelMaxTokens('gemini-1.5-flash-thinking', EModelEndpoint.google)).toBe(
maxTokensMap[EModelEndpoint.google]['gemini-1.5-flash'],
);
expect(getModelMaxTokens('gemini-1.5-pro-latest', EModelEndpoint.google)).toBe(
maxTokensMap[EModelEndpoint.google]['gemini-1.5'],
);
expect(getModelMaxTokens('gemini-1.5-pro-preview-0409', EModelEndpoint.google)).toBe(
maxTokensMap[EModelEndpoint.google]['gemini-1.5'],
);
expect(getModelMaxTokens('gemini-3', EModelEndpoint.google)).toBe(
maxTokensMap[EModelEndpoint.google]['gemini-3'],
);
@@ -287,9 +275,7 @@
expect(getModelMaxTokens('gemini-2.5-flash-lite', EModelEndpoint.google)).toBe(
maxTokensMap[EModelEndpoint.google]['gemini-2.5-flash-lite'],
);
expect(getModelMaxTokens('gemini-pro-vision', EModelEndpoint.google)).toBe(
maxTokensMap[EModelEndpoint.google]['gemini-pro-vision'],
);
// Test fallback patterns for unknown variants
expect(getModelMaxTokens('gemini-1.0', EModelEndpoint.google)).toBe(
maxTokensMap[EModelEndpoint.google]['gemini'],
);
14 changes: 7 additions & 7 deletions client/src/utils/createChatSearchParams.spec.ts
@@ -215,27 +215,27 @@ describe('createChatSearchParams', () => {
it('handles float parameter values correctly', () => {
const result = createChatSearchParams({
endpoint: EModelEndpoint.google,
model: 'gemini-pro',
model: 'gemini-2.5-flash',
frequency_penalty: 0.25,
temperature: 0.75,
});

expect(result.get('endpoint')).toBe(EModelEndpoint.google);
expect(result.get('model')).toBe('gemini-pro');
expect(result.get('model')).toBe('gemini-2.5-flash');
expect(result.get('frequency_penalty')).toBe('0.25');
expect(result.get('temperature')).toBe('0.75');
});

it('handles integer parameter values correctly', () => {
const result = createChatSearchParams({
endpoint: EModelEndpoint.google,
model: 'gemini-pro',
model: 'gemini-2.5-flash',
topK: 40,
maxOutputTokens: 2048,
});

expect(result.get('endpoint')).toBe(EModelEndpoint.google);
expect(result.get('model')).toBe('gemini-pro');
expect(result.get('model')).toBe('gemini-2.5-flash');
expect(result.get('topK')).toBe('40');
expect(result.get('maxOutputTokens')).toBe('2048');
});
@@ -245,22 +245,22 @@
it('handles preset objects correctly', () => {
const preset: Partial<TPreset> = {
endpoint: EModelEndpoint.google,
model: 'gemini-pro',
model: 'gemini-2.5-flash',
temperature: 0.5,
topP: 0.8,
};

const result = createChatSearchParams(preset as TPreset);
expect(result.get('endpoint')).toBe(EModelEndpoint.google);
expect(result.get('model')).toBe('gemini-pro');
expect(result.get('model')).toBe('gemini-2.5-flash');
expect(result.get('temperature')).toBe('0.5');
expect(result.get('topP')).toBe('0.8');
});

it('returns only spec param when spec property is present', () => {
const preset: Partial<TPreset> = {
endpoint: EModelEndpoint.google,
model: 'gemini-pro',
model: 'gemini-2.5-flash',
temperature: 0.5,
spec: 'special_spec',
};