chore(*): cumulative AI GW fixes 20251028 #14785
Open
fffonion wants to merge 54 commits into master from aigw-393
+6,282 −1,087
Conversation
Increase ai prompt message max length from 500 to 100000. AG-287 Signed-off-by: hackerchai <[email protected]>
…est_body_table_inuse for fixing user defined fields missing
- Add should_set_body parameter to control request body setting
- Update prompt decorator to use new parameter

test(ai-prompt-decorator): add test for preserving model and temperature fields
- Add test case for full chat request
- Verify model and temperature preservation

test(ai-prompt-decorator): add integration test for preserving model and temperature fields
- Add test case for openai_full_chat configuration
- Verify model, temperature and max_tokens preservation
- Check message decoration and context setting

doc(changelog): add fix_ai_prompt_decorator_missing_fields changelog
doc(changelog): use correct type of changelog & polish message

Signed-off-by: Eason Chai <[email protected]>
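The field-preservation fix described above can be illustrated with a minimal, hypothetical sketch: the decorator mutates only the `messages` field in place instead of rebuilding the request body, so user-defined fields such as `model` and `temperature` survive. The function name and data shapes below are illustrative, not Kong's actual API.

```python
def decorate_prompt(body: dict, prepend: list, append: list) -> dict:
    """Add decorator messages around the user's messages, leaving
    every other field of the request body untouched.

    Illustrative sketch only; not Kong's implementation."""
    body["messages"] = prepend + body.get("messages", []) + append
    return body

request = {
    "model": "gpt-4o",
    "temperature": 0.2,
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hi"}],
}
decorated = decorate_prompt(
    request,
    prepend=[{"role": "system", "content": "Be terse."}],
    append=[],
)
# model, temperature and max_tokens are preserved alongside the
# decorated message list
assert decorated["model"] == "gpt-4o"
assert decorated["temperature"] == 0.2
assert decorated["max_tokens"] == 256
assert decorated["messages"][0]["role"] == "system"
```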
Force-pushed from e6eb4cd to 95d2fa3
From #14500 AG-309 Signed-off-by: spacewander <[email protected]> Signed-off-by: Zexuan Luo <[email protected]> Co-authored-by: spacewander <[email protected]> Co-authored-by: Zexuan Luo <[email protected]> Co-authored-by: Jun Ouyang <[email protected]>
Previously, stale SSE events were not dropped, which caused a repeated body (like `The answer to 1 + 1 is 2.The answer to 1 + 1 is 2.`) in observability data. Signed-off-by: Zexuan Luo <[email protected]>
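A minimal sketch of the kind of bug described, assuming the stream parser kept already-forwarded events in its buffer; the helper below is illustrative, not Kong's code. Consuming each complete SSE event out of the buffer ensures it is forwarded exactly once, so the observed body is not duplicated:

```python
def parse_events(buf: str):
    """Split off complete SSE events (terminated by a blank line) and
    return them together with the unconsumed remainder.

    Illustrative sketch only; not Kong's implementation."""
    events = []
    while "\n\n" in buf:
        event, buf = buf.split("\n\n", 1)
        events.append(event)
    return events, buf  # a partial trailing event stays buffered

buf = ""
seen = []
for chunk in ["data: The answer to 1 + 1 is 2.\n\n", "data: [DONE]\n\n"]:
    buf += chunk
    events, buf = parse_events(buf)
    seen.extend(events)  # forward each event exactly once

assert seen == ["data: The answer to 1 + 1 is 2.", "data: [DONE]"]
assert buf == ""
```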
AG-329 --------- Signed-off-by: Zexuan Luo <[email protected]>
…ent was truncated (#13452) The previous fix (#13315) missed a branch, which was caught by more careful fuzzing. Signed-off-by: Zexuan Luo <[email protected]> AG-385
…e was incomplete (#13430) FTI-6842 --------- Signed-off-by: Zexuan Luo <[email protected]>
…re events (#13588) AG-401 This affects Gemini streaming chunk parsing and OpenAI's /v1/files route. When using an iterator in a `for` loop, the loop terminates as soon as the iterator returns nil, which causes the missing state update, as shown by the code below:
```
local function itertool(x)
  local i = 0
  return function()
    i = i + 1
    if i <= #x then
      return x[i]
    end
  end
end

local function main()
  local x = {1, nil, 3, 4, 5}
  -- stops at the first nil, so 3, 4 and 5 are never printed
  for v in itertool(x) do
    print(v)
  end
end

local function better_main()
  local x = {1, nil, 3, 4, 5}
  local iter = itertool(x)
  local eos = 5
  local count = 0
  while true do
    count = count + 1
    if count > eos then
      break
    end
    local v = iter()
    if v ~= nil then
      print(v)
    end
  end
end

main()
print("Fix it")
better_main()
```
This PR also:
1. Fixes an incorrect delimiter skipping
2. Supports using `\r` as line separator
--------- Signed-off-by: Zexuan Luo <[email protected]>
…thropic provider (#13355) AG-391 --------- Signed-off-by: Zexuan Luo <[email protected]>
…om Gemini provider in some situations Signed-off-by: Zexuan Luo <[email protected]>
…ar used as model name
…ing in llm/v1/chat
…gw-only] (#14137) A "floor" is set, and prompts must then abide by specific rulesets (e.g. hate, violence) or they will be blocked. Kong was not correctly handling a "bad" or "blocked" response from GCP; this PR makes that work. With this patch, the user no longer gets a 500 'an error occurred' and instead gets a 400.
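A hedged sketch of the status mapping described above, assuming the blocked upstream response carries Gemini's `promptFeedback.blockReason` field; the handler name and response shapes are illustrative, not Kong's internals:

```python
def map_upstream_response(upstream: dict):
    """Translate a Gemini response into (status, body).

    Illustrative sketch: a response blocked by a Model Armor floor
    policy is a client-side problem, so surface it as a 400 with the
    block reason rather than failing with a generic 500."""
    feedback = upstream.get("promptFeedback", {})
    reason = feedback.get("blockReason")
    if reason:
        return 400, {"error": f"prompt blocked: {reason}"}
    return 200, upstream

# blocked prompt -> 400 with the reason, not a 500
status, body = map_upstream_response(
    {"promptFeedback": {"blockReason": "SAFETY"}}
)
assert status == 400
assert "SAFETY" in body["error"]

# normal response passes through untouched
status, body = map_upstream_response({"candidates": []})
assert status == 200
```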
Labels
cherry-pick kong-ee
schedule this PR for cherry-picking to kong/kong-ee
core/pdk
plugins/ai-prompt-decorator
plugins/ai-proxy
plugins/ai-request-transformer
plugins/ai-response-transformer
schema-change-noteworthy
size/XXL
Summary
Checklist
changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary (README.md)

Issue reference
AG-532