✨ feat: Added Support for Flagship Gemini Image (🍌 Nano Banana) models for Image generation and editing #9538
base: dev
falkenbt
left a comment
Really nice, I got this to work on my machine. There is just a small issue with the credentials path, see inline comment.
* Refactored the credentials path to follow a consistent pattern with other Google service integrations, allowing for an environment variable override.
* Updated documentation in README-GeminiNanoBanana.md to reflect the new credentials handling approach and removed references to hardcoded paths.
@falkenbt The issues are addressed. Thank you for the comments.
It might only be my local configuration, but for me it says the image is generated, but it then does not show. I also can't find it anywhere on the local disk (using the local storage strategy). Do you have any pointers on how I could debug this further? Thanks a lot for contributing this, btw - eagerly awaiting it :)
A future related feature request after this ships: Nano Banana is supported by OpenRouter, and it would be great to support Nano Banana via OpenRouter in addition to the Vertex API.
Could you please share the logs that appear in the librechat container when you try to generate an image?
@devilb2103 thanks for that trivial hint - sorry I hadn't checked that 🤦
falkenbt
left a comment
Looks good to me now, thanks a lot. I cannot approve though, I'm just a user without elevated access.
@danny-avila any chance this could make it into the 0.80 release? This is really useful for a lot of people.
hey @danny-avila do let me know what can be improved here so that this contribution can be merged without much effort from the team 😄
Why is the Vertex AI API used here? This also works with the Gemini API:
curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme"}
      ]
    }]
  }' \
  | grep -o '"data": "[^"]*"' \
  | cut -d'"' -f4 \
  | base64 --decode > gemini-native-image.png
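The `grep`/`cut` pipeline in the command above pulls the base64 payload out of the JSON response by pattern matching. A structured version of the same step might look like the sketch below. It assumes the response carries images under `candidates[].content.parts[].inlineData`; the interface names and the `extractInlineImages` helper are hypothetical and are not taken from this PR.

```typescript
// Assumed slice of a generateContent response; not an exhaustive type.
interface InlineData {
  mimeType?: string;
  data?: string; // base64-encoded image bytes
}
interface Part {
  text?: string;
  inlineData?: InlineData;
}
interface GenerateContentResponse {
  candidates?: { content?: { parts?: Part[] } }[];
}

interface ExtractedImage {
  mimeType: string;
  data: string;
}

// Hypothetical helper: collect every inline image, keeping the MIME type
// next to its base64 data. Text parts are skipped.
function extractInlineImages(response: GenerateContentResponse): ExtractedImage[] {
  const images: ExtractedImage[] = [];
  for (const candidate of response.candidates ?? []) {
    for (const part of candidate.content?.parts ?? []) {
      const data = part.inlineData?.data;
      if (data) {
        // Falling back to image/png is an assumption of this sketch.
        images.push({ mimeType: part.inlineData?.mimeType || 'image/png', data });
      }
    }
  }
  return images;
}
```

Unlike the shell pipeline, this keeps the MIME type alongside the data and does not depend on how the JSON happens to be formatted.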
It should be fairly easy to add another LibreChat manifest variable to switch between the already supported Vertex AI option and the Gemini API. This implementation makes it convenient for those who have access to Google's Vertex AI platform.
@danny-avila is this on your radar? It would be great to have an indication of when this could be merged and released; from the reactions here and my surroundings, a lot of people are eagerly awaiting this. Thanks!
We would love to see this in the next release. I have a ton of users calling for it. I tried making an action for this using an OpenAPI spec, and it fails because it passes back base64 in a format that it can't decode, so this would be great as a tool.
@danny-avila hate to annoy you but once again asking if you could accept this PR into main plz since there is huge demand for nano banana in our enterprise
@danny-avila Just wanted to reiterate the importance of this feature. Adding native support for Gemini 2.5 is a major value-add for the platform. The PR is stable, tested, and ready to go. Merging this now would greatly benefit the user base who are waiting to utilize these image generation tools. Hoping to see this live soon!
It must be discouraging for contributors to submit pull requests and then be ignored.
@lukaswelte Could you please tell me the exact problem with IAM permissions? I have the Vertex AI Admin role and I'm still experiencing the same problem. @devilb2103 Thank you for the great job! I'd just like to ask whether this is normal or not: `warn: [GeminiImageGen] Missing required parameters for storage, falling back to data URL`
@RepLicanT-UHD in my case I literally didn't have the role on the service account that I gave to librechat. Are you sure it picks up the service-account.json in your case?
Thanks for your response! Yes it does. When I select any model other than 2.5 Flash for the Agent to execute requests via the 'gemini_image_gen' tool, the errors disappear (surprisingly, o4-mini handles this role particularly well). The only warnings remaining are `warn: [GeminiImageGen] Missing required parameters for storage, falling back to data URL` and `warn: [GeminiImageGen] Could not save to storage, using data URL`. Meanwhile, my File Strategy is set to local, and images are displayed and edited correctly. However, as soon as I switch back to 2.5 Flash with any combination of reasoning and input/output tokens, it claims inside the thought process that the image was generated and shown to the user, even though that's not the case.
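The two warnings quoted above suggest a fallback path: when the image cannot be written through the configured file strategy, the tool returns a data URL instead. A minimal sketch of that pattern follows. The `SaveFn` type and `resolveImageUrl` function are hypothetical and do not mirror the PR's actual code; only the warning texts are taken from the logs in this thread.

```typescript
// Hypothetical storage callback: saves the image and resolves to its URL or path.
type SaveFn = (base64: string, mimeType: string) => Promise<string>;

// Build an RFC 2397 data URL from base64 image bytes.
function toDataUrl(base64: string, mimeType: string): string {
  return `data:${mimeType};base64,${base64}`;
}

// Try the storage backend first; fall back to a data URL when no backend
// is available or when saving fails. The fallback logic is an assumption.
async function resolveImageUrl(
  base64: string,
  mimeType: string,
  save?: SaveFn,
  warn: (message: string) => void = () => {},
): Promise<string> {
  if (!save) {
    warn('[GeminiImageGen] Missing required parameters for storage, falling back to data URL');
    return toDataUrl(base64, mimeType);
  }
  try {
    return await save(base64, mimeType);
  } catch {
    warn('[GeminiImageGen] Could not save to storage, using data URL');
    return toDataUrl(base64, mimeType);
  }
}
```

Under this reading the warnings are not fatal: the image still reaches the client, just inline rather than as a stored file. That would be consistent with images displaying correctly while the warnings persist.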
Does it support Gemini 3 Pro Image (aka Nano Banana Pro) already? Would be awesome for admins to choose in env or librechat.yaml the model or have a dropdown in the tool UI: https://docs.cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/3-pro-image |
Pull request overview
This PR adds a new image generation tool that integrates Google's Gemini 2.5 Flash Image Model with Vertex AI, supporting both text-to-image generation and image context-aware editing. The implementation follows existing LibreChat patterns for image generation tools and includes comprehensive safety filtering and multi-storage strategy support.
Key Changes
- Added Gemini image generation tool with support for text prompts and image context editing
- Integrated with existing LibreChat file storage strategies (local, S3, Azure, Firebase)
- Implemented safety filtering with user-friendly error messages for content policy violations
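The safety-filtering item above can be illustrated with a short sketch that turns a blocked response into a readable message. It assumes the response exposes `promptFeedback.blockReason` and `candidates[].finishReason` as in the public Gemini API. The set of blocking reasons and the message wording are assumptions of this example, not the PR's implementation.

```typescript
// Assumed slice of a generateContent response relevant to safety checks.
interface SafetyCheckResponse {
  promptFeedback?: { blockReason?: string };
  candidates?: { finishReason?: string }[];
}

// Finish reasons treated as policy blocks in this sketch (an assumption).
const BLOCKING_FINISH_REASONS: ReadonlySet<string> = new Set([
  'SAFETY',
  'PROHIBITED_CONTENT',
  'IMAGE_SAFETY',
  'BLOCKLIST',
]);

// Hypothetical helper: returns a user-facing message when the request was
// blocked, or undefined when the response looks usable.
function describeSafetyBlock(response: SafetyCheckResponse): string | undefined {
  const blockReason = response.promptFeedback?.blockReason;
  if (blockReason) {
    return `The prompt was blocked by the content policy (${blockReason}). Please rephrase the request.`;
  }
  const finishReason = response.candidates?.[0]?.finishReason;
  if (finishReason && BLOCKING_FINISH_REASONS.has(finishReason)) {
    return `The image was withheld by the safety filter (${finishReason}). Please adjust the prompt.`;
  }
  return undefined;
}
```

Returning a message instead of throwing lets the agent pass the explanation back to the user as an ordinary tool result.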
Reviewed changes
Copilot reviewed 8 out of 10 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| api/app/clients/tools/structured/GeminiImageGen.js | Core implementation of Gemini image generation tool with image context support and safety filtering |
| api/package.json | Added @google/genai dependency for Vertex AI integration |
| package-lock.json | Locked dependency versions including @google/genai and unrelated dicebear changes |
| api/app/clients/tools/index.js | Exported GeminiImageGen tool for registration |
| api/app/clients/tools/manifest.json | Added Gemini Image Tools manifest entry |
| api/app/clients/tools/util/handleTools.js | Registered tool constructor with custom initialization logic and unrelated serpapi configuration |
| api/server/services/ToolService.js | Added commented-out import (cleanup needed) |
| packages/data-provider/src/config.ts | Added gemini_image_gen to image generation tools set |
| api/app/clients/tools/structured/README-GeminiNanoBanana.md | Comprehensive documentation for setup and usage |
I understand the frustration with review delays. As maintainers, we need to balance responsiveness with ensuring thorough reviews, since incomplete features can create significant long-term maintenance burden. I'm reviewing this PR now and will provide detailed feedback shortly. To start, please see the GitHub Copilot reviews, which are actually worth considering.
This is a "must implement" for merging and corresponding documentation is required as well in https://github.com/LibreChat-AI/librechat.ai
@danny-avila Thanks for the comments ✌️.
I can take a shot at it this week if you add me
@devilb2103 @danny-avila
…n with SVG for Gemini Image Tools
- Updated the @google/genai dependency in package-lock.json and package.json to version 1.19.0.
- Enhanced the Gemini Image Tools description and changed the icon from PNG to SVG for better scalability and design.
- Refactored GeminiImageGen.js to support provider detection and model ID configuration for improved flexibility in image generation.
- Removed deprecated references and cleaned up the handleTools.js file by eliminating unnecessary parameters.

- Added a new utility function `createImageToolContext` to streamline the creation of context strings for image generation and editing tools.
- Refactored `handleTools.js` to utilize the new function, improving code readability and maintainability.
- Created a new file `imageContext.ts` to define the interface and implementation for the image tool context functionality.
- Updated the index file to export the new image context utilities.

…e related files
- Migrated the `GeminiImageGen` tool implementation from CommonJS to TS.
- Updated the `index.js` `GeminiImageGen` references.
- Enhanced the README documentation for the Gemini Image Generation Tool, reflecting the new structure and features.
- Adjusted the `handleTools.js` and `tools.js` files to ensure proper integration of remaining tools and maintain functionality.
- Added constants and types for the Gemini tool to improve code organization and clarity.

- Added support for the `gemini_image_gen` tool in the message rendering logic.
- Updated progress text handling for `gemini_image_gen` to provide detailed status messages during image creation.
- Enhanced user feedback with localized messages reflecting the progress of image generation.

- Deleted the README-GeminiNanoBanana.md file as it is no longer relevant to the current implementation of the Gemini image generation tool.
Pull request overview
Copilot reviewed 15 out of 17 changed files in this pull request and generated 5 comments.
…ageGen
- Enhanced auto-detection logic to prefer Vertex AI only if the GOOGLE_SERVICE_KEY_FILE exists and is valid.
- Updated the credentials path to use the current working directory for the default auth.json file.
- Added error handling for malformed JSON in the Google service account credentials file, providing clearer error messages.

- Updated the method for retrieving the credentials file path to streamline the logic and remove unnecessary environment variable checks.
- Adjusted the default path for the auth.json file to reflect the current working directory more accurately.
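The auto-detection described in these commit messages can be sketched roughly as follows. The environment variable names come from the PR description. The function names, the injected `readKeyFile` callback, and the exact precedence are assumptions made for illustration, and the sketch leaves out the default auth.json path handling.

```typescript
type Provider = 'vertex' | 'gemini';
type Env = Record<string, string | undefined>;

// A key file counts as valid here if it parses as a JSON object.
// Malformed JSON is treated as "no usable credentials", not as a crash.
function isValidServiceKey(raw: string | undefined): boolean {
  if (raw === undefined) return false;
  try {
    const parsed: unknown = JSON.parse(raw);
    return typeof parsed === 'object' && parsed !== null;
  } catch {
    return false;
  }
}

// Hypothetical detection order: explicit setting, then a valid service key
// file (Vertex AI), then an API key (Gemini API).
function detectProvider(
  env: Env,
  readKeyFile: (path: string) => string | undefined,
): Provider | undefined {
  const explicit = env['GEMINI_IMAGE_PROVIDER'];
  if (explicit === 'vertex' || explicit === 'gemini') return explicit;

  const keyPath = env['GOOGLE_SERVICE_KEY_FILE'];
  if (keyPath && isValidServiceKey(readKeyFile(keyPath))) return 'vertex';

  if (env['GEMINI_API_KEY']) return 'gemini';
  return undefined;
}
```

Passing the file reader in as a callback keeps the detection logic free of filesystem access, so it can be exercised with in-memory fixtures.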
e4b98bd to
2901bea
Added Gemini Flash Image Generation Tool (Nano Banana + Nano Banana Pro)
trim.6A4F5154-48D1-4547-ABEF-C3A188733218.MOV
Summary
This PR (addresses #9283 and #6065) implements a new image generation tool using Google's Gemini 2.5 Flash Image Model with Vertex AI integration. The tool supports both text-only image generation and image context-aware generation for editing/modification requests. It includes handling of tool calls rejected due to content policy violations, multiple file storage strategies, and seamless integration with LibreChat's existing agent tool ecosystem.
Key Features:
Configuration Requirements:
`api/data/auth.json`
`./api/data/auth.json:/app/api/data/auth.json`
Change Type
Testing
The feature has been tested for:
Test Configuration:
`@google/genai: ^1.17.0` added to `api/package.json`
Test Steps:
`gemini_image_gen` tool in an agent
Checklist
Files Modified
Core Implementation
- `api/app/clients/tools/structured/GeminiImageGen.js` - Main tool implementation
- `api/package.json` - Added `@google/genai` dependency
- `packages/data-provider/src/config.ts` - Added to `imageGenTools` set
Integration
- `api/app/clients/tools/index.js` - Tool export
- `api/app/clients/tools/manifest.json` - Tool manifest entry
- `api/app/clients/tools/util/handleTools.js` - Tool registration and configuration
- `api/server/services/ToolService.js` - Import statement
Documentation
- `api/app/clients/tools/structured/README-GeminiNanoBanana.md` - Implementation guide
Breaking Changes
None. This is a purely additive feature that doesn't modify existing functionality.
Configuration Notes
Required Docker Configuration:
LibreChat YAML Configuration:
Add `gemini_image_gen` to the `includedTools` array in your `librechat.yaml`:
Environment Variables
Add these to your `.env` file to configure the Gemini Image Generation tool:
- `GEMINI_IMAGE_PROVIDER`: `vertex` (service account) or `gemini` (API key)
- `GEMINI_IMAGE_MODEL`: `gemini-2.5-flash-image-preview`
- `GEMINI_API_KEY`: (`gemini` provider)
- `GOOGLE_SERVICE_KEY_FILE`: `api/data/auth.json`
- `GOOGLE_CLOUD_LOCATION`: `global`
- `GOOGLE_APPLICATION_CREDENTIALS`
Note: If `GEMINI_IMAGE_PROVIDER` is not set, the tool auto-detects:
Note: Service Account Credentials must be manually set in .env in the following manner:
Google Cloud Setup:
api/data/auth.json
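As a rough illustration of how the environment variables listed above might be read into one configuration object, consider the sketch below. It treats the values that appear next to the variable names (`gemini-2.5-flash-image-preview`, `global`, `api/data/auth.json`) as defaults, which is an assumption. `resolveGeminiImageConfig` and `GeminiImageConfig` are hypothetical names, not part of the PR.

```typescript
type Provider = 'vertex' | 'gemini';
type Env = Record<string, string | undefined>;

interface GeminiImageConfig {
  provider: Provider | undefined; // undefined means "auto-detect elsewhere"
  model: string;
  location: string;
  keyFile: string;
  apiKey: string | undefined;
}

// Accept only the two documented provider values; empty or unset means auto-detect.
function parseProvider(value: string | undefined): Provider | undefined {
  if (value === undefined || value === '') return undefined;
  if (value === 'vertex' || value === 'gemini') return value;
  throw new Error(`Unsupported GEMINI_IMAGE_PROVIDER: ${value}`);
}

// Hypothetical resolver; the defaults below are assumed, not confirmed by the PR.
function resolveGeminiImageConfig(env: Env): GeminiImageConfig {
  const provider = parseProvider(env['GEMINI_IMAGE_PROVIDER']);
  const apiKey = env['GEMINI_API_KEY'] || undefined;
  if (provider === 'gemini' && !apiKey) {
    throw new Error('GEMINI_API_KEY is required when GEMINI_IMAGE_PROVIDER=gemini');
  }
  return {
    provider,
    model: env['GEMINI_IMAGE_MODEL'] || 'gemini-2.5-flash-image-preview',
    location: env['GOOGLE_CLOUD_LOCATION'] || 'global',
    keyFile: env['GOOGLE_SERVICE_KEY_FILE'] || 'api/data/auth.json',
    apiKey,
  };
}
```

In a Node process this could be called as `resolveGeminiImageConfig(process.env)`. Failing at startup on an unsupported provider or a missing key is a design choice of the sketch, made so that misconfiguration surfaces before the first image request.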