feat: Gemini Image Generation Tool (Nano Banana) #10676
base: dev
* Refactored the credentials path to follow a consistent pattern with other Google service integrations, allowing for an environment variable override.
* Updated documentation in README-GeminiNanoBanana.md to reflect the new credentials handling approach and removed references to hardcoded paths.

- Bump @google/genai package version to ^1.19.0 for improved functionality.
- Refactor GeminiImageGen to createGeminiImageTool for better clarity and consistency.
- Enhance manifest.json for Gemini Image Tools with updated descriptions and icon.
- Add SVG icon for Gemini Image Tools.
- Implement progress tracking for Gemini image generation in the UI.
- Introduce new toolkit and context handling for image generation tools.

This update improves the Gemini image generation capabilities and user experience.

…icon
- Deleted the obsolete PNG file for Gemini image generation.
- Updated the SVG icon with a new design featuring a gradient and shadow effect, enhancing visual appeal and consistency.
@danny-avila Corresponding Docs PR LibreChat-AI/librechat.ai#452

shouldn't it also work natively?

nvm, that should invoke tools too lmao
I was thinking about that but this would be a departure from how the project handles image tools. I organized it similar to the openai tools so the workflows stay the same for users

@danny-avila with native multimodal image generation models appearing, it would be great to implement this functionality actually!

This is great that it's a tool - it can be called by other models. But yes, there's more models that can natively return text AND images (and audio?), so that would be good if it can handle that too.


Summary
Adds a new image generation tool that integrates Google's Gemini image models, with support for both text-to-image generation and context-aware image editing.
Key Features:
Configuration:
```env
# Option 1: Gemini API (recommended for most users)
GEMINI_API_KEY=your-api-key

# Option 2: Vertex AI
GOOGLE_SERVICE_KEY_FILE=/path/to/service-account.json
GOOGLE_CLOUD_LOCATION=us-central1

# Optional: Change model (default: gemini-2.5-flash-image)
GEMINI_IMAGE_MODEL=gemini-3-pro-image-preview
```
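
As a rough illustration of how these variables could be turned into a backend choice, here is a minimal sketch. It is not the PR's actual code: `resolveGeminiImageConfig`, the `GeminiImageConfig` type, the precedence of the API key over the service key file, and the `us-central1` fallback are all assumptions made for the example. Only the variable names and the default model come from the configuration above.

```typescript
// Hypothetical sketch: function and type names are illustrative, not from the PR.
type EnvMap = Record<string, string | undefined>;

type GeminiImageConfig =
  | { backend: "gemini-api"; apiKey: string; model: string }
  | { backend: "vertex-ai"; keyFile: string; location: string; model: string };

// Default model as stated in the PR summary.
const DEFAULT_MODEL = "gemini-2.5-flash-image";
// Assumption: fall back to the region used in the example configuration.
const ASSUMED_LOCATION = "us-central1";

function resolveGeminiImageConfig(env: EnvMap): GeminiImageConfig {
  // Empty or whitespace-only values are treated as unset.
  const model = env.GEMINI_IMAGE_MODEL?.trim() || DEFAULT_MODEL;

  // Assumption: the Gemini API key wins when both options are configured.
  const apiKey = env.GEMINI_API_KEY?.trim();
  if (apiKey) {
    return { backend: "gemini-api", apiKey, model };
  }

  const keyFile = env.GOOGLE_SERVICE_KEY_FILE?.trim();
  if (keyFile) {
    const location = env.GOOGLE_CLOUD_LOCATION?.trim() || ASSUMED_LOCATION;
    return { backend: "vertex-ai", keyFile, location, model };
  }

  throw new Error(
    "Set GEMINI_API_KEY or GOOGLE_SERVICE_KEY_FILE to enable Gemini image generation",
  );
}
```

In a Node.js process this would typically be called as `resolveGeminiImageConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test without mutating global state.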
Builds upon and addresses feedback from #9538
cc @devilb2103 @danny-avila
Change Type
Testing
Tested locally with both Gemini API and Vertex AI configurations:
Test Configuration:
Checklist