feat: watsonx multimodal support #2213

ChinmayBansal · 2025-08-22T19:26:24Z

Related Issues

fixes Image support in WatsonxChatGenerator #2131

Proposed Changes:

This PR adds multimodal (image + text) support to WatsonxChatGenerator, enabling the component to process
both text and images in chat messages. The implementation follows established patterns from the
AnthropicChatGenerator and LlamaCppChatGenerator multimodal support.

Key Features Added:

Image format validation for supported formats (JPEG, PNG)
Proper message conversion to Watson API format with base64 data URIs
Support for multimodal models like meta-llama/llama-3-2-11b-vision-instruct and pixtral-12b
Role-based image restrictions (images only allowed in user messages)
Comprehensive error handling for unsupported formats and edge cases
Pre-validation of images before processing for better error flow
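
A minimal sketch of the validation and data-URI features listed above, not the code merged in this PR. The helper names `validate_image` and `to_data_uri` are hypothetical, and the contents of `IMAGE_SUPPORTED_FORMATS` are assumed from the "JPEG, PNG" note, since this description names the constant but does not show its value.

```python
import base64
from typing import Optional

# Assumed contents: the PR names this constant but does not list its values here.
IMAGE_SUPPORTED_FORMATS = ("image/jpeg", "image/png")


def validate_image(mime_type: Optional[str], role: str) -> None:
    """Hypothetical helper: reject images the generator would not accept."""
    # Role-based restriction: images are only allowed in user messages.
    if role != "user":
        raise ValueError(f"Images are only supported in user messages, got role '{role}'")
    # Format restriction: also covers the None mime type edge case.
    if mime_type not in IMAGE_SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported image format: {mime_type}. "
            f"Supported formats: {', '.join(IMAGE_SUPPORTED_FORMATS)}"
        )


def to_data_uri(mime_type: str, raw_bytes: bytes) -> str:
    """Hypothetical helper: build a data:<mime-type>;base64,<data> URI."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Keeping validation separate from encoding means an unsupported image fails with a clear message before any payload is built.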

Implementation Details:

Updated _prepare_api_call() method to handle multimodal content while preserving order
Added image format validation constants ImageFormat and IMAGE_SUPPORTED_FORMATS
Enhanced component docstring with detailed usage examples for multimodal scenarios
Added proper type annotations and removed type: ignore directive
Pre-validate all images upfront before content processing (following LlamaCpp pattern)
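
The two-pass flow described above (validate every image first, then convert while preserving order) could look roughly like the following. This is a sketch under assumptions: `ImagePart` is a stand-in for Haystack's `ImageContent`, `convert_user_content` is a hypothetical name rather than the real `_prepare_api_call()`, and the `{"type": "image_url", ...}` part shape is assumed from the `image_url` structure mentioned in the reviewer notes.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Union

# Assumed contents of the supported-format list (JPEG and PNG only).
SUPPORTED_FORMATS = ("image/jpeg", "image/png")


@dataclass
class ImagePart:
    """Stand-in for an image content part: base64 payload plus mime type."""
    base64_image: str
    mime_type: Optional[str]


def convert_user_content(parts: List[Union[str, ImagePart]]) -> List[Dict[str, Any]]:
    """Hypothetical sketch: convert mixed text/image parts for a user message."""
    # Pass 1: validate all images upfront so nothing is converted if one is bad.
    for part in parts:
        if isinstance(part, ImagePart) and part.mime_type not in SUPPORTED_FORMATS:
            raise ValueError(f"Unsupported image format: {part.mime_type}")

    # Pass 2: convert each part in its original position.
    converted: List[Dict[str, Any]] = []
    for part in parts:
        if isinstance(part, str):
            converted.append({"type": "text", "text": part})
        else:
            url = f"data:{part.mime_type};base64,{part.base64_image}"
            converted.append({"type": "image_url", "image_url": {"url": url}})
    return converted
```

For example, `convert_user_content(["What is this?", ImagePart("YWJj", "image/png")])` yields a text part followed by an image part, in that order.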

How did you test it?

Unit Tests:

✅ test_prepare_api_call_with_image() - Tests proper multimodal message conversion
✅ test_prepare_api_call_with_unsupported_mime_type() - Tests error handling for unsupported formats
✅ test_prepare_api_call_with_none_mime_type() - Tests edge case with None mime type
✅ test_prepare_api_call_image_in_non_user_message() - Tests role-based restrictions
✅ test_multimodal_message_processing() - Tests end-to-end multimodal processing with mocked model
✅ test_supported_image_formats() - Tests all supported formats (JPEG, PNG)
✅ test_multiple_images_in_single_message() - Tests multiple image support

Integration Tests:

✅ test_live_run_multimodal() - Tests live API calls with real Watson multimodal models

Code Quality Verification:

✅ All linting checks pass: hatch run fmt
✅ All type checking passes: hatch run test:types
✅ All unit tests pass: hatch run test:unit

Manual Verification:

Tested multimodal message creation and conversion
Verified proper error messages for validation failures
Confirmed Watson API format compatibility with data URI structure

Notes for the reviewer

The implementation closely follows the patterns established in AnthropicChatGenerator and LlamaCppChatGenerator
Image validation uses the same error message format as other integrations for consistency
The data:mime-type;base64,data format with image_url structure is required by Watson API for multimodal processing
Added comprehensive test coverage that matches and exceeds the patterns used in Anthropic and LlamaCpp tests
All edge cases are properly handled including None mime types and role restrictions
Watson supports fewer image formats (JPEG, PNG only) compared to Anthropic/LlamaCpp (which also support GIF, WebP)

Checklist

I have read the contributors guidelines and the code of conduct
I have updated the related issue with new insights and changes
I added unit tests and updated the docstrings
I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:.

sjrl · 2025-08-26T09:43:40Z

@ChinmayBansal thanks for your work on this!

One high-level comment:

Let's make sure to update the minimum version of haystack-ai in pyproject.toml to haystack-ai>=2.17.1, since ImageContent was only introduced in the latest release.

ChinmayBansal · 2025-08-26T15:59:47Z

Hi @sjrl,

I have addressed your feedback. Could you review?

Thanks!

sjrl

Thanks!

ChinmayBansal added 2 commits August 22, 2025 11:05

feat: add multimodal support to LlamaCppChatGenerator

6f62093

feat: mulimodal support to WatsonxChatGenerator

311717d

ChinmayBansal requested a review from a team as a code owner August 22, 2025 19:26

ChinmayBansal requested review from sjrl and removed request for a team August 22, 2025 19:26

github-actions bot added the integration:watsonx and type:documentation labels Aug 22, 2025

ChinmayBansal changed the title ~~Feat/watsonx multimodal support~~ feat:watsonx multimodal support Aug 22, 2025

ChinmayBansal changed the title ~~feat:watsonx multimodal support~~ feat: watsonx multimodal support Aug 22, 2025

Merge branch 'main' into feat/watsonx-multimodal-support

2ae8585

sjrl reviewed Aug 26, 2025

...tions/watsonx/src/haystack_integrations/components/generators/watsonx/chat/chat_generator.py (outdated, resolved)

sjrl reviewed Aug 26, 2025

...tions/watsonx/src/haystack_integrations/components/generators/watsonx/chat/chat_generator.py (outdated, resolved)

sjrl reviewed Aug 26, 2025

integrations/watsonx/tests/test_chat_generator.py (resolved)

ChinmayBansal added 2 commits August 26, 2025 08:55

addresss PR feedback

adeacbb

Merge remote-tracking branch 'upstream/main' into feat/watsonx-multimodal-support

780f19e

sjrl approved these changes Aug 27, 2025

sjrl merged commit 2ef5c82 into deepset-ai:main Aug 27, 2025
11 checks passed

ChinmayBansal deleted the feat/watsonx-multimodal-support branch August 27, 2025 15:47
