feature: Nemo compatibility with Pixtral model using images with text as message

### Did you check the docs?

- [x] I have read all the NeMo-Guardrails docs

### Is your feature request related to a problem? Please describe.

Hi,

I'm encountering an issue with the Pixtral model in the context of multimodal input support via the LangChain + NeMo Guardrails setup using VisionRails.

I have a working integration where I'm sending a chat completion payload that includes both base64-encoded image data and text input as part of the same message. This input format works perfectly with gpt-4o when called through my LLM Gateway using the ChatCompletions-compatible API.

potentially_unsafe_message = [{
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "describe the image?",
        },
        {
            "type": "image",
            "source_type": "base64",
            "data": base64_image,
            "mime_type": "image/jpeg",
        },
    ],
}]

Checking the logs, we found that the base64 image is filtered out of the payload before the request is passed on through NeMo Guardrails using ChatCompletions.
The response given is: "Sorry I dont know which image you are talking about."

from langchain_openai import ChatOpenAI
from nemoguardrails import LLMRails, RailsConfig

vision_model = ChatOpenAI(
    api_key="None",
    base_url=LLM_GW_ENDPOINT,
    model="Pixtral",
    default_headers=headers,
    streaming=False,
)

# Load configuration
config = RailsConfig.from_path("./config/")
# Create the rails instance with the vision model
rails = LLMRails(config, llm=vision_model, verbose=False)

Is there an alternative payload format or preprocessing step required to use Pixtral with NeMo Guardrails?

Thanks in advance for your help!

### Describe the solution you'd like

Is there an alternative payload format or preprocessing step required to use Pixtral with NeMo Guardrails when sending images along with text?
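
As a rough illustration of the kind of preprocessing step I have in mind, here is a minimal sketch that rewrites the LangChain-style base64 image block into the OpenAI-style `image_url` block with a data URL. The helper names `to_openai_image_block` and `convert_message` are hypothetical, the image bytes are a placeholder, and I have not confirmed that Pixtral behind the gateway or NeMo Guardrails accepts this format; it is only an assumption worth testing.

```python
import base64


def to_openai_image_block(block: dict) -> dict:
    """Rewrite a LangChain-style base64 image block as an OpenAI-style image_url block.

    Blocks of any other type are returned unchanged.
    """
    if block.get("type") == "image" and block.get("source_type") == "base64":
        data_url = f"data:{block['mime_type']};base64,{block['data']}"
        return {"type": "image_url", "image_url": {"url": data_url}}
    return block


def convert_message(message: dict) -> dict:
    """Return a copy of the message with every image block converted."""
    content = message.get("content")
    if isinstance(content, str):
        # Plain-text messages need no conversion.
        return message
    return {**message, "content": [to_openai_image_block(b) for b in content]}


# Placeholder bytes for illustration only; in practice this is the real JPEG file content.
base64_image = base64.b64encode(b"fake-jpeg-bytes").decode("ascii")

message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "describe the image?"},
        {
            "type": "image",
            "source_type": "base64",
            "data": base64_image,
            "mime_type": "image/jpeg",
        },
    ],
}

converted = convert_message(message)
```

If this format is the expected one, the converted message could be passed in place of the original; whether the image block then survives the guardrails pipeline is exactly what I would like to have confirmed.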

### Describe alternatives you've considered

What I’ve verified:

- The payload format is valid and works with GPT-4o.
- The base64-encoded image is a standard JPEG, loaded correctly from file.
- Switching only the model name from gpt-4o to pixtral causes the issue.

### Additional context

_No response_
