feat: add initial qwen2.5-vl model and test #2971
Merged
6 commits
10aa62f feat: support qwen2.5 vl model (drbh)
1f58577 fix: bump support models doc (drbh)
76d526d feat: check before rope type adjustment and small refactors (drbh)
07c0080 fix: add transformer overlay for processor support (drbh)
e4e6ea2 fix: vendor processor and config from transformers (drbh)
05333b7 fix: refactor/simplify conditionals (drbh)
26 changes: 26 additions & 0 deletions
integration-tests/models/__snapshots__/test_flash_qwen2_5_vl/test_flash_qwen2_5_vl_bay.json
@@ -0,0 +1,26 @@
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "The image showcases the Statue of Liberty, a colossal bronze statue located in New York Harbor, a heritage building in the United States. The statue has a majestic presence, with one arm raised towards the sun and the other hitched on her hip. It sits atop a keeper's walkway, observed from the water. Surrounding the statue is a lush green meadow, where picnic spots, walkways, and a visitor desk can be found. In front of the statue, a large marina can accommodate fourteen different kinds of boats. In the backdrop stands the Empire State Building, marking the crowded skyscrapers of New York City.",
        "name": null,
        "role": "assistant",
        "tool_calls": null
      },
      "usage": null
    }
  ],
  "created": 1738342753,
  "id": "",
  "model": "Qwen/Qwen2.5-VL-3B-Instruct",
  "object": "chat.completion",
  "system_fingerprint": "3.0.2-dev0-native",
  "usage": {
    "completion_tokens": 128,
    "prompt_tokens": 8736,
    "total_tokens": 8864
  }
}
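The `usage` block in this snapshot can be sanity-checked by hand: the prompt and completion counts should add up to the total. The sketch below is illustrative only and is not part of the PR; the helper name `usage_is_consistent` is hypothetical, and the numbers are copied from the snapshot above.

```python
import json

# Usage block copied from the test_flash_qwen2_5_vl_bay.json snapshot.
snapshot_usage = json.loads(
    '{"completion_tokens": 128, "prompt_tokens": 8736, "total_tokens": 8864}'
)


def usage_is_consistent(usage: dict) -> bool:
    """Return True when prompt and completion counts sum to the total."""
    return (
        usage["prompt_tokens"] + usage["completion_tokens"]
        == usage["total_tokens"]
    )


# Hypothetical helper applied to the snapshot values.
is_consistent = usage_is_consistent(snapshot_usage)
```

The same check applies to the other non-streaming snapshots in this PR, which report their own `usage` blocks.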
26 changes: 26 additions & 0 deletions
...ation-tests/models/__snapshots__/test_flash_qwen2_5_vl/test_flash_qwen2_5_vl_inpaint.json
@@ -0,0 +1,26 @@
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "The image shows a whimsical scene set in what appears to be a fast-food restaurant. Dominating the foreground is a large, green, inflatable dinosaur with realistic textures, giving it a Jurassic Park-like appearance. The dinosaur is wearing a red Adult Swim logo hat, adding a humorous touch to its appearance.\n\nSurrounding the dinosaur are various food items typically found in a fast-food restaurant, including French fries in a plastic cup, a hamburger on a plate, and a beverage in another cup. The hamburger is detailed with lettuce, tomato, and other typical fast-food ingredients.\n\nAccompanying the dinosaur is a realistic-looking owl perched on the table, which adds to the surreal and playful atmosphere of the scene. The background features the interior of the restaurant with neon signs and other typical decor elements, enhancing the overall theme of a fun and fantastical fast-food experience.\n\nOverall, the image is a playful and imaginative blend of a standard fast-food setting with an unexpected and amusing twist provided by the dinosaur and owl characters.",
        "name": null,
        "role": "assistant",
        "tool_calls": null
      },
      "usage": null
    }
  ],
  "created": 1738343775,
  "id": "",
  "model": "Qwen/Qwen2.5-VL-3B-Instruct",
  "object": "chat.completion",
  "system_fingerprint": "3.0.2-dev0-native",
  "usage": {
    "completion_tokens": 206,
    "prompt_tokens": 5375,
    "total_tokens": 5581
  }
}
26 changes: 26 additions & 0 deletions
...ration-tests/models/__snapshots__/test_flash_qwen2_5_vl/test_flash_qwen2_5_vl_simple.json
@@ -0,0 +1,26 @@
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "The image depicts an anthropomorphic rabbit character wearing an intricate space suit, which includes a helmet with a starry face pattern and multiple suitors. The rabbit's ears are significantly large and upright, and it has a hitchhiker-like star antennas on its chest. The background is a reddish-orange, rocky landscape, suggesting a Martian environment. The suit has various buttons, a red button on the chest, and a reflective or illuminated dome on the head. The overall color scheme is dominated by shades of red, orange, and gray, giving a sense of a rugged, otherworldly setting.",
        "name": null,
        "role": "assistant",
        "tool_calls": null
      },
      "usage": null
    }
  ],
  "created": 1738342872,
  "id": "",
  "model": "Qwen/Qwen2.5-VL-3B-Instruct",
  "object": "chat.completion",
  "system_fingerprint": "3.0.2-dev0-native",
  "usage": {
    "completion_tokens": 121,
    "prompt_tokens": 1363,
    "total_tokens": 1484
  }
}
20 changes: 20 additions & 0 deletions
...ts/models/__snapshots__/test_flash_qwen2_5_vl/test_flash_qwen2_5_vl_simple_streaming.json
@@ -0,0 +1,20 @@
{
  "choices": [
    {
      "delta": {
        "content": "",
        "role": "assistant",
        "tool_calls": null
      },
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null
    }
  ],
  "created": 1738343559,
  "id": "",
  "model": "Qwen/Qwen2.5-VL-3B-Instruct",
  "object": "chat.completion.chunk",
  "system_fingerprint": "3.0.2-dev0-native",
  "usage": null
}
@@ -0,0 +1,122 @@
import pytest


@pytest.fixture(scope="module")
def flash_qwen2_5_vl_handle(launcher):
    # Launch the model server once per test module.
    with launcher("Qwen/Qwen2.5-VL-3B-Instruct") as handle:
        yield handle


@pytest.fixture(scope="module")
async def flash_qwen2_5(flash_qwen2_5_vl_handle):
    # Wait up to 300 seconds for the server to report healthy.
    await flash_qwen2_5_vl_handle.health(300)
    return flash_qwen2_5_vl_handle.client


@pytest.mark.private
async def test_flash_qwen2_5_vl_simple(flash_qwen2_5, response_snapshot):
    response = await flash_qwen2_5.chat(
        seed=42,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png"
                        },
                    },
                    {"type": "text", "text": "Describe the image"},
                ],
            },
        ],
    )

    assert (
        response.choices[0].message.content
        == "The image depicts an anthropomorphic rabbit character wearing an intricate space suit, which includes a helmet with a starry face pattern and multiple suitors. The rabbit's ears are significantly large and upright, and it has a hitchhiker-like star antennas on its chest. The background is a reddish-orange, rocky landscape, suggesting a Martian environment. The suit has various buttons, a red button on the chest, and a reflective or illuminated dome on the head. The overall color scheme is dominated by shades of red, orange, and gray, giving a sense of a rugged, otherworldly setting."
    )

    assert response == response_snapshot


@pytest.mark.private
async def test_flash_qwen2_5_vl_simple_streaming(flash_qwen2_5, response_snapshot):
    responses = await flash_qwen2_5.chat(
        seed=42,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png"
                        },
                    },
                    {"type": "text", "text": "Describe the image"},
                ],
            },
        ],
        stream=True,
    )

    # Concatenate the streamed deltas and keep the final chunk for the snapshot.
    count = 0
    generated = ""
    last_response = None
    async for response in responses:
        count += 1
        generated += response.choices[0].delta.content
        last_response = response

    # The streamed text must match the non-streaming response for the same seed.
    assert (
        generated
        == "The image depicts an anthropomorphic rabbit character wearing an intricate space suit, which includes a helmet with a starry face pattern and multiple suitors. The rabbit's ears are significantly large and upright, and it has a hitchhiker-like star antennas on its chest. The background is a reddish-orange, rocky landscape, suggesting a Martian environment. The suit has various buttons, a red button on the chest, and a reflective or illuminated dome on the head. The overall color scheme is dominated by shades of red, orange, and gray, giving a sense of a rugged, otherworldly setting."
    )
    assert count == 121
    assert last_response == response_snapshot


@pytest.mark.private
async def test_flash_qwen2_5_vl_bay(flash_qwen2_5, response_snapshot):
    response = await flash_qwen2_5.chat(
        seed=42,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        },
                    },
                    {"type": "text", "text": "Describe the image"},
                ],
            },
        ],
    )
    assert response == response_snapshot


@pytest.mark.private
async def test_flash_qwen2_5_vl_inpaint(flash_qwen2_5, response_snapshot):
    response = await flash_qwen2_5.chat(
        seed=42,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/autopipeline-inpaint.png"
                        },
                    },
                    {"type": "text", "text": "Describe the image"},
                ],
            },
        ],
    )
    assert response == response_snapshot
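The streaming test above counts chunks and concatenates each `delta.content`. The sketch below reproduces that accumulation pattern without a server. It is an assumption-laden illustration, not code from the PR: `Delta`, `Choice`, `Chunk`, `fake_stream`, and `collect` are hypothetical stand-ins for the client's real types, and the `or ""` guard against a `None` delta is an extra precaution the test itself does not take.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Optional, Tuple


# Hypothetical stand-ins for the client's streamed chunk objects.
@dataclass
class Delta:
    content: Optional[str]


@dataclass
class Choice:
    delta: Delta
    finish_reason: Optional[str] = None


@dataclass
class Chunk:
    choices: List[Choice]


async def fake_stream(pieces: List[str]) -> AsyncIterator[Chunk]:
    """Yield one chunk per text piece, then a final empty 'stop' chunk."""
    for piece in pieces:
        yield Chunk(choices=[Choice(delta=Delta(content=piece))])
    # Mirrors the streaming snapshot: empty content, finish_reason "stop".
    yield Chunk(choices=[Choice(delta=Delta(content=""), finish_reason="stop")])


async def collect(stream: AsyncIterator[Chunk]) -> Tuple[str, int, Optional[Chunk]]:
    """Concatenate delta contents, count chunks, and keep the last chunk."""
    count = 0
    generated = ""
    last = None
    async for chunk in stream:
        count += 1
        # Treat a missing delta as an empty string (assumption, see lead-in).
        generated += chunk.choices[0].delta.content or ""
        last = chunk
    return generated, count, last


generated, count, last = asyncio.run(
    collect(fake_stream(["The ", "image ", "depicts"]))
)
```

In the PR's test the expected chunk count is 121, which equals the `completion_tokens` value in the non-streaming snapshot for the same prompt and seed; that is consistent with one chunk per generated token, though the source does not state this explicitly.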
Do we still need that with the 4.49 release?
This should be removable with 4.49.