Add Nemotron nano v2 vl #1136
Conversation
Commits:
- Nemotron Nano V2 VL bridge and provider. See merge request chcui/Megatron-Bridge!1
- HF export. See merge request chcui/Megatron-Bridge!2
- Resolved merge conflicts in src/megatron/bridge/training/config.py
Blocked by the mcore version bump, once NVIDIA/Megatron-LM#2115 is merged.
/ok to test ced4190
Hey @cuichenx, thanks for this message. I'm trying out the 12B variant, but it would be great if you could also support the 8B VLM model; we have some work where we prefer models under 10B parameters. I'm also open to contributing, so please let me know if there are any plans to support it.
Hi @adithya-s-k, thanks for your interest. Currently NVIDIA has only released a 12B v2 VL model (nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16). We don't have any plans to support the 9B v2 VL model since the checkpoint is not released. Are you trying to fine-tune the 9B VL model from scratch? We would welcome your contribution of other VL models :)
I'm referring to this model: nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1, which is the 8B variant we've been working with and really like. We were planning to fine-tune it but ran into several dependency conflicts. It would be great to have support for this model in Megatron-Bridge. If not, please let me know whether it's possible to add it manually. I can take a stab at it, though I understand it might be tricky if there have been underlying API changes.
NVIDIA Nemotron Nano v2 VL is an open 12B multimodal reasoning model for document intelligence and video understanding. It enables AI assistants to extract, interpret, and act on information across text, images, tables, and videos. This makes the model valuable for agents focused on data analysis, document processing, and visual understanding, in applications such as report generation, video curation, dense captioning for media asset management, and retrieval-augmented search.
NeMo Megatron Bridge supports finetuning this model (including LoRA finetuning) on single-image, multi-image, and video datasets. The finetuned model can be converted back to the 🤗 Hugging Face format for downstream evaluation.
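As a rough illustration of the kind of conversation-style record such a finetuning dataset holds, here is a minimal sketch of a single-image sample. The field names (`image`, `conversations`, `role`, `content`) and the `<image>` placeholder token are illustrative assumptions, not the actual Megatron Bridge dataset schema — consult the documentation linked below for the real format.

```python
import json

def make_sample(image_path: str, question: str, answer: str) -> dict:
    """Build one hypothetical single-image SFT record.

    The schema here is an assumption for illustration only; the real
    Megatron Bridge loaders define their own format.
    """
    return {
        "image": image_path,
        "conversations": [
            # "<image>" marks where the vision features would be spliced in
            {"role": "user", "content": f"<image>\n{question}"},
            {"role": "assistant", "content": answer},
        ],
    }

sample = make_sample(
    "docs/invoice_001.png",
    "What is the invoice total?",
    "The invoice total is $1,250.00.",
)
print(json.dumps(sample, indent=2))
```

Multi-image and video samples would extend the same idea with multiple image references or frame sequences per record.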
The model is currently available in the nvcr.io/nvidia/nemo:25.09.nemotron_nano_v2_vl container. This is the PR to the main branch. Documentation: https://docs.nvidia.com/nemo/megatron-bridge/latest/models/vlm/nemotron-nano-v2-vl.html
Notable differences compared to the code in the nvcr.io/nvidia/nemo:25.09.nemotron_nano_v2_vl container:
- --use_llava_model is removed (hard-coded into the new script)
- Requires this Megatron branch: NVIDIA/Megatron-LM#2115