Add Nemotron nano v2 vl #1136
Open
cuichenx wants to merge 89 commits into main from chcui/nemotron-nano-v2-vl
Changes from 75 commits
Commits
e63ed61 add wip code (cuichenx)
7858117 update utils for transformers config in hydra (yaoyu-33)
457bace temp save (yaoyu-33)
22233a2 pipeclean conversion (forward wip) (cuichenx)
6937da4 Merge branch 'refs/heads/main' into qwen-25vl-training (yaoyu-33)
c67f734 vlm generate script updates for nemotron vl (cuichenx)
fcca45c Merge remote-tracking branch 'refs/remotes/origin/main' into chcui/ne… (cuichenx)
790cd8d fix after merging with main (cuichenx)
3a9ab4f clean up (cuichenx)
e0fc7d1 fix forward pass (cuichenx)
44faee0 add /no_think sys prompt (cuichenx)
8a51440 Merge branch 'refs/heads/main' into qwen-25vl-training (yaoyu-33)
3bc6ba5 lint (yaoyu-33)
8061e0f revert qwen-vl changes in gpt (yaoyu-33)
df4755a revert qwen-vl changes in gpt #2 (yaoyu-33)
975efd2 Add mock dataset provider for qwen25 vl (yaoyu-33)
be708c2 add qwen25 vl dataset support from auto (yaoyu-33)
6822d34 lint (yaoyu-33)
ec9c7cd enable multi image and video inputs (cuichenx)
bc8c605 update _attn_implementation (yaoyu-33)
689f491 update comments (yaoyu-33)
cf2c769 Merge branch 'chcui/nemotron-nano-v2-vl' into 'dev/nemotron-nano-v2-vl' (cuichenx)
4f0e90f add preloaded dataset provider (yaoyu-33)
4959ea5 enable hf export (need to manually copy over modeling files) (cuichenx)
98caa7a expose strict (cuichenx)
2af0c2e update _processor to a private attr (yaoyu-33)
4a3ef3b Merge branch 'chcui/hf_export' into 'dev/nemotron-nano-v2-vl' (cuichenx)
7f3818e Merge branch 'refs/heads/main' into chcui/nano-v2-vl-training (cuichenx)
ccf6abe update qwen training utils (yaoyu-33)
94c6192 training bug fix (yaoyu-33)
95d3002 fix finalize grad (yaoyu-33)
4b7ef60 save qwen25 vl recipes (yaoyu-33)
c37ffa0 training WIP (cuichenx)
03e3a7c undo ckpt modification, loading works (cuichenx)
b095aae Merge branch 'chcui/nano-v2-vl-training' into 'dev/nemotron-nano-v2-vl' (cuichenx)
608117e add padding logic for pp (yaoyu-33)
a9f0e15 vlm step general (yaoyu-33)
6ddd4b3 default update (yaoyu-33)
f30aa39 Merge branch 'main' into qwen-25vl-training (yaoyu-33)
e425113 update to model specific visual inputs, also update mock dataset to b… (yaoyu-33)
5bc1f29 Merge branch 'main' into qwen-25vl-training (yaoyu-33)
90a0ff0 add ci tests (yaoyu-33)
49759bc lint (yaoyu-33)
62ffa88 update dependency (yaoyu-33)
6af4e4c build: add qwen-vl-utils and update lockfile (yaoyu-33)
7e0ceaf remove `start_of_response_token` use (yaoyu-33)
a7e5fdc add few more unit tests (yaoyu-33)
1e44b97 fix wandb reinit issue (yaoyu-33)
18012cd Revert "fix wandb reinit issue" (yaoyu-33)
b0b910e lint (yaoyu-33)
d2031ca update and fix tests for vlm dataset (yaoyu-33)
3d8f4b3 Merge remote-tracking branch 'origin/qwen-25vl-training' into chcui/n… (cuichenx)
70aafe2 training works (cuichenx)
398a812 add raven and llava-video datasets (cuichenx)
a44d26c push discussion code (cuichenx)
cbc25d4 Merge branch 'chcui/nano-v2-vl-training' into 'dev/nemotron-nano-v2-vl' (cuichenx)
56f9ad9 support video training (liding-nv)
a8ad5fd add peft merge (cuichenx)
46cd9b9 change wording (cuichenx)
6008b3e save every 200 (cuichenx)
2da5696 clean up internal paths (cuichenx)
d3dd155 add merge lora script.. (cuichenx)
3a13a6c fix import (liding-nv)
b9da6cf support multi subset video (liding-nv)
0bcfcb8 export with copy (cuichenx)
e9ee70d qa fixes (cuichenx)
546c233 Merge remote-tracking branch 'refs/remotes/origin/main' into chcui/ne… (cuichenx)
e69586d clean up code (cuichenx)
85c6a44 Merge remote-tracking branch 'origin/main' into chcui/nemotron-nano-v… (cuichenx)
d31d50f Merge remote-tracking branch 'origin/main' into chcui/nemotron-nano-v… (cuichenx)
2e223e8 change to supported HF architectures (cuichenx)
1eb8fa3 add tests (cuichenx)
6f739cf Merge remote-tracking branch 'refs/remotes/origin/main' into chcui/ne… (cuichenx)
0abb526 Merge remote-tracking branch 'refs/remotes/origin/main' into chcui/ne… (cuichenx)
0567e20 address comments (cuichenx)
edc2d98 copy over py and json files only (cuichenx)
9e80f35 merge causal lm and vlm so that output saves preprocessor config auto… (cuichenx)
bd447ae move nemotron vlm generation to a new script (cuichenx)
bac193a address comment (cuichenx)
c0756ce move path helper to common utils (cuichenx)
707562a Merge branch 'main' into chcui/nemotron-nano-v2-vl (cuichenx)
f7e0d3b update model name (cuichenx)
b6a60d7 Merge branch 'chcui/nemotron-nano-v2-vl' of github.com:NVIDIA-NeMo/Me… (cuichenx)
bfda67e refactor to llava_step (cuichenx)
71b4e78 clean up (cuichenx)
8813087 Merge branch 'main' into chcui/nemotron-nano-v2-vl (cuichenx)
e67e9f1 revert previous export copy code (cuichenx)
ced4190 raise error if trying to access validation split for raven and llava … (cuichenx)
f603601 Fix typo (cuichenx)