* feat: add multimodal sequence parallelism support
* refactor: separate SFT/DPO training scripts and optimize data handling
- Remove data printing functions from SFT and DPO trainers for better performance
- Replace 360-example-vl.sh with separate SFT and DPO training scripts
- Add SFT vision-language demo dataset (data/sft-vl-demo/)
- Update dataset configuration to support new data structure
* refactor: restructure multimodal model forward functions and optimize code style
- Add multimodal_forwards module to centrally manage multimodal model forward logic
- Extract and optimize forward function implementations for Qwen2-VL and Qwen2.5-VL
- Improve the structure of sequence_parallel-related code
* feat(vl): update VL training scripts and clean up demo data
* refactor: improve readability of sequence parallel attention check
---------
Co-authored-by: lilin3 <lilin3@360.cn>