
- Chinese Academy of Sciences
- China
- https://ruoyuchen10.github.io/
- @RuoyuChen20
Starred repositories
CVPR 2025: Frequency Dynamic Convolution for Dense Image Prediction
The official repo for "Where do Large Vision-Language Models Look at when Answering Questions?"
Janus-Series: Unified Multimodal Understanding and Generation Models
A post-training method to enhance CLIP's fine-grained visual representations with generative models.
Object Recognition as Next Token Prediction (CVPR 2024 Highlight)
Official repo for the paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs"
(CVPR 2025) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models"
Fully open reproduction of DeepSeek-R1
Safety at Scale: A Comprehensive Survey of Large Model Safety
The code of "Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds" CVPR 2024
[CVPR 2025] Interpreting Object-level Foundation Models via Visual Precision Search
Integrate the DeepSeek API into popular software
A bias and fairness examination of multimodal LLMs
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your research ideas
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
GPT4 based personalized ArXiv paper assistant bot
This repository is for the first survey on SAM for videos.
Code for Event-Aware Video Deraining via Multi-Patch Progressive Learning
[CVPR 2024 Highlight] Logit Standardization in Knowledge Distillation
[NeurIPS 2024] EnsIR: An Ensemble Algorithm for Image Restoration via Gaussian Mixture Models
[NAACL 2025 Oral] 🎉 From redundancy to relevance: Enhancing explainability in multimodal large language models