RuoyuChen10
Overcome yourself in this era

Starred repositories

CVPR 2025: Frequency Dynamic Convolution for Dense Image Prediction

25 2 Updated Mar 25, 2025

The official repo for "Where do Large Vision-Language Models Look at when Answering Questions?"

Python 28 Updated Mar 24, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,924 2,212 Updated Feb 1, 2025

A post-training method to enhance CLIP's fine-grained visual representations with generative models.

Python 9 Updated Mar 27, 2025

Object Recognition as Next Token Prediction (CVPR 2024 Highlight)

Python 175 7 Updated Dec 24, 2024

official repo for paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs"

Python 14 Updated Dec 22, 2024

Mamba SSM architecture

Python 14,412 1,259 Updated Jan 18, 2025

O1 Replication Journey

1,980 65 Updated Jan 14, 2025

(CVPR 2025) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models"

Python 111 5 Updated Mar 26, 2025

Fully open reproduction of DeepSeek-R1

Python 23,444 2,132 Updated Mar 28, 2025

minimal-cost for training 0.5B R1-Zero

Python 673 86 Updated Mar 28, 2025

Safety at Scale: A Comprehensive Survey of Large Model Safety

128 3 Updated Feb 19, 2025

The code of "Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds" CVPR 2024

Python 29 2 Updated Mar 23, 2024

[CVPR 2025] Interpreting Object-level Foundation Models via Visual Precision Search

Jupyter Notebook 16 Updated Mar 21, 2025

Integrate the DeepSeek API into popular softwares

30,529 3,311 Updated Mar 28, 2025

A bias and fairness examination of multimodal LLMs

Python 1 Updated Jan 23, 2025

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Python 423 40 Updated Feb 1, 2024

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,072 254 Updated Mar 25, 2025

Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your research ideas

Python 4,074 589 Updated Mar 27, 2025

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 7,170 453 Updated Mar 22, 2025

GPT4 based personalized ArXiv paper assistant bot

Python 514 134 Updated Mar 26, 2024

This repository is for the first survey on SAM for videos.

36 4 Updated Mar 9, 2025

Python 34 3 Updated Sep 11, 2024

Code for Event-Aware Video Deraining via Multi-Patch Progressive Learning

Python 13 Updated Jul 17, 2024

[CVPR 2024 Highlight] Logit Standardization in Knowledge Distillation

Jupyter Notebook 362 20 Updated Oct 9, 2024

[NeurIPS 2024] EnsIR: An Ensemble Algorithm for Image Restoration via Gaussian Mixture Models

Python 11 2 Updated Nov 1, 2024

[NAACL 2025 Oral] 🎉 From redundancy to relevance: Enhancing explainability in multimodal large language models

Python 89 6 Updated Feb 13, 2025

Python 48 2 Updated Nov 5, 2024