Algorithm Engineer @ Alibaba | AI Researcher | Open Source Contributor
I'm passionate about building cutting-edge AI systems that serve real users at scale. Currently working on the Qwen app at Alibaba, where I focus on multimodal AI, diffusion models, and large-scale data engineering.
- Multimodal AI Systems: Building instant cross-modal retrieval engines over 6B+ images
- Diffusion Models: Contributing to image and video generation foundation models
- AI System Engineering: Model post-training, quantization, deployment, and optimization
- Large-Scale Data Processing: Data pipeline engineering, filtering, and re-balancing
Real-time, infinite-length streaming avatar generation, reaching 20 FPS with a 14B-parameter diffusion model through pipeline parallelism.
An efficient image generation foundation model with single-stream diffusion transformer architecture.
Training commercial-grade video generation models. Contributed to the data processing pipeline and inference engine optimization.
Learning to draw from sequence data. Presented at SIGGRAPH Asia 2024.
Learning customized instructional image editor from few-shot examples. Published at ICCV 2025.
A mathematics encyclopedia built by undergraduate students, embracing the Agent era for mathematical education.
I have multiple exciting opportunities for passionate researchers:
- Digital Avatar Interns: Building the future of personal AI assistants with Live Avatar
- Multimodal Data Diagnostic Tool: Creating semantic retrieval systems for 100B+ data points
- [Open Source] Easymath-wiki Contributors: Transforming mathematical education with AI agents, as collaborators rather than interns
Interested? Check out my website for details or reach out at zhengjie.hsj@alibaba-inc.com
Fun Fact: When I'm not coding or researching, I enjoy strategy games, writing, and reading!




