Research
I'm interested in computer vision and generative AI, particularly diffusion models and agents for video generation.
|
|
Paper2Video: Automatic Video Generation from Scientific Papers
Zeyu Zhu*, Kevin Qinghong Lin*, Mike Zheng Shou
Scaling Environments for Agents Workshop at NeurIPS, 2025
arxiv /
page /
github /
datasets /
paper of the day on huggingface
We introduce Paper2Video, a benchmark and multi-agent framework (PaperTalker) that automates academic presentation video generation from papers, integrating slides, subtitles, speech, and talking-heads with evaluation metrics to ensure faithfulness and informativeness.
|
|
Multi-human Interactive Talking Datasets
Zeyu Zhu, Weijia Wu, Mike Zheng Shou
Arxiv, 2025
arxiv /
page /
github /
datasets
We introduce a multi-human talking dataset (MIT) and a baseline model (CovOG) for generating realistic multi-human talking videos, addressing the limitations of single-speaker approaches.
|
|
MovieAgent: Automated Movie Generation via Multi-Agent CoT Planning
Weijia Wu, Zeyu Zhu, Mike Zheng Shou
Arxiv, 2025
page /
arxiv
We propose MovieAgent, a multi-agent Chain-of-Thought framework that automates long-form movie generation from scripts, coordinating scene planning, cinematography, and character consistency to achieve coherent, faithful, and fully automated film production.
|
|
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu, Mingyu Liu, Zeyu Zhu, Xi Xia, Haoen Feng, Wen Wang, Kevin Qinghong Lin, Chunhua Shen, Mike Zheng Shou
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2025
page /
arxiv
We introduce MovieBench, a hierarchical movie-level dataset featuring long, multi-scene videos with coherent narratives and consistent characters, designed to benchmark and advance research in long-form video generation.
|
|
Unsupervised Hyperspectral Pansharpening via Low-rank Diffusion Model
Xiangyu Rui, Xiangyong Cao, Li Pang, Zeyu Zhu, Zongsheng Lyu, Deyu Meng
Information Fusion (IF=18.6), 2024
github /
arxiv
We propose a low-rank diffusion model for hyperspectral pansharpening that leverages the power of a pre-trained deep diffusion model and the better generalization ability of Bayesian methods.
|
|
Probability-based Global Cross-modal Upsampling for Pansharpening
Zeyu Zhu, Xiangyong Cao, Man Zhou, Junhao Huang, Deyu Meng
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2023
github /
arxiv
We propose a novel probability-based global cross-modal upsampling (PGCU) method for pansharpening that exploits global information of the LRMS image as well as cross-modal information of the PAN image, and that can be plugged into existing models.
|
|
National University of Singapore (NUS), Singapore
Ph.D. in Electrical and Computer Engineering
Aug. 2024 - Present
|
|
Xi'an Jiaotong University (XJTU), China
B.E. in Artificial Intelligence
Sep. 2020 - Jul. 2024
|
|
Shanghai AI Lab, China
Research Intern
Jun. 2023 - Jun. 2024
Focus: Editable 3D Object Generation
|
Honors & Awards
• School Scholarship -- AY20/21, AY21/22
• Summer Workshop at School of Computing of NUS, First Prize -- AY21/22
• Supermarket Shopping Service Robots in China Robot Competition, First Prize -- AY21/22
• China Telecom First Class Scholarship -- AY22/23
|
Last updated in Dec 2024.