Zeyu Zhu

I am a first-year PhD student at Showlab, National University of Singapore, supervised by Prof. Mike Shou.

I was very fortunate to be advised by Prof. Xiangyong Cao and Prof. Deyu Meng throughout my undergraduate years.

Email  /  Scholar  /  Github


Research

I'm interested in computer vision and generative AI, in particular diffusion models and agents for video generation.

Paper2Video: Automatic Video Generation from Scientific Papers
Zeyu Zhu*, Kevin Qinghong Lin*, Mike Zheng Shou
Scaling Environments for Agents Workshop at NeurIPS, 2025
arxiv / page / github / datasets / paper of the day on huggingface

We introduce Paper2Video, a benchmark and multi-agent framework (PaperTalker) that automates academic presentation video generation from papers, integrating slides, subtitles, speech, and talking-heads with evaluation metrics to ensure faithfulness and informativeness.

Multi-human Interactive Talking Datasets
Zeyu Zhu, Weijia Wu, Mike Zheng Shou
arXiv, 2025
arxiv / page / github / datasets

We introduce a multi-human talking dataset (MIT) and a baseline model (CovOG) for generating realistic multi-human talking videos, addressing the limitations of single-speaker approaches.

MovieAgent: Automated Movie Generation via Multi-Agent CoT Planning
Weijia Wu, Zeyu Zhu, Mike Zheng Shou
arXiv, 2025
page / arxiv

We propose MovieAgent, a multi-agent Chain-of-Thought framework that automates long-form movie generation from scripts, coordinating scene planning, cinematography, and character consistency to achieve coherent, faithful, and fully automated film production.

MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu, Mingyu Liu, Zeyu Zhu, Xi Xia, Haoen Feng, Wen Wang, Kevin Qinghong Lin, Chunhua Shen, Mike Zheng Shou
The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
page / arxiv

We introduce MovieBench, a hierarchical movie-level dataset featuring long, multi-scene videos with coherent narratives and consistent characters, designed to benchmark and advance research in long-form video generation.

Unsupervised Hyperspectral Pansharpening via Low-rank Diffusion Model
Xiangyu Rui, Xiangyong Cao, Li Pang, Zeyu Zhu, Zongsheng Lyu, Deyu Meng
Information Fusion (IF=18.6), 2024
github / arxiv

We propose a low-rank diffusion model for hyperspectral pansharpening that leverages both the power of pre-trained deep diffusion models and the better generalization ability of Bayesian methods.

Probability-based Global Cross-modal Upsampling for Pansharpening
Zeyu Zhu, Xiangyong Cao, Man Zhou, Junhao Huang, Deyu Meng
The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
github / arxiv

We propose a novel probability-based global cross-modal upsampling (PGCU) method for pansharpening. It exploits the global information of the LRMS image as well as the cross-modal information of the PAN image, and it can be plugged into existing models.

Education

National University of Singapore (NUS), Singapore

Ph.D. in Electrical and Computer Engineering

Aug. 2024 -

Xi'an Jiaotong University (XJTU), China

B.E. in Artificial Intelligence

Sep. 2020 - Jul. 2024

Experience

Shanghai AI Lab, China
Research Intern
Jun. 2023 - Jun. 2024
Focus: Editable 3D Object Generation

Honors & Awards

School Scholarship -- AY20/21, AY21/22
Summer Workshop at School of Computing of NUS, First Prize -- AY21/22
Supermarket Shopping Service Robots in China Robot Competition, First Prize -- AY21/22
China Telecom First Class Scholarship -- AY22/23


Last updated in Dec 2024.