About Me

I am a first-year M.S. student in Computer Science at UC San Diego, advised by Prof. Hao Zhang. I hold a B.Eng. from ShanghaiTech University, where I was advised by Prof. Kewei Tu. My research lies at the intersection of Natural Language Processing and Machine Learning Systems. I am particularly interested in designing efficient architectures for Long-Context Modeling and in exploring World Models as a way to bridge system efficiency and model capability.

Currently, I focus on scalable training and inference for generative models. I am the lead author of FlashMHF, in which I proposed a novel Multi-Head FFN architecture backed by IO-aware Triton/CUDA kernels. As a core contributor to FastVideo in Hao AI Lab, I train action-conditioned video generation models on distributed multi-node clusters, and I contributed to Dreamverse, which achieves real-time 1080p video generation. Previously, at Alibaba Ant Group, I integrated Hierarchical Sparse Attention into the SGLang inference framework and built custom Flash GPU kernels in ThunderKittens/CUDA/Triton.

Looking ahead, I aim to extend my work on FlashMHF to broader LLM backbones and to go deeper into World Models within the FastVideo framework. I am also exploring retrieval-based methods and Continual Learning to address the challenges of long-context understanding in foundation models.

Publications

Flash Multi-Head Feed-Forward Network

Minshen Zhang*, Xiang Hu*, Jianguo Li, Wei Wu, Kewei Tu

arXiv Preprint, 2025

We propose Flash Multi-Head FFN (FlashMHF), a novel architecture replacing standard FFNs in Transformers. Backed by IO-aware Triton/CUDA kernels and dynamic sub-networks, FlashMHF reduces peak memory by 3-5x and accelerates inference while improving performance over SwiGLU.

Projects

Dreamverse: Realtime Video Generation

Hao AI Lab

Project, Mar 2026

Generates a 30-second 1080p clip with 4.55 seconds of wait time on a single GPU. Improved generation consistency by modifying the video model pipeline, and accelerated backend inference by benchmarking and fusing kernels.

FastVideo

Hao AI Lab (Core Contributor)

Open-Source Project, Oct 2025 - Present

Building scalable and efficient training infrastructure for video generation. Training action-conditioned world models and accelerating inference with state-of-the-art distillation methods. Proposed a novel data curation pipeline for high-quality action-labeled video datasets.

Enhancing 3D Character Generation with ControlNet and LoRA

Congrong Xu, Zhanhe Shi, Minshen Zhang, Qingcheng Zhao

EECS 182/282A | Deep Neural Networks, UC Berkeley, 2023

A project exploring enhanced 3D character generation techniques using ControlNet and LoRA for improved control and quality in generative models.

CUDA/C++ Parallel Image Rendering

Minshen Zhang

Personal Project, 2023

Built a C++ path tracer supporting Lambertian, metal, dielectric, and emissive materials. Implemented motion blur, depth of field, and volumetric effects. Accelerated rendering via CUDA parallelization and importance sampling, achieving a ~200× speedup over a single-threaded CPU baseline.

NeRF Neural Network

Minshen Zhang

Personal Project, 2023

Built a NeRF rendering pipeline covering camera intrinsics and extrinsics and volumetric rendering. Trained and validated the neural model on an RTX 4090 using open-source multi-view image datasets.

Education

University of California, San Diego

Sep 2025 - Dec 2026 (Expected)

Master of Science in Computer Science and Engineering

La Jolla, CA

University of California, Berkeley

Aug 2023 - Jan 2024

Exchange Student, EECS Department

Berkeley, CA

ShanghaiTech University

Sep 2021 - Jun 2025

Bachelor of Engineering in Computer Science and Technology

Shanghai, China

Honors & Awards

2025 Outstanding Graduate, ShanghaiTech University
2024 Outstanding Student, ShanghaiTech University
2024 Teaching Assistant, CS100 Computer Programming, ShanghaiTech University
2022 Outstanding Student, ShanghaiTech University