Xintao Wang

Contact Me
I am currently a senior staff researcher at KwaiVGI, Kuaishou Technology, leading an effort on visual content generation, especially on video generation.
We are actively looking for research interns and full-time researchers to work on related research topics, including but not limited to image and video generation/editing. Please feel free to email me at xintao.wang@outlook.com if you are interested.

Previously, I was a senior staff researcher at Tencent ARC Lab and Tencent AI Lab, where I led an effort on visual content generation (AIGC).
I received my Ph.D. degree from the Multimedia Lab (MMLab) at the Chinese University of Hong Kong, advised by Prof. Chen Change Loy and Prof. Xiaoou Tang. I also work closely with Prof. Chao Dong. Earlier, I obtained my bachelor's degree from Zhejiang University.

I am currently immersed in the exhilarating field of generative AI, which has been an exciting journey.
● 2D (Image/Video) Generation
  • Controllable Image Generation: T2I-Adapter, PhotoMaker, CustomNet, MasaCtrl, DragonDiffusion, SmartEdit
  • Controllable Video Generation: MotionCtrl, Tune-A-Video
  • Video Foundation Models: VideoCrafter Series (VideoCrafter1, DynamiCrafter, EvalCrafter, StyleCrafter, etc.).
● 3D Generation
  • Dream3D, GET3D--
● Previously, I worked on Restoration
  • General Image Restoration: Real-ESRGAN, ESRGAN
  • Face Restoration: GFPGAN, VQFR, GLEAN
  • Video Restoration: EDVR, BasicVSR
  • Training Frameworks and others: BasicSR, SFTGAN

News

T2I-Adapter

Dig out controllable ability for text-to-image diffusion models

VideoCrafter

Open sourced large models for video generation

Real-ESRGAN

Practical algorithms for image restoration

GFPGAN

Practical face restoration

BasicSR

Open source image and video restoration toolbox

HandyView

Handy image viewer

Publications [Full List]

(* equal contribution, # corresponding author)
Selected Preprints

T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models

Chong Mou, Xintao Wang#, Liangbin Xie, Yanze Wu, Jian Zhang#, Zhongang Qi, Ying Shan, Xiaohu Qie

arXiv preprint, 2023.   Paper (arXiv)  Codes

DreamDiffusion: Generating High-Quality Images from Brain EEG Signals

Yunpeng Bai, Xintao Wang#, Yan-Pei Cao, Yixiao Ge, Chun Yuan#, Ying Shan

arXiv preprint, 2023.   Paper (arXiv)  Codes

Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos

Yue Ma, Yingqing He, Xiaodong Cun, Xintao Wang, Ying Shan, Xiu Li, Qifeng Chen

arXiv preprint, 2023.   Project Page  Paper (arXiv)  Codes

DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models

Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang

arXiv preprint, 2023.   Project Page  Paper (arXiv)  Codes

2023

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

Mingdeng Cao, Xintao Wang#, Zhongang Qi, Ying Shan, Xiaohu Qie, Yinqiang Zheng#

ICCV, 2023.   Project Page  Paper (arXiv)  Codes

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Yufei Shi, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou

ICCV, 2023.   Project Page  Paper (arXiv)  Codes

Fate/Zero: Fusing Attentions for Zero-shot Text-based Video Editing

Chenyang Qi, Xiaodong Cun, Yong Zhang, Chenyang Lei, Xintao Wang, Ying Shan, Qifeng Chen

ICCV, 2023.   Project Page  Paper (arXiv)  Codes

DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models

Liangbin Xie*, Xintao Wang*, Xiangyu Chen*, Gen Li, Ying Shan, Jiantao Zhou, Chao Dong

ICML, 2023.   Paper (arXiv)  Codes

Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models

Jiale Xu, Xintao Wang#, Weihao Cheng, Yan-Pei Cao, Ying Shan, Xiaohu Qie, Shenghua Gao#

CVPR, 2023.   Project Page  Paper (arXiv)  Codes (Coming Soon) 

OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer

Fanghua Yu*, Xintao Wang*, Mingdeng Cao, Gen Li, Ying Shan, Chao Dong#

CVPR, 2023.   Paper (arXiv)  Codes

Mitigating Artifacts in Real-World Video Super-Resolution Models

Liangbin Xie, Xintao Wang, Shuwei Shi, Jinjin Gu, Chao Dong, Ying Shan

AAAI, 2023.   Paper (arXiv)  Codes

Accelerating the Training of Video Super-resolution Models

Lijian Lin, Xintao Wang#, Zhongang Qi, Ying Shan

AAAI, 2023.   Paper (arXiv)  Codes

2022

Rethinking Alignment in Video Super-Resolution Transformers

Shuwei Shi, Jinjin Gu, Liangbin Xie, Xintao Wang, Yujiu Yang, Chao Dong

NeurIPS, 2022.   Paper (arXiv)  Codes

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder

Yuchao Gu, Xintao Wang, Liangbin Xie, Chao Dong, Gen Li, Ying Shan, Ming-Ming Cheng

Selected as Oral (2.7%)
ECCV, 2022.   Paper (arXiv)  Codes

MM-RealSR: Metric Learning based Interactive Modulation for Real-World Super-Resolution

Chong Mou, Yanze Wu, Xintao Wang, Chao Dong, Jian Zhang, Ying Shan

ECCV, 2022.   Paper (arXiv)  Codes

2021 To be updated