I am a research scientist at NVIDIA Research. I completed my Ph.D. at the Hong Kong University of Science and Technology, advised by Professor Tong Zhang. Previously, I was a visiting scholar at BLENDER LAB@UIUC, working with my amazing host, Professor Heng Ji. My research focuses on developing methods to scale up pre-training and reinforcement learning for large foundation models, pushing the frontier of more capable, general-purpose AI agents through data-centric design.
We are hiring interns to build efficient and intelligent agentic systems. Please contact me if you are interested.

News



  • Four papers got accepted by NeurIPS 2025!
  • Two papers got accepted by ICML 2025!
  • Excited to share CLIMB (Clustering-based Iterative Data Mixture Bootstrapping) for efficient LLM pre-training. We also open-sourced two high-quality datasets: ClimbLab (1.2T tokens, 20 domains) and ClimbMix (400B tokens, SOTA data blend for LLM pre-training). Visit Homepage.
  • Our papers on Hymba and LongMamba got accepted by ICLR 2025!
  • A step forward on small language models. We propose Hymba, a hybrid-head architecture that outperforms all public sub-2B models, including Llama-3.2 and Qwen-2.5.
  • We proposed a new training method for process reward models. Please check out our blog post, Entropy-Regularized Process Reward Model.
  • One paper got accepted by NeurIPS 2024!
  • Five papers got accepted by EMNLP 2024!
  • Excited to join NVIDIA Research as a research scientist!
  • Excited to share that R-Tuning received an Outstanding Paper award at NAACL 2024, and LMFlow received the Best Demo Paper award at NAACL 2024!
  • One paper was accepted by ACL 2024 System Demonstration Track.
  • Three papers were accepted by ACL 2024 and Findings, including Active-Prompt, Directional Preference Alignment, and Prompt Learning using Metaheuristic.
  • LMFlow got accepted to NAACL 2024 Demo track!
  • R-Tuning got accepted to NAACL 2024! LLMs can now say "I Don't Know"! #Alignment for Honesty
  • One paper was accepted by WWW 2024.
  • One paper was accepted by EACL 2024.
  • The RAFT paper for alignment was accepted by TMLR 2023.
  • Three papers were accepted by EMNLP 2023 and Findings.
  • Visited BLENDER LAB@UIUC from August 2023 to January 2024.
  • Attended ICML 2023 in Hawaii.
  • One paper was accepted by ICCV 2023.
  • Two papers were accepted by ACL 2023.
  • Attended ICLR 2023 in Kigali, Rwanda.
  • Attended EMNLP 2022 in Abu Dhabi.
  • LMFlow is a framework that allows fine-tuning and deploying personalized LLMs with minimal cost and effort. It has accumulated 7,000+ stars ⭐️ on GitHub. We envision that LMFlow will enable more creative and diverse applications of LLMs and foster a wider community of LLM enthusiasts!
  • Check out this curated paper list about ChatGPT with the goal of helping everyone learn the techniques behind it.

Publications


  1. Chengyue Wu, Hao Zhang, Shuchen Xue, Zhijian Liu, Shizhe Diao, Ligeng Zhu, Ping Luo, Song Han, Enze Xie
    Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding [NEW]
    ICLR 2026
  2. Chengyue Wu, Hao Zhang, Shuchen Xue, Shizhe Diao, Yonggan Fu, Zhijian Liu, Pavlo Molchanov, Ping Luo, Song Han, Enze Xie
    Fast-dLLM v2: Efficient Block-Diffusion LLM [NEW]
    ICLR 2026
  3. Ruida Wang, Jiarui Yao, Rui Pan, Shizhe Diao, Tong Zhang
    GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving [NEW]
    ICLR 2026
  4. Zhilin Wang, Jaehun Jung, Ximing Lu, Shizhe Diao, Ellie Evans, Jiaqi Zeng, Pavlo Molchanov, Yejin Choi, Jan Kautz, Yi Dong
    ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge [NEW]
    ICLR 2026
  5. Shizhe Diao, Yu Yang, Yonggan Fu, Xin Dong, Dan Su, Markus Kliegl, Zijia Chen, Peter Belcak, Yoshi Suhara, Hongxu Yin, Mostofa Patwary, Yingyan Celine Lin, Jan Kautz, Pavlo Molchanov
    Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
    NeurIPS 2025 Datasets and Benchmarks Track
  6. Mingjie Liu, Shizhe Diao, Ximing Lu, Jian Hu, Xin Dong, Yejin Choi, Jan Kautz, Yi Dong
    ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
    NeurIPS 2025
  7. Yonggan Fu, Xin Dong, Shizhe Diao, Matthijs Van Keirsbilck, Hanrong Ye, Wonmin Byeon, Yashaswi Karnati, Lucas Liebenwein, Maksim Khadkevich, Alexander Keller, Jan Kautz, Yingyan Celine Lin, Pavlo Molchanov
    Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
    NeurIPS 2025
  8. Tianhao Chen, Xin Xu, Zijing Liu, Pengxiang Li, Xinyuan Song, Ajay Kumar Jaiswal, Fan Zhang, Jishan Hu, Yang Wang, Hao Chen, Shizhe Diao, Shiwei Liu, Yu Li, Lu Yin, Can Yang
    GPAS: Accelerating Convergence of LLM Pretraining via Gradient-Preserving Activation Scaling
    NeurIPS 2025
  9. Ruida Wang, Rui Pan, Yuxin Li, Jipeng Zhang, Yizhen Jia, Shizhe Diao, Renjie Pi, Junjie Hu, Tong Zhang
    MA-LoT: Model-Collaboration Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving
    ICML 2025
  10. Xin Xu, Qiyun Xu, Tong Xiao, Tianhao Chen, Yuchen Yan, Jiaxin Zhang, Shizhe Diao, Can Yang, Yang Wang
    UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models
    ICML 2025
  11. Hanning Zhang, Pengcheng Wang, Shizhe Diao, Yong Lin, Rui Pan, Hanze Dong, Dylan Zhang, Pavlo Molchanov, Tong Zhang
    Entropy-Regularized Process Reward Model
    TMLR 2025
  12. Xin Xu, Shizhe Diao, Can Yang, Yang Wang
    Can We Verify Step by Step for Incorrect Answer Detection?
    IJCAI 2025
  13. Xin Dong, Yonggan Fu, Shizhe Diao, Wonmin Byeon, Zijia Chen, Ameya Sunil Mahabaleshwarkar, Shih-Yang Liu, Matthijs Van Keirsbilck, Min-Hung Chen, Yoshi Suhara, Yingyan Lin, Jan Kautz, Pavlo Molchanov
    Hymba: A Hybrid-head Architecture for Small Language Models
    ICLR 2025
  14. Zhifan Ye, Kejing Xia, Yonggan Fu, Xin Dong, Jihoon Hong, Xiangchi Yuan, Shizhe Diao, Jan Kautz, Pavlo Molchanov, Yingyan Celine Lin
    LongMamba: Enhancing Mamba's Long-Context Capabilities via Training-Free Receptive Field Enlargement
    ICLR 2025
  15. Rui Pan, Xiang Liu, Shizhe Diao, Renjie Pi, Jipeng Zhang, Chi Han, Tong Zhang
    LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
    NeurIPS 2024
  16. Tianyang Xu, Shujin Wu, Shizhe Diao, Xiaoze Liu, Xingyao Wang, Yangyi Chen, Jing Gao
    SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales
    EMNLP 2024
  17. Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan Yao, Tong Zhang
    Mitigating the Alignment Tax of RLHF
    EMNLP 2024
  18. KaShun Shum, Minrui Xu, Jianshu Zhang, Zixin Chen, Shizhe Diao, Hanze Dong, Jipeng Zhang, Muhammad Omer Raza
    FIRST: Teach A Reliable Large Language Model Through Efficient Trustworthy Distillation
    EMNLP 2024
  19. Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang
    TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts
    EMNLP 2024
  20. Tianyang Han, Qing Lian, Rui Pan, Renjie Pi, Jipeng Zhang, Shizhe Diao, Yong Lin, Tong Zhang
    The Instinctive Bias: Spurious Images lead to Hallucination in MLLMs
    EMNLP 2024
  21. Cheng Niu, Yang Guan, Yuanhao Wu, Juno Zhu, Juntong Song, Randy Zhong, Kaihua Zhu, Siliang Xu, Shizhe Diao, Tong Zhang.
    VeraCT Scan: Retrieval-Augmented Fake News Detection with Justifiable Reasoning
    ACL 2024 System Demonstration Track
  22. Shizhe Diao, Pengcheng Wang, Yong Lin, Tong Zhang.
    Active Prompting with Chain-of-Thought for Large Language Models
    ACL 2024
  23. Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang
    Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
    ACL 2024
  24. Rui Pan*, Shuo Xing*, Shizhe Diao*, Wenhe Sun*, Xiang Liu, Kashun Shum, Renjie Pi, Jipeng Zhang, Tong Zhang
    Plum: Prompt Learning using Metaheuristic
    Findings of ACL 2024
  25. Shizhe Diao*, Rui Pan*, Hanze Dong*, Ka Shun Shum, Jipeng Zhang, Wei Xiong, Tong Zhang
    LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
    NAACL 2024 Demo Track (Best Demo Paper)
  26. Hanning Zhang*, Shizhe Diao*, Yong Lin*, Yi R. Fung, Qing Lian, Xingyao Wang, Yangyi Chen, Heng Ji, Tong Zhang
    R-Tuning: Instructing Large Language Models to Say 'I Don't Know'
    NAACL 2024 (Outstanding Paper)
  27. Xu Liu, Junfeng Hu, Yuan Li, Shizhe Diao, Yuxuan Liang, Bryan Hooi, Roger Zimmermann
    UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting
    WWW 2024
  28. Quyet V. Do, Tianqing Fang, Shizhe Diao, Zhaowei Wang, Yangqiu Song.
    ConstraintChecker: A Plugin for Large Language Models to Reason on Commonsense Knowledge Bases
    EACL 2024
  29. Hanze Dong*, Wei Xiong*, Deepanshu Goyal, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, Tong Zhang.
    RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
    TMLR 2023
  30. Kashun Shum*, Shizhe Diao*, Tong Zhang.
    Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data
    Findings of EMNLP 2023
  31. Renjie Pi*, Jiahui Gao*, Shizhe Diao*, Rui Pan, Hanze Dong, Jipeng Zhang, Lewei Yao, Jianhua Han, Hang Xu, Lingpeng Kong, Tong Zhang
    DetGPT: Detect What You Need via Reasoning
    EMNLP 2023
  32. Shizhe Diao*, Yongyu Lei*, Liangming Pan, Tianqing Fang, Wangchunshu Zhou, Sedrick Scott Keh, Min-Yen Kan, Tong Zhang.
    Doolittle: Benchmarks and Corpora for Academic Writing Formalization
    EMNLP 2023
  33. Zhihong Chen*, Shizhe Diao*, Benyou Wang, Guanbin Li, Xiang Wan.
    Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts
    ICCV 2023
  34. Shizhe Diao*, Tianyang Xu*, Ruijia Xu, Jiawei Wang, Tong Zhang.
    Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models' Memories
    ACL 2023
  35. Zhihong Chen, Guiming Hardy Chen, Shizhe Diao, Xiang Wan, Benyou Wang.
    On the Difference of BERT-style and CLIP-style Text Encoders
    Findings of ACL 2023
  36. Shizhe Diao, Wangchunshu Zhou, Xinsong Zhang, Jiawei Wang.
    Write and Paint: Generative Vision-Language Models are Unified Modal Learners
    ICLR 2023
  37. Shizhe Diao, Zhichao Huang, Ruijia Xu, Xuechun Li, Yong Lin, Xiao Zhou, Tong Zhang.
    Black-box Prompt Learning for Pre-trained Language Models
    TMLR
  38. Shizhe Diao*, Sedrick Scott Keh*, Liangming Pan, Zhiliang Tian, Yan Song, Tong Zhang.
    Hashtag-Guided Low-Resource Tweet Classification
    WWW 2023
  39. Wangchunshu Zhou*, Yan Zeng*, Shizhe Diao*, Xinsong Zhang*.
    VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
    ICML 2022
  40. Xiao Zhou, Weizhong Zhang, Zonghao Chen, Shizhe Diao, Tong Zhang.
    Efficient Neural Network Training via Forward and Backward Propagation Sparsification
    NeurIPS 2021
  41. Shizhe Diao, Ruijia Xu, Hongjin Su, Yilei Jiang, Yan Song, Tong Zhang.
    Taming Pre-trained Language Models with N-gram Representations for Low-Resource Domain Adaptation
    ACL 2021
  42. Shizhe Diao*, Xinwei Shen*, KaShun Shum, Yan Song, Tong Zhang.
    TILGAN: Transformer-based Implicit Latent GAN for Diverse and Coherent Text Generation
    Findings of ACL 2021
  43. Shizhe Diao, Jiaxin Bai, Yan Song, Tong Zhang, and Yonggang Wang.
    ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations
    Findings of EMNLP 2020

Recent Talks




Honors and Awards



Experience


Oct. 2021 - Jul. 2022
Research Intern, ByteDance AI Lab, China
Vision-Language Foundation Models
Advisors: Dr. Hang Li and Dr. Xinsong Zhang

Jun. 2019 - Jan. 2020
Research Intern, Sinovation Ventures AI Institute, China
Pre-trained Language Models
Advisor: Prof. Yan Song

Apr. 2018 - Oct. 2018
Research Intern, National University of Singapore (NUS), Singapore
Semi-supervised End-to-End Dialogue System
Advisors: Prof. Min-Yen Kan and Dr. Wenqiang Lei

Mar. 2017 - Mar. 2019
Research Intern, Peking University (PKU), China
Multimodal Chinese Poem Generation
Advisor: Prof. Xiaojun Wan

Sept. 2017 - Dec. 2017
Exchange Student, The Chinese University of Hong Kong (CUHK), HKSAR

Jul. 2017 - Aug. 2017
Visiting Student, Ben-Gurion University of the Negev (BGU), Israel
Cyber Security and Business Intelligence

Academic Service



Teaching Assistant



Miscellaneous