I am a final-year Ph.D. candidate at the Hong Kong University of Science and Technology, advised by Professor Tong Zhang.
Currently, I am a visiting scholar at BLENDER LAB@UIUC, under the supervision of Professor Heng Ji.
Previously, I was a research intern at ByteDance AI Lab, working with Dr. Hang Li and Dr. Xinsong Zhang, and at Sinovation Ventures AI Institute.
I am passionate about research on the pre-training, efficient tuning, and alignment of large foundation models.
I am currently on the job market, seeking a position in industry.
News
- Three papers were accepted to EMNLP 2023 (main conference and Findings).
- I am visiting BLENDER LAB@UIUC from August 2023 to January 2024.
- One paper was accepted to ICCV 2023.
- Two papers were accepted to ACL 2023.
- Our project LMFlow is released! LMFlow is a framework for fine-tuning and deploying personalized LLMs with minimal cost and effort. It has accumulated 7,000+ stars ⭐️ on GitHub. We envision that LMFlow will enable more creative and diverse applications of LLMs and foster a wider community of LLM enthusiasts!
- Check out this curated paper list about ChatGPT, which aims to help everyone learn the techniques behind it.
Publications
- Hanze Dong*, Wei Xiong*, Deepanshu Goyal, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, Tong Zhang.
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment [NEW]
- Kashun Shum*, Shizhe Diao*, Tong Zhang.
Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data [NEW]
Findings of EMNLP 2023
- Renjie Pi*, Jiahui Gao*, Shizhe Diao*, Rui Pan, Hanze Dong, Jipeng Zhang, Lewei Yao, Jianhua Han, Hang Xu, Lingpeng Kong, Tong Zhang.
DetGPT: Detect What You Need via Reasoning [NEW]
- Shizhe Diao*, Yongyu Lei*, Liangming Pan, Tianqing Fang, Wangchunshu Zhou, Sedrick Scott Keh, Min-Yen Kan, Tong Zhang.
Doolittle: Benchmarks and Corpora for Academic Writing Formalization [NEW]
- Zhihong Chen*, Shizhe Diao*, Benyou Wang, Guanbin Li, Xiang Wan.
Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts
- Shizhe Diao*, Tianyang Xu*, Ruijia Xu, Jiawei Wang, Tong Zhang.
Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models' Memories
- Zhihong Chen, Guiming Hardy Chen, Shizhe Diao, Xiang Wan, Benyou Wang.
On the Difference of BERT-style and CLIP-style Text Encoders
Findings of ACL 2023
- Shizhe Diao, Wangchunshu Zhou, Xinsong Zhang, Jiawei Wang.
Write and Paint: Generative Vision-Language Models are Unified Modal Learners
- Shizhe Diao, Zhichao Huang, Ruijia Xu, Xuechun Li, Yong Lin, Xiao Zhou, Tong Zhang.
Black-box Prompt Learning for Pre-trained Language Models
- Shizhe Diao*, Sedrick Scott Keh*, Liangming Pan, Zhiliang Tian, Yan Song, Tong Zhang.
Hashtag-Guided Low-Resource Tweet Classification
- Wangchunshu Zhou*, Yan Zeng*, Shizhe Diao*, Xinsong Zhang*.
VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
- Xiao Zhou, Weizhong Zhang, Zonghao Chen, Shizhe Diao, Tong Zhang.
Efficient Neural Network Training via Forward and Backward Propagation Sparsification
- Shizhe Diao, Ruijia Xu, Hongjin Su, Yilei Jiang, Yan Song, Tong Zhang.
Taming Pre-trained Language Models with N-gram Representations for Low-Resource Domain Adaptation
- Shizhe Diao*, Xinwei Shen*, Kashun Shum, Yan Song, Tong Zhang.
TILGAN: Transformer-based Implicit Latent GAN for Diverse and Coherent Text Generation
Findings of ACL 2021
- Shizhe Diao, Jiaxin Bai, Yan Song, Tong Zhang, Yonggang Wang.
ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations
Findings of EMNLP 2020
Preprints
- Hanning Zhang*, Shizhe Diao*, Yong Lin*, Yi R. Fung, Qing Lian, Xingyao Wang, Yangyi Chen, Heng Ji, Tong Zhang.
R-Tuning: Teaching Large Language Models to Refuse Unknown Questions [NEW]
arXiv preprint arXiv:2311.09677 (2023)
- Rui Pan, Shuo Xing, Shizhe Diao, Xiang Liu, Kashun Shum, Jipeng Zhang, Tong Zhang.
Plum: Prompt Learning using Metaheuristic [NEW]
arXiv preprint arXiv:2311.08364 (2023)
- Ziqiang Zheng, Jipeng Zhang, Tuan-Anh Vu, Shizhe Diao, Yue Him Wong Tim, Sai-Kit Yeung.
MarineGPT: Unlocking Secrets of Ocean to the Public [NEW]
arXiv preprint arXiv:2310.13596 (2023)
- Xu Liu, Junfeng Hu, Yuan Li, Shizhe Diao, Yuxuan Liang, Bryan Hooi, Roger Zimmermann.
UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting [NEW]
arXiv preprint arXiv:2310.09751 (2023)
- Yong Lin*, Lu Tan*, Hangyu Lin*, Zeming Zheng, Renjie Pi, Jipeng Zhang, Shizhe Diao, Haoxiang Wang, Han Zhao, Yuan Yao, Tong Zhang.
Speciality vs Generality: An Empirical Study on Catastrophic Forgetting in Fine-tuning Foundation Models [NEW]
arXiv preprint arXiv:2309.06256 (2023)
- Shizhe Diao*, Rui Pan*, Hanze Dong*, Kashun Shum, Jipeng Zhang, Wei Xiong, Tong Zhang.
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
arXiv preprint arXiv:2306.12420 (2023)
- Shizhe Diao, Pengcheng Wang, Yong Lin, Tong Zhang.
Active Prompting with Chain-of-Thought for Large Language Models
arXiv preprint arXiv:2302.12246 (2023)
- Hanze Dong*, Shizhe Diao*, Weizhong Zhang, Tong Zhang.
Normalizing Flow with Variational Latent Representation
arXiv preprint arXiv:2211.11638 (2022)
- Rui Pan*, Shizhe Diao*, Jianlin Chen, Tong Zhang.
ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT
arXiv preprint arXiv:2211.17201 (2022)
Invited Talks
- [2023/11] Invited Talk @ Beijing Normal University
- [2023/10] Invited Lecture @ Ontario Tech University / University of Toronto, graduate NLP course (CSC401/2511)
- [2023/08] Invited Talk @ CVMI Lab, The University of Hong Kong
- [2023/06] Invited Talk @ University of Toronto
- [2023/06] Invited Talk @ Shanghai AI Lab
- [2023/06] Invited Talk @ 4Paradigm
- [2023/05] Invited Talk @ Stanford University
Honors and Awards
- HKUST RedBird PhD Scholarship, 2021
- Hong Kong PhD Fellowship, 2021-2024
- Student Volunteer, EMNLP 2020, SIGIR 2020, and ACL 2020
- Merit Student of Beijing, 2019
- Outstanding Graduate Student of Beijing, 2019
- Top 10 Talent Nomination Award (only 20 selected out of ~2,000 students), 2018
- Meritorious Winner, Interdisciplinary Contest in Modeling (ICM), 2018
- Runner-up, World Robot Olympiad (WRO), New Delhi, India, 2016
Experience
Oct. 2021 - Jul. 2022
Research Intern, ByteDance AI Lab, China
Vision-Language Foundation Models
Advisor: Dr. Hang Li and Dr. Xinsong Zhang
Jun. 2019 - Jan. 2020
Research Intern, Sinovation Ventures AI Institute, China
Pre-trained Language Models
Advisor: Prof. Yan Song
Apr. 2018 - Oct. 2018
Research Intern, National University of Singapore (NUS), Singapore
Semi-supervised End-to-End Dialogue Systems
Advisor: Prof. Min-Yen Kan and Dr. Wenqiang Lei
Mar. 2017 - Mar. 2019
Research Intern, Peking University (PKU), China
Multimodal Chinese Poem Generation
Advisor: Prof. Xiaojun Wan
Sep. 2017 - Dec. 2017
The Chinese University of Hong Kong (CUHK), HKSAR
Jul. 2017 - Aug. 2017
Visiting Student, Ben-Gurion University of the Negev (BGU), Israel
Cyber Security and Business Intelligence
Services
- Journal Reviewer: SIAM Journal on Mathematics of Data Science (SIMODS)
- Conference Reviewer: ACL ARR, ACL (2020 - 2023), EMNLP (2020 - 2023), NAACL (2020 - 2022), NeurIPS (2022 - 2023), ICML (2022 - 2023), KDD (2023), AAAI (2022), IJCAI (2023), EACL (2022)
- Volunteer: EMNLP 2020, SIGIR 2020, ACL 2020
Teaching
- COMP3711 Design and Analysis of Algorithms (Spring 2023)
- COMP2011 Programming with C++ (Spring 2022)
- COMP3711 Design and Analysis of Algorithms (Fall 2020)
- COMP6211E Optimization for Machine Learning (Spring 2020)
Miscellaneous
- I used to be an amateur long-distance runner 🏃. Whenever I am not doing research, I love swimming 🏊, kayaking 🚣, windsurfing 🏄, dinghy sailing ⛵, and stand-up paddling!
- As the team captain, I organized a team to participate in the World Robot Olympiad (WRO), where we won second place in New Delhi, India.