I am a final-year Ph.D. candidate at the Hong Kong University of Science and Technology, advised by Professor Tong Zhang.
Currently, I am a visiting scholar at BLENDER LAB@UIUC, under the supervision of Professor Heng Ji.
Previously, I was a research intern at ByteDance AI Lab, working with Dr. Hang Li and Dr. Xinsong Zhang, and at Sinovation Ventures AI Institute.
I am passionate about research on the pre-training, efficient tuning, and alignment of large foundation models.
I am currently on the job market, seeking a position in industry.
News
- Three papers were accepted to EMNLP 2023 (main conference and Findings).
- I am visiting BLENDER LAB@UIUC from August 2023 to January 2024.
- One paper was accepted to ICCV 2023.
- Two papers were accepted to ACL 2023.
- Our project LMFlow is released! LMFlow is a framework for fine-tuning and deploying personalized LLMs with minimal cost and effort. It has accumulated 7,000+ stars ⭐️ on GitHub. We envision that LMFlow will enable more creative and diverse applications of LLMs and foster a wider community of LLM enthusiasts!
- Check out this curated paper list about ChatGPT, which aims to help everyone learn the techniques behind it.
Publications
- Hanze Dong*, Wei Xiong*, Deepanshu Goyal, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, Tong Zhang.
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment [NEW]
- Kashun Shum*, Shizhe Diao*, Tong Zhang.
Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data [NEW]
Findings of EMNLP 2023
- Renjie Pi*, Jiahui Gao*, Shizhe Diao*, Rui Pan, Hanze Dong, Jipeng Zhang, Lewei Yao, Jianhua Han, Hang Xu, Lingpeng Kong, Tong Zhang.
DetGPT: Detect What You Need via Reasoning [NEW]
- Shizhe Diao*, Yongyu Lei*, Liangming Pan, Tianqing Fang, Wangchunshu Zhou, Sedrick Scott Keh, Min-Yen Kan, Tong Zhang.
Doolittle: Benchmarks and Corpora for Academic Writing Formalization [NEW]
- Zhihong Chen*, Shizhe Diao*, Benyou Wang, Guanbin Li, Xiang Wan.
Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts
- Shizhe Diao*, Tianyang Xu*, Ruijia Xu, Jiawei Wang, Tong Zhang.
Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models' Memories
- Zhihong Chen, Guiming Hardy Chen, Shizhe Diao, Xiang Wan, Benyou Wang.
On the Difference of BERT-style and CLIP-style Text Encoders
Findings of ACL 2023
- Shizhe Diao, Wangchunshu Zhou, Xinsong Zhang, Jiawei Wang.
Write and Paint: Generative Vision-Language Models are Unified Modal Learners
- Shizhe Diao, Zhichao Huang, Ruijia Xu, Xuechun Li, Yong Lin, Xiao Zhou, Tong Zhang.
Black-box Prompt Learning for Pre-trained Language Models
- Shizhe Diao*, Sedrick Scott Keh*, Liangming Pan, Zhiliang Tian, Yan Song, Tong Zhang.
Hashtag-Guided Low-Resource Tweet Classification
- Wangchunshu Zhou*, Yan Zeng*, Shizhe Diao*, Xinsong Zhang*.
VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
- Xiao Zhou, Weizhong Zhang, Zonghao Chen, Shizhe Diao, Tong Zhang.
Efficient Neural Network Training via Forward and Backward Propagation Sparsification
- Shizhe Diao, Ruijia Xu, Hongjin Su, Yilei Jiang, Yan Song, Tong Zhang.
Taming Pre-trained Language Models with N-gram Representations for Low-Resource Domain Adaptation
- Shizhe Diao*, Xinwei Shen*, Kashun Shum, Yan Song, Tong Zhang.
TILGAN: Transformer-based Implicit Latent GAN for Diverse and Coherent Text Generation
Findings of ACL 2021
- Shizhe Diao, Jiaxin Bai, Yan Song, Tong Zhang, Yonggang Wang.
ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations
Findings of EMNLP 2020
Preprints
- Hanning Zhang*, Shizhe Diao*, Yong Lin*, Yi R. Fung, Qing Lian, Xingyao Wang, Yangyi Chen, Heng Ji, Tong Zhang.
R-Tuning: Teaching Large Language Models to Refuse Unknown Questions [NEW]
arXiv preprint arXiv:2311.09677 (2023)
- Rui Pan, Shuo Xing, Shizhe Diao, Xiang Liu, Kashun Shum, Jipeng Zhang, Tong Zhang.
Plum: Prompt Learning using Metaheuristic [NEW]
arXiv preprint arXiv:2311.08364 (2023)
- Ziqiang Zheng, Jipeng Zhang, Tuan-Anh Vu, Shizhe Diao, Yue Him Wong Tim, Sai-Kit Yeung.
MarineGPT: Unlocking Secrets of Ocean to the Public [NEW]
arXiv preprint arXiv:2310.13596 (2023)
- Xu Liu, Junfeng Hu, Yuan Li, Shizhe Diao, Yuxuan Liang, Bryan Hooi, Roger Zimmermann.
UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting [NEW]
arXiv preprint arXiv:2310.09751 (2023)
- Yong Lin*, Lu Tan*, Hangyu Lin*, Zeming Zheng, Renjie Pi, Jipeng Zhang, Shizhe Diao, Haoxiang Wang, Han Zhao, Yuan Yao, Tong Zhang.
Speciality vs Generality: An Empirical Study on Catastrophic Forgetting in Fine-tuning Foundation Models [NEW]
arXiv preprint arXiv:2309.06256 (2023)
- Shizhe Diao*, Rui Pan*, Hanze Dong*, Kashun Shum, Jipeng Zhang, Wei Xiong, Tong Zhang.
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
arXiv preprint arXiv:2306.12420 (2023)
- Shizhe Diao, Pengcheng Wang, Yong Lin, Tong Zhang.
Active Prompting with Chain-of-Thought for Large Language Models
arXiv preprint arXiv:2302.12246 (2023)
- Hanze Dong*, Shizhe Diao*, Weizhong Zhang, Tong Zhang.
Normalizing Flow with Variational Latent Representation
arXiv preprint arXiv:2211.11638 (2022)
- Rui Pan*, Shizhe Diao*, Jianlin Chen, Tong Zhang.
ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT
arXiv preprint arXiv:2211.17201 (2022)
Invited Talks
- [2023/11] Invited Talk @ Beijing Normal University
- [2023/10] Invited Lecture @ Ontario Tech University / University of Toronto, graduate NLP course (CSC401/2511)
- [2023/08] Invited Talk @ CVMI Lab, The University of Hong Kong
- [2023/06] Invited Talk @ University of Toronto
- [2023/06] Invited Talk @ Shanghai AI Lab
- [2023/06] Invited Talk @ 4Paradigm
- [2023/05] Invited Talk @ Stanford University
Honors and Awards
- HKUST RedBird PhD Scholarship, 2021
- Hong Kong PhD Fellowship, 2021-2024
- Student Volunteer, EMNLP 2020, SIGIR 2020, and ACL 2020
- Merit Student of Beijing, 2019
- Outstanding Graduate Student of Beijing, 2019
- Top 10 Talent Nomination Award (only 20 selected out of ~2,000 students), 2018
- Meritorious Winner, Interdisciplinary Contest in Modeling (ICM), 2018
- Runner-up, World Robot Olympiad (WRO), New Delhi, India, 2016
Experience
Oct. 2021 - Jul. 2022
Research Intern, ByteDance AI Lab, China
Vision-Language Foundation Models
Advisor: Dr. Hang Li and Dr. Xinsong Zhang
Jun. 2019 - Jan. 2020
Research Intern, Sinovation Ventures AI Institute, China
Pre-trained Language Models
Advisor: Prof. Yan Song
Apr. 2018 - Oct. 2018
Research Intern, National University of Singapore (NUS), Singapore
Semi-supervised End-to-End Dialogue Systems
Advisor: Prof. Min-Yen Kan and Dr. Wenqiang Lei
Mar. 2017 - Mar. 2019
Research Intern, Peking University (PKU), China
Multimodal Chinese Poem Generation
Advisor: Prof. Xiaojun Wan
Sep. 2017 - Dec. 2017
The Chinese University of Hong Kong (CUHK), HKSAR
Jul. 2017 - Aug. 2017
Visiting Student, Ben-Gurion University of the Negev (BGU), Israel
Cyber Security and Business Intelligence
Services
- Journal Reviewer: SIAM Journal on Mathematics of Data Science (SIMODS)
- Conference Reviewer: ACL ARR, ACL (2020 - 2023), EMNLP (2020 - 2023), NAACL (2020 - 2022), NeurIPS (2022 - 2023), ICML (2022 - 2023), KDD (2023), AAAI (2022), IJCAI (2023), EACL (2022)
- Volunteer: EMNLP 2020, SIGIR 2020, ACL 2020
Teaching
- COMP3711 Design and Analysis of Algorithms (Spring 2023)
- COMP2011 Programming with C++ (Spring 2022)
- COMP3711 Design and Analysis of Algorithms (Fall 2020)
- COMP6211E Optimization for Machine Learning (Spring 2020)
Miscellaneous
- I used to be an amateur long-distance runner 🏃. Whenever I am not doing research, I love swimming 🏊, kayaking 🚣, windsurfing 🏄, dinghy sailing ⛵, and stand-up paddling!
- As the team captain, I organized a team to participate in the World Robot Olympiad (WRO), where we won second place in New Delhi, India.