Biqing Qi is currently a XingQi Researcher at the Shanghai AI Lab and a Postdoctoral Researcher at the University of Hong Kong (HKU), collaborating with Professor Yi Ma. He received his Ph.D. from the Key Laboratory of Autonomous Intelligent Unmanned Systems (AIUS) at Harbin Institute of Technology, jointly trained at the Center for Collaborative & Conversational Intelligence (C3I) at Tsinghua University, under the supervision of Professors Bowen Zhou and Ligang Wu. He is a member of the technical task force of the Chinese Academy of Engineering's Science and Technology Knowledge Center, a member of the Advisory Committee on the Construction and Development of the National Supply Chain AI Application Platform, and a member of the Embodied Intelligence Committee of the Chinese Information Processing Society. His research focuses on sustainable machine learning theory, foundation models, and human-machine collaborative systems. He has published more than 40 papers in top-tier conferences and journals, including NeurIPS, CVPR, ICLR, ACL, AAAI, EMNLP, NAACL, TPAMI, TNNLS, and TCSVT. His contributions include: 1) co-developing the “General-Specialized Integration Intelligence” pathway for AGI with Professor Bowen Zhou’s team; 2) introducing the concept and framework of interactive continual learning from the perspectives of System 1 and System 2; and 3) providing the first validation of a research paradigm for independent hypothesis generation driven by large language models (LLMs). His work has been covered by multiple media outlets and has been deployed at technology companies such as Tencent, ByteDance, and Xianyuan.
He has served as Principal Investigator for a National Natural Science Foundation of China (NSFC) Young Scientists Fund project, a sub-project of a national major project (with funding exceeding 100 million RMB), and a sub-project of a Shanghai municipal major project (with funding exceeding 100 million RMB). Additionally, he has participated in more than ten national major research projects, including two under the Ministry of Science and Technology’s 2030 Key Special Project, two major R&D initiatives, and several key projects funded by the National Natural Science Foundation of China.
If you are seeking any form of academic collaboration with Shanghai AI Lab, AIUS or SCIR Lab at HIT, or Tsinghua C3I Lab, please feel free to email me at qibiqing7@gmail.com or qibiqing@pjlab.org.cn.
🔥 News
- 2025.09: 🔥 “ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data” released (Project Page, Paper Link)
- 2025.09: 🎉 Three papers were accepted by NeurIPS 2025
- 2025.09: 🔥 “A Survey of Reinforcement Learning for Large Reasoning Models” released (Paper Link)
- 2025.08: 🔥 “InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency” released (Project Page, Paper Link)
- 2025.08: 🎉 One paper was accepted by EMNLP 2025
- 2025.08: 🔥 “SDAR (Synergy of Diffusion and AutoRegression)”, a large diffusion language model (1.7B, 4B, 8B, 30B), released (Project Page, Paper Link)
- 2025.07: 🎉 One paper was accepted by ACM MM 2025
- 2025.06: 🔥 “MARTI: A Framework for LLM-based Multi-Agent Reinforced Training and Inference” released (Project Page)
- 2025.06: 🔥 “ScienceBoard: Evaluating multimodal autonomous agents in realistic scientific workflows” released (Project Page)
- 2025.05: 🎉 Three papers were accepted by ACL 2025 (one oral, invited to a panel discussion, 0.8%)
- 2025.04: 🎉 One paper was accepted by ICML 2025
- 2025.02: 🎉 One paper was accepted by CVPR 2025 (Highlight, Top 2.5%)
- 2025.02: 🔥 “Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling” released (Project Page)
- 2025.01: 🎉 Two papers were accepted by ICLR 2025 and TCSVT 2025
- 2024.12: 🎉 Two papers were accepted by AAAI 2025 (One Oral)
- 2024.10: 🎉 Four papers were accepted by NeurIPS 2024 (One Dataset Track)
- 2024.09: 🎉 Two papers were accepted by EMNLP 2024 (One Findings)
- 2024.07: 🎉 Two papers were accepted by COLM 2024 and ACM MM 2024
- 2024.05: 🎉 Two papers were accepted by ACL 2024 (One Findings)
- 2024.02: 🎉 Two papers were accepted by CVPR 2024 and SPL 2024
- 2023.10: 🎉 Two papers were accepted by NAACL 2024 (Oral)
- 2023.08: 🎉 Two papers were accepted by NeurIPS 2023 and TNNLS 2023
📝 Publications
- Notes: (*) indicates equal contribution and (†) indicates the corresponding author.
🎙 Multimodal Foundation Models

Arxiv
Position Paper
Towards Building Specialized Generalist AI with System 1 and System 2 Fusion, Kaiyan Zhang*, Biqing Qi*, Bowen Zhou.

Arxiv
Survey Paper
A Survey of Reinforcement Learning for Large Reasoning Models, Kaiyan Zhang, Yuxin Zuo, Bingxiang He, Youbang Sun, Runze Liu, Che Jiang, Yuchen Fan, Kai Tian, Guoli Jia, Pengfei Li, Yu Fu, Xingtai Lv, Yuchen Zhang, Sihang Zeng, Shang Qu, Haozhan Li, Shijie Wang, Yuru Wang, Xinwei Long, Fangfu Liu, Xiang Xu, Jiaze Ma, Xuekai Zhu, Ermo Hua, Yihao Liu, Zonglin Li, Huayu Chen, Xiaoye Qu, Yafu Li, Weize Chen, Zhenzhao Yuan, Junqi Gao, Dong Li, Zhiyuan Ma, Ganqu Cui, Zhiyuan Liu, Biqing Qi†, Ning Ding, Bowen Zhou.

Technical Report
Multimodal Large Language Models
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency, Weiyun Wang, Zhangwei Gao, Lixin Gu, Hengjun Pu, Long Cui, Xingguang Wei, Zhaoyang Liu, Linglin Jing, Shenglong Ye, Jie Shao, Zhaokai Wang, Zhe Chen, Hongjie Zhang, Ganlin Yang, Haomin Wang, Qi Wei, Jinhui Yin, Wenhao Li, Erfei Cui, Guanzhou Chen, Zichen Ding, Changyao Tian, Zhenyu Wu, Jingjing Xie, Zehao Li, Bowen Yang, Yuchen Duan, Xuehui Wang, Songze Li, Xiangyu Zhao, Haodong Duan, Nianchen Deng, Bin Fu, Yinan He, Yi Wang, Conghui He, Botian Shi, Junjun He, Yingtong Xiong, Han Lv, Lijun Wu, Wenqi Shao, Kaipeng Zhang, Huipeng Deng, Biqing Qi, Jiaye Ge, Qipeng Guo, Wenwei Zhang, Wanli Ouyang, Limin Wang, Min Dou, Xizhou Zhu, Tong Lu, Dahua Lin, Jifeng Dai, Bowen Zhou, Weijie Su, Kai Chen, Yu Qiao, Wenhai Wang, Gen Luo.

Technical Report
Hybrid Diffusion Language Models
SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation, Shuang Cheng, Yihan Bian, Dawei Liu, Yuhua Jiang, Yihao Liu, Linfeng Zhang, Wenhai Wang, Qipeng Guo, Kai Chen, Biqing Qi†, Bowen Zhou.
- Low-Cost AR-to-BlockDiffusion
- 2-4× Faster Inference
- Advanced performance on science reasoning benchmarks (e.g., GPQA and ChemBench)

CVPR 2024
Continual Learning
Cognition-Inspired
Interactive continual learning: Fast and slow thinking, Biqing Qi, Xinquan Chen, Junqi Gao, Dong Li, Jianxing Liu, Ligang Wu, Bowen Zhou.
- This work was the first to propose the concept of interactive continual learning.
- Instantiated through the Cognitive Complementarity Theory (System 1 and System 2).
- An advanced continual learning framework with a novel structured key-value memory unit.
- A potential framework to develop Specialized Generalist AI.

ACL 2025
Alignment
(Oral) Intuitive Fine-Tuning: Towards Unifying SFT and RLHF into a Single Process, Ermo Hua, Biqing Qi†, Kaiyan Zhang, Yue Yu, Ning Ding, Xingtai Lv, Kai Tian, Bowen Zhou.

NeurIPS 2025
Reasoning
Reinforcement Learning
TTRL: Test-time reinforcement learning, Yuxin Zuo, Kaiyan Zhang, Shang Qu, Li Sheng, Xuekai Zhu, Biqing Qi, Youbang Sun, Ganqu Cui, Ning Ding, Bowen Zhou.

TCSVT 2025
Continual Learning
Contrastive Augmented Graph2Graph Memory Interaction for Few Shot Continual Learning, Biqing Qi, Junqi Gao, Xinquan Chen, Dong Li, Jianxing Liu, Ligang Wu, Bowen Zhou.

ICML 2025
Position Embedding
Fourier Position Embedding: Enhancing Attention’s Periodic Extension for Length Generalization, Ermo Hua, Che Jiang, Xingtai Lv, Kaiyan Zhang, Ning Ding, Youbang Sun, Biqing Qi†, Yuchen Fan, Xuekai Zhu, Bowen Zhou.

Arxiv
Diffusion Language Models
Mask Tokens as Prophet: Fine-Grained Cache Eviction for Efficient dLLM Inference, Jianuo Huang, Yaojie Zhang, Yicun Yang, Benhao Huang, Biqing Qi, Dongrui Liu, Linfeng Zhang.

Arxiv
Diffusion Language Models
Self Speculative Decoding for Diffusion Large Language Models, Yifeng Gao, Ziang Ji, Yuxuan Wang, Biqing Qi, Hanlin Xu, Linfeng Zhang.

Arxiv
Diffusion Language Models
Sequential Diffusion Language Models, Yangzhou Liu, Yue Cao, Hao Li, Gen Luo, Zhe Chen, Weiyun Wang, Xiaobo Liang, Biqing Qi, Lijun Wu, Changyao Tian, Yanting Zhang, Yuqiang Li, Tong Lu, Yu Qiao, Jifeng Dai, Wenhai Wang.

Arxiv
Diffusion Language Models
Thinking Inside the Mask: In-Place Prompting in Diffusion LLMs, Xiangqi Jin, Yuxuan Wang, Yifeng Gao, Zichen Wen, Biqing Qi, Dongrui Liu, Linfeng Zhang.

NeurIPS 2024
Continual Learning
An Efficient Memory Module for Graph Few-Shot Class-Incremental Learning, Dong Li, Aijia Zhang, Junqi Gao, Biqing Qi†.

NAACL 2024
Reasoning
PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning, Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xinwei Long, Zhouhan Lin, Bowen Zhou.

Arxiv
Alignment
Online DPO: Online Direct Preference Optimization with Fast-Slow Chasing, Biqing Qi, Pengfei Li, Fangyuan Li, Junqi Gao, Kaiyan Zhang, Bowen Zhou.

ACL 2024 (Findings)
Model Architecture
SMR: State Memory Replay for Long Sequence Modeling, Biqing Qi, Junqi Gao, Kaiyan Zhang, Dong Li, Jianxing Liu, Ligang Wu, Bowen Zhou.

EMNLP 2024 (Findings)
Model Architecture
On the token distance modeling ability of higher RoPE attention dimension, Xiangyu Hong, Che Jiang, Biqing Qi†, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou.

NeurIPS 2024
Model Architecture
Neural Residual Diffusion Models for Deep Scalable Vision Generation, Zhiyuan Ma, Liangliang Zhao, Biqing Qi, Bowen Zhou.

ACM MM 2025
Structured Memory
T-GRAG: Temporal Graph Retrieval Augmented Generation, Dong Li, Yichen Niu, Ying Ai, Xiang Zou, Biqing Qi†, Jianxing Liu.

AAAI 2025
Optimizer
(Oral) Fast and Slow Gradient Approximation for Binary Neural Network Optimization, Xinquan Chen, Junqi Gao, Biqing Qi†, Dong Li, Yiang Luo, Fangyuan Li, Pengfei Li.
🌱 Multi-Agent Systems

Technical Report
Multi-Agent Systems
MARTI: A framework for multi-agent LLM systems reinforced training and inference, Kaiyan Zhang, Runze Liu, Xuekai Zhu, Kai Tian, Sihang Zeng, Guoli Jia, Yuchen Fan, Xingtai Lv, Yuxin Zuo, Che Jiang, Ziyang Liu, Jianyu Wang, Yuru Wang, Ruotong Zhao, Ermo Hua, Yibo Wang, Shijie Wang, Junqi Gao, Xinwei Long, Youbang Sun, Zhiyuan Ma, Ganqu Cui, Lei Bai, Ning Ding, Biqing Qi†, Bowen Zhou.

CVPR 2025
Model Merging
(Highlight) Less is More: Efficient Model Merging with Binary Task Switch, Biqing Qi, Fangyuan Li, Zhen Wang, Junqi Gao, Dong Li, Peng Ye, Bowen Zhou.
- Abstract: As an effective approach to equip models with multi-task capabilities without additional training, model merging has garnered significant attention. However, existing merging methods face challenges of redundant parameter conflicts and the excessive storage burden of fine-tuned parameters. In this work, through controlled experiments, we reveal that for fine-tuned task vectors, only those parameters with magnitudes above a certain threshold contribute positively to the task, exhibiting a pulse-like characteristic. We then attempt to leverage this pulse-like characteristic to binarize the task vectors and reduce storage overhead.
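
The thresholding-and-binarization idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the function names `binarize_task_vector` and `apply_task`, the threshold value, and the choice of a single scale equal to the mean magnitude of the kept entries are all assumptions made for illustration only.

```python
from typing import List, Tuple


def binarize_task_vector(
    base: List[float], finetuned: List[float], threshold: float
) -> Tuple[List[int], float]:
    """Hypothetical helper: compress a task vector into signs plus one scale.

    The task vector is (finetuned - base). Entries whose magnitude is at or
    below `threshold` are dropped (0); the rest keep only their sign (+1/-1).
    The shared scale (mean magnitude of kept entries) is an assumption.
    """
    deltas = [f - b for f, b in zip(finetuned, base)]
    kept = [abs(d) for d in deltas if abs(d) > threshold]
    scale = sum(kept) / len(kept) if kept else 0.0
    signs = [0 if abs(d) <= threshold else (1 if d > 0 else -1) for d in deltas]
    return signs, scale


def apply_task(base: List[float], signs: List[int], scale: float) -> List[float]:
    """Add the compressed task vector back onto the base weights."""
    return [b + scale * s for b, s in zip(base, signs)]


# Toy weights, invented for this example.
base = [0.0, 1.0, -1.0, 0.5]
finetuned = [0.5, 1.01, -1.6, 0.5]

signs, scale = binarize_task_vector(base, finetuned, threshold=0.1)
merged = apply_task(base, signs, scale)
print(signs)  # [1, 0, -1, 0]
```

Storing one small integer per parameter plus a single scalar, instead of a full-precision delta, is what reduces the storage overhead in this toy setting.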

NeurIPS 2025
Model Merging
Bohdi: Heterogeneous LLM Fusion with Automatic Data Exploration, Junqi Gao, Zhichang Guo, Dazhi Zhang, Dong Li, Runze Liu, Pengfei Li, Kai Tian, Biqing Qi†.

Arxiv
Test Time Scaling
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling, Runze Liu, Junqi Gao, Jian Zhao, Kaiyan Zhang, Xiu Li, Biqing Qi†, Wanli Ouyang and Bowen Zhou.

Arxiv
Test Time Scaling
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning, Jian Zhao, Runze Liu, Kaiyan Zhang, Zhimu Zhou, Junqi Gao, Dong Li, Jiafei Lyu, Zhouyi Qian, Biqing Qi†, Xiu Li, Bowen Zhou.

ACL 2025
Test Time Scaling
Graph Counselor: Adaptive Graph Exploration via Multi-Agent Synergy to Enhance LLM Reasoning, Junqi Gao, Xiang Zou, Ying Ai, Dong Li, Yichen Niu, Biqing Qi†, Jianxing Liu.

ICLR 2025
Test Time Scaling
OpenPRM: Building Open-domain Process-based Reward Models with Preference Trees, Kaiyan Zhang, Jiayuan Zhang, Haoxin Li, Xuekai Zhu, Ermo Hua, Xingtai Lv, Ning Ding, Biqing Qi, Bowen Zhou.
👄 Applications
COLM 2024
Scientific Discovery
Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation, Biqing Qi, Kaiyan Zhang, Kai Tian, Haoxiang Li, Zhang-Ren Chen, Sihang Zeng, Ermo Hua, Hu Jinfang, Bowen Zhou.

Instruct Following@NeurIPS 2023
Scientific Discovery
Large Language Models are Zero Shot Hypothesis Proposers, Biqing Qi, Kaiyan Zhang, Haoxiang Li, Kai Tian, Sihang Zeng, Zhang-Ren Chen, Jin-Fang Hu, Bowen Zhou.

Arxiv
GUI Agents
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data, Zhaoyang Liu, JingJing Xie, Zichen Ding, Zehao Li, Bowen Yang, Zhenyu Wu, Xuehui Wang, Qiushi Sun, Shi Liu, Weiyun Wang, Shenglong Ye, Qingyun Li, Zeyue Tian, Gen Luo, Xiangyu Yue, Biqing Qi, Kai Chen, Bowen Zhou, Yu Qiao, Qifeng Chen, Wenhai Wang.

Arxiv
GUI Agents
Scientific Discovery
ScienceBoard: Evaluating multimodal autonomous agents in realistic scientific workflows, Qiushi Sun, Zhoumianze Liu, Chang Ma, Zichen Ding, Fangzhi Xu, Zhangyue Yin, Haiteng Zhao, Zhenyu Wu, Kanzhi Cheng, Zhaoyang Liu, Jianing Wang, Qintong Li, Xiangru Tang, Tianbao Xie, Xiachong Feng, Xiang Li, Ben Kao, Wenhai Wang, Biqing Qi, Lingpeng Kong, Zhiyong Wu.

NeurIPS 2024 D&B Track
Scientific Discovery
(Spotlight) UltraMedical: Building Specialized Generalists in Biomedicine, Kaiyan Zhang, Sihang Zeng, Ermo Hua, Ning Ding, Zhang-Ren Chen, Zhiyuan Ma, Haoxiang Li, Ganqu Cui, Biqing Qi, Xuekai Zhu, Bowen Zhou.

ACL 2025
Scientific Discovery
Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System, Haoyang Su, Renqi Chen, Shixiang Tang, Zhenfei Yin, Xinzhe Zheng, Jinzhe Li, Biqing Qi, Qi Wu, Hui Li, Wanli Ouyang, Philip Torr, Bowen Zhou, Nanqing Dong.

Arxiv
Scientific Discovery
SpectrumWorld: Artificial Intelligence Foundation for Spectroscopy, Zhuo Yang, Jiaqing Xie, Shuaike Shen, Daolang Wang, Yeyun Chen, Ben Gao, Shuzhou Sun, Biqing Qi, Dongzhan Zhou, Lei Bai, Linjiang Chen, Shufei Zhang, Jun Jiang, Tianfan Fu, Yuqiang Li.

EMNLP 2024
Embodied Agents
MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making, Dayuan Fu*, Biqing Qi†, Yihuai Gao, Che Jiang, Guanting Dong, Bowen Zhou.

EMNLP 2025
Scientific Discovery
ReviewRL: Towards Automated Scientific Review with RL, Sihang Zeng, Kai Tian, Kaiyan Zhang, Yuru Wang, Junqi Gao, Runze Liu, Sa Yang, Jingxuan Li, Xinwei Long, Jiaheng Ma, Biqing Qi†, Bowen Zhou.

Arxiv
GUI Agents
OS-MAP: How Far Can Computer-Using Agents Go in Breadth and Depth?, Xuetian Chen, Yinghao Chen, Xinfeng Yuan, Zhuo Peng, Lu Chen, Yuekeng Li, Zhoujia Zhang, Yingqian Huang, Leyan Huang, Jiaqing Liang, Tianbao Xie, Zhiyong Wu, Qiushi Sun, Biqing Qi†, Bowen Zhou.
🌃 Teams
Team members
- Shijie Wang, Ph.D., Researcher, Institute of Automation.
Interns
Foundation Models
- Yihao Liu, 2025.03-, 4th-yr Ph.D. candidate, Tsinghua University, IIIS.
- Ermo Hua, 2025.07-, 3rd-yr Ph.D. candidate, Tsinghua University.
- Yuhua Jiang, 2025.02-, 2nd-yr Ph.D. candidate, Tsinghua University.
- Yicheng Gu, 2025.06-, 1st-yr Ph.D. candidate, Tsinghua University. (Joint Supervision)
- Shuang Cheng, 2024.11-, 1st-yr Ph.D. candidate, Zhejiang University. (Joint Supervision)
- Dawei Liu, 2024.11-, 1st-yr Ph.D. candidate, Shanghai Jiao Tong University. (Joint Supervision)
Multi-Agent Systems
- Yikun Fu, 2025.09-, 1st-yr Ph.D. candidate, Shanghai Jiao Tong University. (Joint Supervision)
- Xiaowei Sun, 2025.09-, 1st-yr Ph.D. candidate, Fudan University.
Visiting Students
- Junqi Gao, 2nd-yr Ph.D. candidate, Harbin Institute of Technology.
- Dong Li, 2nd-yr Ph.D. candidate, Harbin Institute of Technology.
- Siqi Song, 1st-yr Ph.D. candidate, Tsinghua University.
- Nuanqiao Shan, 1st-yr Ph.D. candidate, Zhejiang University.
- Shuaike Shen, 1st-yr Ph.D. candidate, Carnegie Mellon University.
Alumni Interns and Visiting Students
- Cheng Yang, Yihan Di, Yanlin Pan, Tianhe Lin, Yizhuo Di, Xuetian Chen, Xinfeng Yuan, Yinghao Chen, Linan Chang, Runze Liu, Xunzhe Zhou, Jing Xiao, Yu Zhang, Yongjia Yu, Qianru Lin, Yifan Hu, Gunbing Zhang.
⚔ Projects
Commodity Price Risk Prediction and Demonstration Application, Sep. 2023-Sep. 2026
- (Key Participant) National Science and Technology Major Project.
- Responsible for the technical planning of Project 2 and leading the team in advancing the construction of the labeling system within LLMs.
Research on Theory and Applications of Human-AI Collaboration with LLMs, Jan. 2024-Jan. 2027
- (Key Participant) National Science and Technology Major Project.
- Responsible for designing the project architecture, planning technical aspects, and overseeing the development of human-machine collaborative systems, along with conducting applied research in knowledge discovery for Project 3.
Cognitive Load Optimization in Human-Machine Collaboration, Mar. 2023-Dec. 2026
- (Participant) 2030 Key Special Project of the Ministry of Science and Technology.
- Responsible for project management within the Tsinghua group, as well as interaction modeling and reflective framework optimization in LLMs.
Research for Product Insight, Design, Development to Marketing Innovation, Sep. 2023-Dec. 2025
- (Participant) Beijing Municipal Science and Technology Commission Key Project.
- Responsible for project architecture and technical planning.
Proteomics Data based Knowledge Discovery, Mar. 2022-Dec. 2023
- (Student Lead) Preliminary Research Project for Major Scientific Plan.
- Responsible for project architecture, technical planning, and guiding the design of human-AI systems for hypothesis proposal.
Demonstration of Personified Human-Machine Dialogue System, Mar. 2020-Dec. 2023
- (Participant) 2030 Key Special Project of the Ministry of Science and Technology.
- Responsible for the development of a robust dialogue intent detection method.