I am a fourth-year Ph.D. candidate in the Department of Computer Science and Technology, Tsinghua University. I am a member of the Pervasive Human-Computer Interaction (PI) Lab, where I am advised by Prof. Yuanchun Shi and Prof. Junliang Xing. I earned my B.Eng. from Tsinghua University in 2022.

My research interests include LLM/VLM-based AI agents, reinforcement learning, and human-computer interaction. My research has been published in leading international AI venues, including NeurIPS, CVPR, ICCV, and TMLR. Currently, I focus on developing autonomous AI agents that solve complex multi-turn tasks via hybrid knowledge-data mechanisms, combining prior knowledge from foundation models with environmental feedback.

🔥 News

  • 2026.02: 🎉 GTR-Turbo was accepted to CVPR 2026.
  • 2025.12: We proposed GTR-Turbo, a significant upgrade to the GTR framework that eliminates reliance on costly external teacher models and accelerates the training process.
  • 2025.06: 🎉 GTR was accepted to ICCV 2025.
  • 2025.03: We addressed a critical challenge in RL-based VLM agent training by proposing Guided Thought Reinforcement (GTR), a novel approach that combines the strengths of reinforcement learning (RL) and imitation learning (IL).

📖 Education

  • 2022.08 - now: Ph.D. Student, Department of Computer Science and Technology, Tsinghua University.
  • 2018.08 - 2022.07: B.Eng., Department of Computer Science and Technology, Tsinghua University.

📝 Publications

GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training

Tong Wei, Yijun Yang, Changhao Zhang, Junliang Xing, Yuanchun Shi, Zongqing Lu, Deheng Ye

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026

[Paper], [Code]

GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training

Tong Wei*, Yijun Yang*, Junliang Xing, Yuanchun Shi, Zongqing Lu, Deheng Ye

International Conference on Computer Vision (ICCV), 2025

[Paper], [Code]

Revisiting Discrete Soft Actor-Critic

Haibin Zhou, Tong Wei, Zichuan Lin, Junyou Li, Junliang Xing, Yuanchun Shi, Li Shen, Chao Yu, Deheng Ye

Transactions on Machine Learning Research (TMLR), 2025

[Paper], [Code]

  • Leveraging Privileged Information for Partially Observable Reinforcement Learning. Jinqiu Li, Enmin Zhao, Tong Wei, Junliang Xing, Shiming Xiang. IEEE Transactions on Games (TG), 2025. [Paper], [Code]
  • Dual Critic Reinforcement Learning under Partial Observability. Jinqiu Li, Enmin Zhao, Tong Wei, Junliang Xing, Shiming Xiang. Advances in Neural Information Processing Systems (NeurIPS), 2024. [Paper]
  • LightWrite: Teach Handwriting to the Visually Impaired with a Smartphone. Zihan Wu, Chun Yu, Xuhai Xu, Tong Wei, Tianyuan Zou, Ruolin Wang, Yuanchun Shi. The ACM CHI Conference on Human Factors in Computing Systems (CHI), 2021. [Paper]

💻 Internship

🎖 Honors and Awards

  • 2025, 2023: Overall Excellence Scholarship, Tsinghua University.
  • 2024: Tencent Rhino-Bird Elite Talent Program.
  • 2022: Champion of the 2022 World Intelligent Aerial Gaming Competition.
  • 2021: First Prize in the 39th Challenge Cup, Tsinghua University.
  • 2020: Grand Prize in the 38th Challenge Cup, Tsinghua University.

🔖 Other Information

  • I served as a Teaching Assistant for several courses at Tsinghua: Fundamentals of Programming (Undergraduate), Professional Practice (Undergraduate), and Computer Vision (Graduate).
  • I am also an Undergraduate Student Counselor at Tsinghua University, managing student affairs and providing career guidance for undergraduates in the CS Department.
  • I enjoy a variety of sports, such as tennis, basketball, ultimate frisbee, skiing, and hiking.