I am a fourth-year Ph.D. Candidate at the Department of Computer Science and Technology, Tsinghua University. I am a member of the Pervasive Human-Computer Interaction (PI) Lab, where I am advised by Prof. Yuanchun Shi and Prof. Junliang Xing. I earned my B.Eng. from Tsinghua University in 2022.

My research interests include LLM/VLM-based AI Agents, Reinforcement Learning, and Human-Computer Interaction. My work has been published in leading international AI venues, including NeurIPS, CVPR, ICCV, and TMLR. Currently, I focus on developing autonomous AI agents that solve complex multi-turn tasks via hybrid knowledge-data mechanisms, which combine prior knowledge from foundation models with environmental feedback.

💡 I'm actively seeking job opportunities related to AI Agents, LLM Post-training, and AI-Native Applications! Open to research and engineering roles. Feel free to reach out!

🔥 News

2026.02: 🎉 GTR-Turbo was accepted to CVPR 2026.
2025.12: We proposed GTR-Turbo, a significant upgrade to the GTR framework that eliminates reliance on costly external teacher models and accelerates the training process.
2025.06: 🎉 GTR was accepted to ICCV 2025.
2025.03: We addressed a critical challenge in RL-based VLM agent training by proposing Guided Thought Reinforcement (GTR), a novel approach that synthesizes the strengths of RL and IL.

📖 Education

2022.08 - Now: Ph.D. Student, Department of Computer Science and Technology, Tsinghua University.
2018.08 - 2022.07: B.Eng., Department of Computer Science and Technology, Tsinghua University.

📝 Publications

GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training

Tong Wei, Yijun Yang, Changhao Zhang, Junliang Xing, Yuanchun Shi, Zongqing Lu, Deheng Ye

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026

[Paper], [Code]

GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training

Tong Wei*, Yijun Yang*, Junliang Xing, Yuanchun Shi, Zongqing Lu, Deheng Ye

International Conference on Computer Vision (ICCV), 2025

[Paper], [Code]

Revisiting Discrete Soft Actor-Critic

Haibin Zhou, Tong Wei, Zichuan Lin, Junyou Li, Junliang Xing, Yuanchun Shi, Li Shen, Chao Yu, Deheng Ye

Transactions on Machine Learning Research (TMLR), 2025

[Paper], [Code]

Leveraging Privileged Information for Partially Observable Reinforcement Learning. Jinqiu Li, Enmin Zhao, Tong Wei, Junliang Xing, Shiming Xiang. IEEE Transactions on Games (TG), 2025. [Paper], [Code]
Dual Critic Reinforcement Learning under Partial Observability. Jinqiu Li, Enmin Zhao, Tong Wei, Junliang Xing, Shiming Xiang. Advances in Neural Information Processing Systems (NeurIPS), 2024. [Paper]
LightWrite: Teach Handwriting to the Visually Impaired with a Smartphone. Zihan Wu, Chun Yu, Xuhai Xu, Tong Wei, Tianyuan Zou, Ruolin Wang, Yuanchun Shi. The ACM CHI Conference on Human Factors in Computing Systems (CHI), 2021. [Paper]

💻 Internship

2026.06 - Now: @ Tencent. WeChat agent team.
2026.04 - 2026.06: @ Zhipu AI. GLM reasoning & post-training team.
2024.06 - 2026.02: @ Tencent. AI Lab.

🎖 Honors and Awards

2025, 2023: Comprehensive Excellence Scholarship, Tsinghua University.
2024: Tencent Rhino-Bird Elite Talent Program.
2022: Champion of the 2022 World Intelligent Aerial Gaming Competition.
2021: First Prize in the 39th Challenge Cup, Tsinghua University.
2020: Grand Prize in the 38th Challenge Cup, Tsinghua University.

🔖 Other Information

I served as a Teaching Assistant for several courses at Tsinghua: Fundamentals of Programming (Undergraduate), Professional Practice (Undergraduate), and Computer Vision (Graduate).
I am also an Undergraduate Student Counselor at Tsinghua University, managing student affairs and providing career guidance for undergraduates in the CS Department.
I enjoy a variety of sports, such as tennis, basketball, ultimate frisbee, skiing, and hiking.

Tong Wei（魏彤）
