Weihao Tan 谭伟豪

I am a final-year PhD student at Nanyang Technological University, advised by Prof. Bo An. I led the Lumine project when I was at ByteDance Seed. Feel free to contact me if you'd like to discuss the present and future of game agents.

My research interests lie in reinforcement learning (RL), generative models, and AI agents. Artificial general intelligence (AGI) has been my dream all along. I believe RL is the most promising way to achieve it and that games are the perfect incubator for AGI. My goal is to design generally intelligent agents that outperform humans across all tasks in cyberspace.

When I am not doing research, I enjoy watching animation and playing games (especially miHoYo's games). An interesting coincidence is that, in Chinese, my current advisor's name is almost the same as that of Amber, the first local guide and a good companion in Genshin Impact.

Publications

I can also be found on Google Scholar.

("*" indicates equal/core contribution. "†" indicates equal advising.)

Conference Papers

Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds
Weihao Tan*, Xiangyang Li*, Yunhao Fang*, Heyuan Yao*, Shi Yan*, Hao Luo*, Tenglong Ao, Huihui Li, Hongbin Ren, Bairen Yi, Yujia Qin, Bo An, Libin Liu, Guang Shi
Preprint | [paper] | [website]

Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents
Zihao Wang, Xujing Li, Yining Ye, Junjie Fang, Haoming Wang, Longxiang Liu, Shihao Liang, Junting Lu, Zhiyong Wu, Jiazhan Feng, Wanjun Zhong, Zili Li, Yu Wang, Yu Miao, Bo Zhou, Yuanfan Li, Hao Wang, Zhongkai Zhao, Faming Wu, Zhengxuan Jiang, Weihao Tan, Heyuan Yao, Shi Yan, Xiangyang Li, Yitao Liang, Yujia Qin, Guang Shi
Preprint | [paper] | [website]

StarDojo: Benchmarking Open-Ended Behaviors of Agentic Multimodal LLMs in Production–Living Simulations with Stardew Valley
Weihao Tan*, Changjiu Jiang*, Yu Duan*, Mingcong Lei, Li JiaGeng, Yitian Hong, Xinrun Wang, Bo An
Preprint | [paper] | [website]

Cradle: Empowering Foundation Agents Towards General Computer Control
Weihao Tan*, Wentao Zhang*, Xinrun Xu*, Haochong Xia, Ziluo Ding, Boyu Li, Bohan Zhou, Junpeng Yue, Jiechuan Jiang, Yewen Li, Ruyi An, Molei Qin, Chuqiao Zong, Longtao Zheng, Yujie Wu, Xiaoqiang Chai, Yifei Bi, Tianbao Xie, Pengjie Gu, Xiyun Li, Ceyao Zhang, Long Tian, Chaojie Wang, Xinrun Wang†, Börje F. Karlsson†, Bo An, Shuicheng Yan, Zongqing Lu
ICML 2025 | [paper] | [website]

Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
Lang Feng, Weihao Tan, Zhiyi Lyu, Longtao Zheng, Haiyang Xu, Ming Yan, Fei Huang, Bo An
ICML 2025 | [paper]

True Knowledge Comes from Practice: Aligning Large Language Models with Embodied Environments via Reinforcement Learning
Weihao Tan, Wentao Zhang, Shanqi Liu, Longtao Zheng, Xinrun Wang, Bo An
ICLR 2024 | [paper]

Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning
Yuchen Xiao, Weihao Tan, Christopher Amato
NeurIPS 2022 | [paper]

On Optimizing Interventions in Shared Autonomy
Weihao Tan*, David Koleczek*, Siddhant Pradhan*, Nicholas Perello, Vivek Chettiar, Nan Ma, Aaslesha Rajaram, Vishal Rohra, Soundar Srinivasan, H M Sajjad Hossain†, Yash Chandak†
AAAI 2022 | [paper]

Strategy and Benchmark for Converting Deep Q-Networks to Event-Driven Spiking Neural Networks
Weihao Tan, Devdhar Patel, Robert Kozma
AAAI 2021 | [paper]

MUSEFood: Multi-sensor-based Food Volume Estimation on Smartphones
Junyi Gao*, Weihao Tan*, Liantao Ma, Yasha Wang, Wen Tang
IEEE UIC 2019 | [paper]

Journal Papers

Asynchronous Multi-agent Deep Reinforcement Learning under Partial Observability
Yuchen Xiao, Weihao Tan, Joshua Hoffman, Tian Xia, Christopher Amato
The International Journal of Robotics Research | [paper]

TWOSOME: An Efficient Online Framework to Align LLMs with Embodied Environments via Reinforcement Learning
Weihao Tan, Wentao Zhang, Shanqi Liu, Longtao Zheng, Xinrun Wang, Bo An
International Journal of Artificial Intelligence and Robotics Research | [paper]

Optimization Methods for Improved Efficiency and Performance of Deep Q-Networks upon Conversion to Neuromorphic Population Platforms
Weihao Tan, Devdhar Patel, Robert Kozma
Knowledge-Based Systems | [paper]