I am a PhD student at Nanyang Technological University, advised by Prof. Bo An. Currently, I am leading the Lumine project at ByteDance Seed. Feel free to contact me if you'd like to discuss the present and future of game agents.

My research interests lie in reinforcement learning (RL), generative models, and AI agents. Artificial general intelligence (AGI) has been my dream all along. I believe RL is the most promising path to achieving it, and that games are the perfect incubator for AGI. My goal is to design general intelligent agents that outperform humans across all tasks in cyberspace.

When I am not doing research, I enjoy watching animations and playing games (especially miHoYo's games). An interesting coincidence: in Chinese, my current advisor's name is almost the same as that of Amber, the first local guide and a good partner in Genshin Impact.

Publications

I can also be found on Google Scholar.

("*" indicates equal/core contribution; "†" indicates equal advising.)

Conference Papers and Preprints

  1. Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds
    Weihao Tan*, Xiangyang Li*, Yunhao Fang*, Heyuan Yao*, Shi Yan*, Hao Luo*, Tenglong Ao, Huihui Li, Hongbin Ren, Bairen Yi, Yujia Qin, Bo An, Libin Liu, Guang Shi
    Preprint | [paper] | [website]

  2. Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents
    Zihao Wang, Xujing Li, Yining Ye, Junjie Fang, Haoming Wang, Longxiang Liu, Shihao Liang, Junting Lu, Zhiyong Wu, Jiazhan Feng, Wanjun Zhong, Zili Li, Yu Wang, Yu Miao, Bo Zhou, Yuanfan Li, Hao Wang, Zhongkai Zhao, Faming Wu, Zhengxuan Jiang, Weihao Tan, Heyuan Yao, Shi Yan, Xiangyang Li, Yitao Liang, Yujia Qin, Guang Shi
    Preprint | [paper] | [website]

  3. StarDojo: Benchmarking Open-Ended Behaviors of Agentic Multimodal LLMs in Production–Living Simulations with Stardew Valley
    Weihao Tan*, Changjiu Jiang*, Yu Duan*, Mingcong Lei, Li JiaGeng, Yitian Hong, Xinrun Wang, Bo An
    Preprint | [paper] | [website]

  4. Cradle: Empowering Foundation Agents Towards General Computer Control
    Weihao Tan*, Wentao Zhang*, Xinrun Xu*, Haochong Xia, Ziluo Ding, Boyu Li, Bohan Zhou, Junpeng Yue, Jiechuan Jiang, Yewen Li, Ruyi An, Molei Qin, Chuqiao Zong, Longtao Zheng, Yujie Wu, Xiaoqiang Chai, Yifei Bi, Tianbao Xie, Pengjie Gu, Xiyun Li, Ceyao Zhang, Long Tian, Chaojie Wang, Xinrun Wang, Börje F. Karlsson, Bo An, Shuicheng Yan, Zongqing Lu
    ICML 2025 | [paper] | [website]

  5. Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
    Lang Feng, Weihao Tan, Zhiyi Lyu, Longtao Zheng, Haiyang Xu, Ming Yan, Fei Huang, Bo An
    ICML 2025 | [paper]

  6. True Knowledge Comes from Practice: Aligning Large Language Models with Embodied Environments via Reinforcement Learning
    Weihao Tan, Wentao Zhang, Shanqi Liu, Longtao Zheng, Xinrun Wang, Bo An
    ICLR 2024 | [paper]

  7. Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning
    Yuchen Xiao, Weihao Tan, Christopher Amato
    NeurIPS 2022 | [paper]

  8. On Optimizing Interventions in Shared Autonomy
    Weihao Tan*, David Koleczek*, Siddhant Pradhan*, Nicholas Perello, Vivek Chettiar, Nan Ma, Aaslesha Rajaram, Vishal Rohra, Soundar Srinivasan, H M Sajjad Hossain, Yash Chandak
    AAAI 2022 | [paper]

  9. Strategy and Benchmark for Converting Deep Q-Networks to Event-Driven Spiking Neural Networks
    Weihao Tan, Devdhar Patel, Robert Kozma
    AAAI 2021 | [paper]

  10. MUSEFood: Multi-sensor-based Food Volume Estimation on Smartphones
    Junyi Gao*, Weihao Tan*, Liantao Ma, Yasha Wang, Wen Tang
    IEEE UIC 2019 | [paper]

Journal Papers

  1. Asynchronous Multi-Agent Deep Reinforcement Learning under Partial Observability
    Yuchen Xiao, Weihao Tan, Joshua Hoffman, Tian Xia, Christopher Amato
    The International Journal of Robotics Research | [paper]

  2. TWOSOME: An Efficient Online Framework to Align LLMs with Embodied Environments via Reinforcement Learning
    Weihao Tan, Wentao Zhang, Shanqi Liu, Longtao Zheng, Xinrun Wang, Bo An
    International Journal of Artificial Intelligence and Robotics Research | [paper]

  3. Optimization Methods for Improved Efficiency and Performance of Deep Q-Networks upon Conversion to Neuromorphic Population Platforms
    Weihao Tan, Devdhar Patel, Robert Kozma
    Knowledge-Based Systems | [paper]