# See and Think: Embodied Agent in Virtual Environment

Zhonghan Zhao1\*, Wenhao Chai2\*❤, Xuan Wang1\*, Li Boyi1, Shengyu Hao1, Shidong Cao1, Tian Ye3, Jenq-Neng Hwang2, Gaoang Wang1✉

1 Zhejiang University, 2 University of Washington, 3 Hong Kong University of Science and Technology (GZ)

\* Equal contribution, ❤ Project lead, ✉ Corresponding author

![STEVE framework teaser](https://rese1f.github.io/STEVE/static/images/teaser.png)

STEVE, named after the protagonist of the game Minecraft, is our proposed framework that aims to build an embodied agent based on a vision model and LLMs within an open world.

Link: [See and Think: Embodied Agent in Virtual Environment](https://rese1f.github.io/STEVE/)