Recurrent World Model with Tokenized Latent States

Abstract

We introduce a new architecture, TokenWM, that preserves the recurrent nature of state-space models while incorporating tokenized latent states and a memory-augmented attention mechanism to improve modeling capacity in complex environments. Preliminary results on the LIBERO benchmark show that the new architecture handles complex tasks more effectively than the popular RSSM architecture. We believe TokenWM introduces a new design paradigm for recurrent world models, enabling more expressive and scalable decision-making in such environments.
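To make the architectural idea concrete, below is a minimal illustrative sketch (in PyTorch) of one recurrent step of a world model whose state is a set of latent tokens updated via attention over a learned external memory. It is not the authors' released code; all names and hyperparameters (TokenWMCell, num_tokens, memory_size, and so on) are assumptions made for the example.

# Illustrative sketch only: a recurrent world-model step with a tokenized
# latent state and memory-augmented attention. Not the TokenWM implementation;
# module names and sizes are assumptions.

import torch
import torch.nn as nn


class TokenWMCell(nn.Module):
    """One recurrent step: latent tokens attend to a learned memory,
    then are updated with the current action and observation embedding."""

    def __init__(self, token_dim=64, num_tokens=8, memory_size=32, action_dim=16):
        super().__init__()
        # Learned external memory slots shared across time steps.
        self.memory = nn.Parameter(torch.randn(memory_size, token_dim) * 0.02)
        self.attn = nn.MultiheadAttention(token_dim, num_heads=4, batch_first=True)
        self.action_proj = nn.Linear(action_dim, token_dim)
        self.update = nn.GRUCell(token_dim, token_dim)
        self.norm = nn.LayerNorm(token_dim)
        self.num_tokens = num_tokens
        self.token_dim = token_dim

    def init_state(self, batch_size):
        # Tokenized latent state: a set of token vectors per sequence.
        return torch.zeros(batch_size, self.num_tokens, self.token_dim)

    def forward(self, tokens, action, obs_embed):
        # tokens:    (B, num_tokens, token_dim)  previous latent tokens
        # action:    (B, action_dim)             current action
        # obs_embed: (B, token_dim)              encoded observation
        B = tokens.size(0)
        mem = self.memory.unsqueeze(0).expand(B, -1, -1)
        # Memory-augmented attention: latent tokens query the external memory.
        attended, _ = self.attn(query=tokens, key=mem, value=mem)
        ctx = self.norm(tokens + attended)
        # Inject action and observation, then update each token recurrently.
        inp = ctx + self.action_proj(action).unsqueeze(1) + obs_embed.unsqueeze(1)
        flat_in = inp.reshape(B * self.num_tokens, self.token_dim)
        flat_state = tokens.reshape(B * self.num_tokens, self.token_dim)
        return self.update(flat_in, flat_state).reshape(B, self.num_tokens, -1)


if __name__ == "__main__":
    cell = TokenWMCell()
    tokens = cell.init_state(batch_size=2)
    action = torch.randn(2, 16)
    obs_embed = torch.randn(2, 64)
    tokens = cell(tokens, action, obs_embed)   # roll the world model one step
    print(tokens.shape)  # torch.Size([2, 8, 64])

The key contrast with an RSSM-style cell is that the recurrent state here is a set of tokens rather than a single vector, and each step reads from a persistent memory via attention before the recurrent update.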

Publication
Preprint
Guangyao Zhai (翟光耀)
