MBZUAI’s Institute of Foundation Models has introduced PAN, a research model that examines how AI systems might eventually interpret and simulate changes in the physical world over extended periods. The announcement positions PAN as a step toward interactive world models—systems that not only generate visuals but also maintain internal logic, memory, and continuity as scenes evolve.
Unlike typical video-generation tools that produce short, disconnected clips, PAN is designed to keep track of what exists in a scene and how it changes. The model can follow natural language instructions—such as walking toward a landmark or navigating a particular environment—and produce sequences that maintain coherent motion and consistent visual details over time. This emphasis on temporal reasoning aligns with broader research across the field, where the long-term goal is to create models that can anticipate outcomes rather than simply generate surface-level imagery.
PAN is built on a Generative Latent Prediction framework, which separates the internal simulation of events from the final visual rendering. First, the model constructs a latent state representing the objects, dynamics, and context of the scene. It then decodes that state into a short video segment. By repeating this cycle, PAN preserves causal structure while extending sequences beyond the short horizons typical of earlier systems. This approach mirrors trends in model-based reinforcement learning and robotics, where latent world models are used to test actions before executing them.
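The simulate-then-render cycle described above can be illustrated with a toy sketch. This is not PAN's implementation: the `LatentState` fields, the `ACTIONS` table, and the `predict_next`, `decode`, and `rollout` functions are all hypothetical names, and hand-written arithmetic stands in for what would be learned neural networks. As a further simplification, the decoder here interpolates between consecutive latent states to produce its frames. The sketch shows only the control flow: the latent state is advanced first, frames are decoded from it, and the latent state (not the decoded frames) is carried into the next cycle.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical action vocabulary; a real system would condition on free-form text.
ACTIONS = {"forward": (0.0, 1.0), "left": (-1.0, 0.0), "right": (1.0, 0.0)}

Frame = Tuple[float, float]


@dataclass(frozen=True)
class LatentState:
    """Toy latent state: an agent position plus a step counter (assumed fields)."""
    x: float
    y: float
    step: int


def predict_next(state: LatentState, action: str) -> LatentState:
    """Simulation step: advance the latent state given an action (toy dynamics)."""
    dx, dy = ACTIONS[action]
    return LatentState(state.x + dx, state.y + dy, state.step + 1)


def decode(prev: LatentState, curr: LatentState, frames: int = 4) -> List[Frame]:
    """Rendering step: turn a latent transition into a short 'video segment'.

    Each 'frame' is just an interpolated (x, y) position, ending at `curr`.
    """
    return [
        (prev.x + (curr.x - prev.x) * i / frames,
         prev.y + (curr.y - prev.y) * i / frames)
        for i in range(1, frames + 1)
    ]


def rollout(initial: LatentState, actions: List[str]):
    """Repeat the predict-then-decode cycle once per action."""
    state = initial
    segments: List[List[Frame]] = []
    for action in actions:
        nxt = predict_next(state, action)    # simulate in latent space
        segments.append(decode(state, nxt))  # render a short segment
        state = nxt                          # carry the latent state forward
    return state, segments


if __name__ == "__main__":
    final, segments = rollout(LatentState(0.0, 0.0, 0), ["forward", "forward", "right"])
    print(final)
    print(len(segments), "segments of", len(segments[0]), "frames each")
```

Because each cycle starts from the carried-over latent state rather than from rendered frames, rendering errors in one segment do not feed into the next prediction in this toy setup, which is one plausible reading of why separating simulation from rendering helps with longer sequences.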
Early benchmarking suggests that PAN performs competitively among open-source systems in three areas: fidelity of simulated actions, long-range forecasting, and reasoning about how actions influence future states. These capabilities may support research in robotics, autonomous navigation, and decision-support systems, where anticipating downstream effects is critical. While the model remains part of the institute’s research pipeline, the work reflects an ongoing effort within the AI community to move from static content generation toward systems capable of adaptive planning.
Developed at the Institute of Foundation Models with contributors across Abu Dhabi, Paris, and Silicon Valley, PAN fits within the group’s broader goal of producing open and scientifically grounded models. The institute frames this release as an example of cross-regional collaboration meant to advance responsible AI research and enable others to build on the work.
Additional material, including a blog post, a research paper, and further technical details, is available through the project’s site and the institute’s communication channels.

