Locale-Conditioned World Models for Multilingual Physical Commonsense Reasoning
Abstract
Physical commonsense reasoning links language to expectations about objects, actions, and outcomes in everyday environments. As language technologies expand into multilingual deployment, the same prompt can implicitly reference different tools, materials, cuisines, climates, or social routines, shifting what is physically plausible. This paper argues that scalable robustness requires treating locale not as a nuisance domain shift but as a structured latent variable that modulates affordances, typicality, and measurement noise. We introduce a locale-conditioned world-modeling framework that couples (i) a semantic parser from text to a compact scene-and-action representation, (ii) an affordance-centric dynamics model that predicts outcomes under interventions, and (iii) a hierarchical prior over locale factors that supports partial pooling across related language communities while preserving culturally specific invariants. The technical contribution is a unified probabilistic formulation in which locale influences both generative physics surrogates and evaluation difficulty, enabling principled calibration, uncertainty decomposition, and counterfactual consistency checks. We derive a variational learning objective that separates invariant physical regularities from locale-dependent priors over objects and practices, and we propose reliability diagnostics based on measurement invariance and differential item functioning to detect when apparent accuracy improvements reflect exploitation of locale artifacts rather than stronger physical reasoning. The framework is intended to be model-agnostic, applicable to LLM-based systems augmented with structured state trackers and retrieval. We discuss how the approach supports robust cross-locale generalization, interpretable failure analysis, and safer deployment in low-resource languages.
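The abstract names differential item functioning (DIF) as a diagnostic but does not say which procedure is used. As a minimal sketch, assuming the classical Mantel-Haenszel procedure (an assumption, not something stated above), the following computes a common odds ratio for a single benchmark item across ability strata, comparing a reference locale with a focal locale. The function names, the record layout, and the demo counts are all hypothetical and purely illustrative.

```python
import math
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Common odds ratio for one item across ability strata.

    records: iterable of (stratum, group, correct), where group is
    "ref" or "focal" (hypothetical labels for two locales) and
    correct is a bool. Strata would typically be bins of an overall
    score, so that systems of similar ability are compared.
    """
    # Per-stratum 2x2 table: [n_correct, n_incorrect] for each group.
    tables = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for stratum, group, correct in records:
        tables[stratum][group][0 if correct else 1] += 1

    num = 0.0
    den = 0.0
    for table in tables.values():
        a, b = table["ref"]    # reference: correct, incorrect
        c, d = table["focal"]  # focal: correct, incorrect
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    if den == 0:
        raise ValueError("odds ratio undefined: denominator is zero")
    return num / den


def mh_delta(alpha):
    """ETS delta-scale transform; negative values favor the reference group."""
    return -2.35 * math.log(alpha)


def make_records(stratum, group, n_correct, n_incorrect):
    """Expand counts into individual (stratum, group, correct) records."""
    return ([(stratum, group, True)] * n_correct
            + [(stratum, group, False)] * n_incorrect)


# Illustrative counts only, not data from the paper.
demo_records = (
    make_records("low", "ref", 10, 10) + make_records("low", "focal", 5, 15)
    + make_records("high", "ref", 15, 5) + make_records("high", "focal", 10, 10)
)
alpha = mantel_haenszel_odds_ratio(demo_records)
delta = mh_delta(alpha)
```

An odds ratio near 1 (delta near 0) suggests the item behaves similarly across the two locales once ability is matched. A value far from 1 flags the item for inspection as a possible locale artifact, which is the kind of check the abstract describes.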
License
Copyright (c) 2026 authors

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.