When we encounter a new word, there are often multiple objects that the word might refer to [1]. Nonetheless, because names for concrete nouns are constant, we are able to learn them across successive encounters [2, 3]. This form of “cross-situational” learning may result from either associative mechanisms that gradually accumulate evidence for each word-object association [4, 5] or rapid propose-but-verify (PbV) mechanisms, in which only one hypothesized referent is stored for each word and is subsequently either verified or rejected [6, 7]. Using model-based representational similarity analyses of fMRI data acquired during learning, we find evidence for learning mediated by a PbV mechanism. This learning may be underpinned by rapid pattern-separation processes in the hippocampus. Our findings shed light on the psychological and neural processes that support word learning, suggesting that adults rely on their episodic memory to track a limited number of word-object associations.
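To make the contrast between the two candidate mechanisms concrete, the following is a minimal toy sketch, not the computational models used in the analyses. It assumes each learning trial is a pair of one word and the list of objects present, and it simplifies PbV to a deterministic verify step with a uniformly random new proposal. The function names, the trial format, and the example word and objects are all hypothetical.

```python
import random
from collections import defaultdict


def associative_learner(trials):
    """Accumulate co-occurrence counts for every word-object pairing."""
    counts = defaultdict(lambda: defaultdict(int))
    for word, objects in trials:
        for obj in objects:
            counts[word][obj] += 1
    # The best guess for each word is its most frequently co-occurring object.
    return {word: max(objs, key=objs.get) for word, objs in counts.items()}


def propose_but_verify_learner(trials, rng):
    """Store one hypothesized referent per word; keep it if verified, else replace it."""
    hypothesis = {}
    for word, objects in trials:
        if hypothesis.get(word) not in objects:
            # No hypothesis yet, or the stored one is absent from this trial:
            # reject it and propose a new referent at random (an assumption).
            hypothesis[word] = rng.choice(objects)
    return hypothesis


# Hypothetical trials: "dax" always appears with "ball" plus a changing distractor.
trials = [
    ("dax", ["ball", "cup"]),
    ("dax", ["ball", "shoe"]),
    ("dax", ["ball", "key"]),
]

associative_guess = associative_learner(trials)
pbv_guess = propose_but_verify_learner(trials, random.Random(0))
```

In this sketch the associative learner retains graded evidence for every pairing it has seen, whereas the PbV learner carries a single guess per word forward and has no memory of rejected alternatives.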