Human perception and experience of time are strongly influenced by ongoing stimulation, memory of past experiences, and required task context. When paying attention to time, time experience seems to expand; when distracted, it seems to contract. When considering time based on memory, the experience may be different than what is in the moment, exemplified by sayings like “time flies when you're having fun.” Experience of time also depends on the content of perceptual experience—rapidly changing or complex perceptual scenes seem longer in duration than less dynamic ones. The complexity of interactions among attention, memory, and perceptual stimulation is a likely reason that an overarching theory of time perception has been difficult to achieve. Here, we introduce a model of perceptual processing and episodic memory that makes use of hierarchical predictive coding, short-term plasticity, spatiotemporal attention, and episodic memory formation and recall, and apply this model to the problem of human time perception. In an experiment with approximately 13,000 human participants, we investigated the effects of memory, cognitive load, and stimulus content on duration reports of dynamic natural scenes up to about 1 minute long. Using our model to generate duration estimates, we compared human and model performance. Model-based estimates replicated key qualitative biases, including differences by cognitive load (attention), scene type (stimulation), and whether the judgment was made based on current or remembered experience (memory). Our work provides a comprehensive model of human time perception and a foundation for exploring the computational basis of episodic memory within a hierarchical predictive coding framework.