Computer Sciences Dept.

A Bayesian model for image sense ambiguity in pictorial communication systems

Jake Rosin, Andrew B. Goldberg, Xiaojin Zhu, Charles Dyer

Pictorial communication systems use synthesized pictures, rather than text, to communicate with users. Because such systems depend on images to convey meaning, it is critical to understand how a human user perceives an image's meaning (its sense). This paper offers an empirical and theoretical study of how humans perceive image senses. We conduct a user study with 113 users to elicit the senses they perceive in 400 image sets, and find that image sense ambiguity is widespread. We examine how the number of images shown relates to sense ambiguity and discover several significant patterns. We then propose a Bayesian model to explain human image perception behavior, based on a novel random walk process on a WordNet-like sense hierarchy. The model makes qualitative and quantitative predictions that largely agree with our observations of human perception. It can explain the “basic level” phenomenon known in psychology, and it suggests a method for image sense disambiguation in pictorial communication systems.
