A Perceptually Based Comparison of Image Similarity Metrics

Pawan Sinha
Richard Russell, Gettysburg College


The assessment of how well one image matches another forms a critical component both of models of human visual processing and of many image analysis systems. Two of the most commonly used norms for quantifying image similarity are L1 and L2, which are specific instances of the Minkowski metric. However, there is often no principled reason for selecting one norm over the other. One way to address this problem is to examine whether one metric captures the perceptual notion of image similarity better than the other. The answer can be used to draw inferences about the similarity criteria the human visual system uses, as well as to evaluate and design metrics for use in image-analysis applications. With this goal, we examined perceptual preferences for images retrieved on the basis of the L1 versus the L2 norm. These images were either small fragments without recognizable content, or larger patterns with recognizable content created by vector quantization. In both conditions, the participants showed a small but consistent preference for images matched with the L1 metric. These results suggest that, in the domain of natural images of the kind we have used, the L1 metric may better capture human notions of image similarity.
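
As a rough illustration of how retrieval under the two norms can diverge, the sketch below computes the Minkowski distance of order p (p = 1 gives L1, p = 2 gives L2) between flattened grayscale patches and picks the closest candidate under each norm. The function names and the four-pixel patches are hypothetical, chosen only for illustration; they are not the stimuli or the retrieval procedure used in the study.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length pixel sequences."""
    if len(a) != len(b):
        raise ValueError("patches must have the same number of pixels")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def best_match(target, candidates, p):
    """Index of the candidate closest to the target under the order-p norm."""
    return min(
        range(len(candidates)),
        key=lambda i: minkowski_distance(target, candidates[i], p),
    )


# Hypothetical 4-pixel grayscale patches (illustrative values only).
target = [10, 10, 10, 10]
candidates = [
    [10, 10, 10, 19],  # one large deviation in a single pixel
    [13, 13, 13, 13],  # small deviations spread over every pixel
]

l1_choice = best_match(target, candidates, p=1)  # L1 distances: 9 vs 12
l2_choice = best_match(target, candidates, p=2)  # L2 distances: 9 vs 6
print(l1_choice, l2_choice)
```

In this toy case L1 selects the patch with a single large deviation, while L2 selects the patch with small deviations everywhere, because squaring penalizes the one large difference more heavily. This is the kind of disagreement that lets the two norms return different matches for the same target.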