Abstract
Image sentiment influences visual perception. Emotion-eliciting stimuli such as happy faces and poisonous snakes are generally prioritized in human attention. However, little research has evaluated the interrelationships of image sentiment and visual saliency. In this paper, we present the first study to focus on the relation between emotional properties of an image and visual attention. We first create the EMOtional attention dataset (EMOd). It is a diverse set of emotion-eliciting images, and each image has (1) eye-tracking data collected from 16 subjects, and (2) intensive image context labels including object contours, object sentiments, object semantic categories, and high-level perceptual attributes such as image aesthetics and elicited emotions. We perform extensive analyses on EMOd to identify how image sentiment relates to human attention. We discover an emotion prioritization effect: for our images, emotion-eliciting content attracts human attention strongly, but this advantage diminishes dramatically after the initial fixation. Aiming to model the human emotion prioritization computationally, we design a deep neural network for saliency prediction, which includes a novel subnetwork that learns the spatial and semantic context of the image scene. The proposed network outperforms the state-of-the-art on three benchmark datasets by effectively capturing the relative importance of human attention within an image.
Resources
Papers:
S. Fan, Z. Shen, M. Jiang, B. Koenig, J. Xu, M. Kankanhalli, Q. Zhao, "Emotional Attention: A Study of Image Sentiment and Visual Attention", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 (Spotlight oral, acceptance rate: 6.6%). [pdf] [supplementary] [code] [dataset]
S. Fan, M. Jiang, J. Xu, B. Koenig, Y. Cheng, M. Kankanhalli, Q. Zhao, "A Correlational Study Between Human Attention and High-level Image Perception", Journal of Vision, 2017. [pdf]
Data:
Image Stimuli: 1019 Images (82.3 MB)
Fixation Maps (both continuous and binary): 1019 Images (13.7 MB)
Raw Eye-tracking Data: Matlab MAT (2.3 MB)
Labelled Object Contours and Attributes: Matlab MAT (776 KB)
Image-level Annotation: Matlab MAT (64 KB)
Code:
CNN model for saliency prediction.
Jupyter Notebook Code: Download
Model: Download
Code for metrics and evaluation (a Python implementation of the MIT Saliency Benchmark metrics and evaluation code): Download
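For readers who want a feel for what such an evaluation computes before downloading the code, the following is a minimal sketch of two metrics commonly reported on the MIT Saliency Benchmark: Normalized Scanpath Saliency (NSS), which uses a binary fixation map, and the linear Correlation Coefficient (CC), which uses a continuous fixation map. It is not the released evaluation code. The function names `nss` and `cc` and the toy arrays are assumptions made for illustration, and the maps are assumed to be already loaded as 2-D arrays of equal size.

```python
import numpy as np


def nss(saliency, fixations):
    """Normalized Scanpath Saliency (illustrative sketch).

    saliency:  2-D predicted saliency map (any real-valued scale).
    fixations: 2-D binary fixation map of the same shape (nonzero = fixated).
    Returns the mean z-scored saliency at fixated pixels, or NaN if undefined.
    """
    saliency = np.asarray(saliency, dtype=np.float64)
    fixations = np.asarray(fixations).astype(bool)
    if saliency.shape != fixations.shape:
        raise ValueError("saliency and fixation maps must have the same shape")
    std = saliency.std()
    # Undefined when there are no fixations or the prediction is constant.
    if not fixations.any() or std == 0:
        return float("nan")
    z = (saliency - saliency.mean()) / std
    return float(z[fixations].mean())


def cc(saliency, fixation_density):
    """Pearson correlation between a predicted map and a continuous fixation map."""
    a = np.asarray(saliency, dtype=np.float64).ravel()
    b = np.asarray(fixation_density, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("both maps must have the same number of pixels")
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # Undefined when either map is constant.
    return float((a * b).sum() / denom) if denom > 0 else float("nan")


if __name__ == "__main__":
    # Toy 3x3 example (made-up values, not EMOd data).
    pred = np.array([[0.0, 0.1, 0.0],
                     [0.1, 1.0, 0.1],
                     [0.0, 0.1, 0.0]])
    fix = np.zeros((3, 3), dtype=int)
    fix[1, 1] = 1  # a single fixation at the center pixel
    print("NSS:", nss(pred, fix))
    print("CC :", cc(pred, pred))
```

Higher values of both scores indicate better agreement with human fixations. In practice the predicted map would first be resized to the resolution of the fixation map, a step this sketch leaves out.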
Sample Images from EMOtional Attention Dataset (EMOd)
Saliency prediction by the proposed CASNet