research-article

Spatiotemporal Saliency Detection for Video Sequences Based on Random Walk With Restart

Published: 01 August 2015

Abstract

A novel saliency detection algorithm for video sequences based on the random walk with restart (RWR) is proposed in this paper. We adopt RWR to detect spatially and temporally salient regions. More specifically, we first find a temporal saliency distribution using the features of motion distinctiveness, temporal consistency, and abrupt change. Among them, the motion distinctiveness is derived by comparing the motion profiles of image patches. Then, we employ the temporal saliency distribution as a restarting distribution of the random walker. In addition, we design the transition probability matrix for the walker using the spatial features of intensity, color, and compactness. Finally, we estimate the spatiotemporal saliency distribution by finding the steady-state distribution of the walker. The proposed algorithm detects foreground salient objects faithfully, while suppressing cluttered backgrounds effectively, by incorporating the spatial transition matrix and the temporal restarting distribution systematically. Experimental results on various video sequences demonstrate that the proposed algorithm outperforms conventional saliency detection algorithms qualitatively and quantitatively.
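The abstract's core mechanism is the random walk with restart: the temporal saliency map serves as the restarting distribution, the spatial features define the transition probability matrix, and the final spatiotemporal saliency is the walker's steady-state distribution. A minimal sketch of that steady-state computation is given below. This is an illustration of the general RWR iteration, not the authors' implementation; the matrix `P`, the vector `r`, and the restart probability value are assumed placeholders, and the paper's actual feature extraction (motion profiles, intensity, color, compactness) is not modeled here.

```python
import numpy as np

def rwr_saliency(P, r, restart_prob=0.15, tol=1e-8, max_iter=1000):
    """Steady-state distribution of a random walk with restart.

    P : (n, n) column-stochastic transition matrix (e.g. built from
        spatial affinities between patches).
    r : (n,) restarting distribution (e.g. a temporal saliency map),
        nonnegative and summing to 1.
    """
    s = r.copy()
    for _ in range(max_iter):
        # With probability (1 - restart_prob) the walker transitions
        # via P; with probability restart_prob it jumps back to r.
        s_next = (1 - restart_prob) * P @ s + restart_prob * r
        if np.abs(s_next - s).sum() < tol:
            return s_next
        s = s_next
    return s

if __name__ == "__main__":
    # Toy 3-node graph: uniform transitions, node 0 temporally salient.
    P = np.array([[0.0, 0.5, 0.5],
                  [0.5, 0.0, 0.5],
                  [0.5, 0.5, 0.0]])
    r = np.array([0.6, 0.2, 0.2])
    print(rwr_saliency(P, r))
```

The fixed point satisfies the closed form s = c (I - (1 - c) P)^{-1} r with restart probability c, so the iteration can be cross-checked against a direct linear solve; the iterative form is preferred when n is large and P is sparse.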




Published In

IEEE Transactions on Image Processing, Volume 24, Issue 8, August 2015, 315 pages.

Publisher: IEEE Press


Author Tags

1. motion profile
2. saliency detection
3. video saliency
4. random walk with restart
5. spatiotemporal feature
