Abstract
Deep learning has recently achieved great success in visual tracking, particularly in single-object tracking. This paper provides a comprehensive review of state-of-the-art single-object tracking algorithms based on deep learning. First, we introduce the fundamentals of deep visual tracking, including basic concepts, existing algorithms, and previous reviews. Second, we briefly survey existing deep learning methods, categorizing them as data-invariant or data-adaptive according to whether they can dynamically change their model parameters or architectures. We then summarize the general components of deep trackers and, on this basis, systematically analyze the novelties of several recently proposed deep trackers. Next, we discuss popular datasets such as the Object Tracking Benchmark (OTB) and the Visual Object Tracking (VOT) challenge, along with the performance of several deep trackers on them. Finally, based on our observations and experimental results, we discuss three aspects of deep trackers: the relationships between their general components, the exploration of more effective tracking frameworks, and the interpretability of their motion estimation components.
N. Dalal, B. Triggs. Histograms of oriented gradients for human detection. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, San Diego, USA, pp. 886–893, 2005. DOI: https://doi.org/10.1109/CVPR.2005.177.
J. F. Henriques, R. Caseiro, P. Martins, J. Batista. Exploiting the circulant structure of tracking-by-detection with kernels. In Proceedings of the 12th European Conference on Computer Vision, Springer, Florence, Italy, pp. 702–715, 2012. DOI: https://doi.org/10.1007/978-3-642-33765-9_50.
J. F. Henriques, R. Caseiro, P. Martins, J. Batista. High-speed tracking with kernelized correlation filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 3, pp. 583–596, 2015. DOI: https://doi.org/10.1109/TPAMI.2014.2345390.
L. Bertinetto, J. Valmadre, S. Golodetz, O. Miksik, P. H. S. Torr. Staple: Complementary learners for real-time tracking. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 1401–1409, 2016. DOI: https://doi.org/10.1109/CVPR.2016.156.
C. Ma, J. B. Huang, X. K. Yang, M. H. Yang. Hierarchical convolutional features for visual tracking. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Santiago, Chile, pp. 3074–3082, 2015. DOI: https://doi.org/10.1109/ICCV.2015.352.
Y. K. Qi, S. P. Zhang, L. Qin, H. X. Yao, Q. M. Huang, J. Lim, M. H. Yang. Hedged deep tracking. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 4303–4311, 2016. DOI: https://doi.org/10.1109/CVPR.2016.466.
L. Bertinetto, J. Valmadre, J. F. Henriques, A. Vedaldi, P. H. S. Torr. Fully-convolutional Siamese networks for object tracking. In Proceedings of European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 850–865, 2016. DOI: https://doi.org/10.1007/978-3-319-48881-3_56.
B. Li, W. Wu, Q. Wang, F. Y. Zhang, J. L. Xing, J. J. Yan. SiamRPN++: Evolution of Siamese visual tracking with very deep networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 4282–4291, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00441.
A. Lukezic, J. Matas, M. Kristan. D3S-A discriminative single shot segmentation tracker. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 7133–7142, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00716.
M. Y. Wu, H. B. Ling, N. Bi, S. H. Gao, Q. Hu, H. Sheng, J. Y. Yu. Visual tracking with multiview trajectory prediction. IEEE Transactions on Image Processing, vol. 29, pp. 8355–8367, 2020. DOI: https://doi.org/10.1109/TIP.2020.3014952.
Z. D. Chen, B. N. Zhong, G. R. Li, S. P. Zhang, R. R. Ji. Siamese box adaptive network for visual tracking. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 6668–6677, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00670.
Z. P. Zhang, H. W. Peng. Deeper and wider Siamese networks for real-time visual tracking. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 4591–4600, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00472.
L. Y. Zheng, Y. Y. Chen, M. Tang, J. Q. Wang, H. Q. Lu. Siamese deformable cross-correlation network for realtime visual tracking. Neurocomputing, vol. 401, pp. 36–47, 2020. DOI: https://doi.org/10.1016/j.neucom.2020.02.080.
Y. C. Yu, Y. L. Xiong, W. L. Huang, M. R. Scott. Deformable Siamese attention networks for visual object tracking. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 6728–6737, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00676.
Z. Teng, J. L. Xing, Q. Wang, B. P. Zhang, J. P. Fan. Deep spatial and temporal network for robust visual object tracking. IEEE Transactions on Image Processing, vol. 29, pp. 1762–1775, 2019. DOI: https://doi.org/10.1109/TIP.2019.2942502.
M. H. Abdelpakey, M. S. Shehata. DP-Siam: Dynamic policy Siamese network for robust object tracking. IEEE Transactions on Image Processing, vol. 29, pp. 1479–1492, 2019. DOI: https://doi.org/10.1109/TIP.2019.2942506.
P. X. Li, B. Y. Chen, W. L. Ouyang, D. Wang, X. Y. Yang, H. C. Lu. GradNet: Gradient-guided network for visual object tracking. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 6162–6171, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00626.
D. Y. Guo, J. Wang, Y. Cui, Z. H. Wang, S. Y. Chen. SiamCAR: Siamese fully convolutional classification and regression for visual tracking. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 6269–6277, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00630.
Q. Wang, L. Zhang, L. Bertinetto, W. M. Hu, P. H. S. Torr. Fast online object tracking and segmentation: A unifying approach. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 1328–1338, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00142.
H. Fan, H. B. Ling. Siamese cascaded region proposal networks for real-time visual tracking. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 7952–7961, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00814.
G. Bhat, J. Johnander, M. Danelljan, F. S. Khan, M. Felsberg. Unveiling the power of deep tracking. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 483–498, 2018. DOI: https://doi.org/10.1007/978-3-030-01216-8_30.
S. M. Ge, Z. Luo, C. H. Zhang, Y. Y. Hua, D. C. Tao. Distilling channels for efficient deep tracking. IEEE Transactions on Image Processing, vol. 29, pp. 2610–2621, 2019. DOI: https://doi.org/10.1109/TIP.2019.2950508.
Y. D. Xu, Z. Y. Wang, Z. X. Li, Y. Ye, G. Yu. SiamFC++: Towards robust and accurate visual tracking with target estimation guidelines. In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 7, pp. 12549–12556, 2020. DOI: https://doi.org/10.1609/aaai.v34i07.6944.
Y. Yang, G. Li, Y. Qi, Q. Huang. Release the power of online-training for robust visual tracking. In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 7, pp. 12645–12652, 2020. DOI: https://doi.org/10.1609/aaai.v34i07.6956.
J. H. Zhou, P. Wang, H. Y. Sun. Discriminative and robust online learning for Siamese visual tracking. In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 7, pp. 13017–13024, 2020.
J. Choi, H. J. Chang, T. Fischer, S. Yun, K. Lee, J. Jeong, Y. Demiris, J. Y. Choi. Context-aware deep feature compression for high-speed visual tracking. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 479–488, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00057.
B. Li, J. J. Yan, W. Wu, Z. Zhu, X. L. Hu. High performance visual tracking with Siamese region proposal network. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8971–8980, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00935.
Y. B. Song, C. Ma, X. H. Wu, L. J. Gong, L. C. Bao, W. M. Zuo, C. H. Shen, R. W. H. Lau, M. H. Yang. Vital: Visual tracking via adversarial learning. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8990–8999, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00937.
Z. H. Lai, E. Lu, W. D. Xie. MAST: A memory-augmented self-supervised tracker. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 6479–6488, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00651.
E. Park, A. C. Berg. Meta-tracker: Fast and robust online adaptation for visual object trackers. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 569–585, 2018. DOI: https://doi.org/10.1007/978-3-030-01219-9_35.
X. P. Dong, J. B. Shen, L. Shao, F. Porikli. CLNet: A compact latent network for fast adjusting Siamese trackers. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 378–395, 2020. DOI: https://doi.org/10.1007/978-3-030-58565-5_23.
Z. P. Zhang, H. W. Peng, J. L. Fu, B. Li, W. M. Hu. Ocean: Object-aware anchor-free tracking. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 771–787, 2020. DOI: https://doi.org/10.1007/978-3-030-58589-1_46.
Y. Liu, R. T. Li, Y. Cheng, R. T. Tan, X. B. Sui. Object tracking using spatio-temporal networks for future prediction location. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 1–17, 2020. DOI: https://doi.org/10.1007/978-3-030-58542-6_1.
B. Y. Liao, C. Y. Wang, Y. Y. Wang, Y. N. Wang, J. Yin. PG-Net: Pixel to global matching network for visual tracking. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 429–444, 2020. DOI: https://doi.org/10.1007/978-3-030-58542-6_26.
L. H. Huang, X. Zhao, K. Q. Huang. Bridging the gap between detection and tracking: A unified approach. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 3999–4009, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00410.
W. X. Liu, Y. B. Song, D. S. Chen, S. F. He, Y. L. Yu, T. Yan, G. P. Hancke, R. W. H. Lau. Deformable object tracking with gated fusion. IEEE Transactions on Image Processing, vol. 28, no. 8, pp. 3766–3777, 2019. DOI: https://doi.org/10.1109/TIP.2019.2902784.
Z. Y. Liang, J. B. Shen. Local semantic Siamese networks for fast tracking. IEEE Transactions on Image Processing, vol. 29, pp. 3351–3364, 2019. DOI: https://doi.org/10.1109/TIP.2019.2959256.
A. F. He, C. Luo, X. M. Tian, W. J. Zeng. A twofold Siamese network for real-time object tracking. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 4834–4843, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00508.
J. Y. Gao, T. Z. Zhang, C. S. Xu. Graph convolutional tracking. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 4649–4659, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00478.
Y. H. Zhang, L. J. Wang, J. Q. Qi, D. Wang, M. Y. Feng, H. C. Lu. Structured Siamese network for real-time visual tracking. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 351–366, 2018. DOI: https://doi.org/10.1007/978-3-030-01240-3_22.
K. P. Li, Y. Kong, Y. Fu. Visual object tracking via multi-stream deep similarity learning networks. IEEE Transactions on Image Processing, vol. 29, pp. 3311–3320, 2019. DOI: https://doi.org/10.1109/TIP.2019.2959249.
G. T. Wang, C. Luo, X. Y. Sun, Z. W. Xiong, W. J. Zeng. Tracking by instance detection: A meta-learning approach. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 6288–6297, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00632.
T. Y. Yang, P. F. Xu, R. B. Hu, H. Chai, A. B. Chan. ROAM: Recurrently optimizing tracking model. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 6718–6727, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00675.
X. K. Lu, C. Ma, B. B. Ni, X. K. Yang, I. Reid, M. H. Yang. Deep regression tracking with shrinkage loss. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 353–369, 2018. DOI: https://doi.org/10.1007/978-3-030-01264-9_22.
H. Z. Zhou, B. Ummenhofer, T. Brox. DeepTAM: Deep tracking and mapping. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 822–838, 2018. DOI: https://doi.org/10.1007/978-3-030-01270-0_50.
X. Y. Zhou, V. Koltun, P. Krahenbuhl. Tracking objects as points. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 474–490, 2020. DOI: https://doi.org/10.1007/978-3-030-58548-8_28.
Y. Sui, Z. M. Zhang, G. H. Wang, Y. F. Tang, L. Zhang. Exploiting the anisotropy of correlation filter learning for visual tracking. International Journal of Computer Vision, vol. 127, no. 8, pp. 1084–1105, 2019. DOI: https://doi.org/10.1007/s11263-019-01156-6.
H. Z. Zhou, B. Ummenhofer, T. Brox. DeepTAM: Deep tracking and mapping with convolutional neural networks. International Journal of Computer Vision, vol. 128, no. 3, pp. 756–769, 2020. DOI: https://doi.org/10.1007/s11263-019-01221-0.
L. H. Huang, X. Zhao, K. Q. Huang. GOT-10k: A large high-diversity benchmark for generic object tracking in the wild. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019. DOI: https://doi.org/10.1109/TPAMI.2019.2957464.
H. Fan, L. T. Lin, F. Yang, P. Chu, G. Deng, S. J. Yu, H. X. Bai, Y. Xu, C. Y. Liao, H. B. Ling. LaSOT: A high-quality benchmark for large-scale single object tracking. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 5374–5383, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00552.
M. Kristan, R. Pflugfelder, A. Leonardis, et al. The visual object tracking VOT2014 challenge results. In Proceedings of the European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 191–217, 2014. DOI: https://doi.org/10.1007/978-3-319-16181-5_14.
Y. Wu, J. Lim, M. H. Yang. Object tracking benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1834–1848, 2015. DOI: https://doi.org/10.1109/TPAMI.2014.2388226.
M. Kristan, A. Leonardis, J. Matas, et al. The visual object tracking VOT2017 challenge results. In Proceedings of IEEE International Conference on Computer Vision Workshops, IEEE, Venice, Italy, pp. 1949–1972, 2017. DOI: https://doi.org/10.1109/ICCVW.2017.230.
M. Kristan, J. Matas, A. Leonardis, et al. The seventh visual object tracking VOT2019 challenge results. In Proceedings of IEEE/CVF International Conference on Computer Vision Workshop, IEEE, Seoul, Korea, pp. 2206–2241, 2019. DOI: https://doi.org/10.1109/ICCVW.2019.00276.
A. W. M. Smeulders, D. M. Chu, R. Cucchiara, S. Calderara, A. Dehghan, M. Shah. Visual tracking: An experimental survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 7, pp. 1442–1468, 2014. DOI: https://doi.org/10.1109/TPAMI.2013.230.
A. N. Li, M. Lin, Y. Wu, M. H. Yang, S. C. Yan. NUS-PRO: A new visual tracking challenge. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 335–349, 2015. DOI: https://doi.org/10.1109/TPAMI.2015.2417577.
P. P. Liang, E. Blasch, H. B. Ling. Encoding color information for visual tracking: Algorithms and benchmark. IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 5630–5644, 2015. DOI: https://doi.org/10.1109/TIP.2015.2482905.
H. K. Galoogahi, A. Fagg, C. Huang, D. Ramanan, S. Lucey. Need for speed: A benchmark for higher frame rate object tracking. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1125–1134, 2017. DOI: https://doi.org/10.1109/ICCV.2017.128.
M. Mueller, N. Smith, B. Ghanem. A benchmark and simulator for UAV tracking. In Proceedings of 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 445–461, 2016. DOI: https://doi.org/10.1007/978-3-319-46448-0_27.
J. Valmadre, L. Bertinetto, J. F. Henriques, R. Tao, A. Vedaldi, A. W. M. Smeulders, P. H. S. Torr, E. Gavves. Long-term tracking in the wild: A benchmark. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 670–685, 2018. DOI: https://doi.org/10.1007/978-3-030-01219-9_41.
M. Muller, A. Bibi, S. Giancola, S. Alsubaihi, B. Ghanem. TrackingNet: A large-scale dataset and benchmark for object tracking in the wild. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 300–317, 2018. DOI: https://doi.org/10.1007/978-3-030-01246-5_19.
L. Leal-Taixe, A. Milan, I. Reid, S. Roth, K. Schindler. MOTChallenge 2015: Towards a benchmark for multi-target tracking. [Online], Available: https://arxiv.org/abs/1504.01942, 2015.
A. Milan, L. Leal-Taixe, I. Reid, S. Roth, K. Schindler. MOT16: A benchmark for multi-object tracking. [Online], Available: https://arxiv.org/abs/1603.00831, 2016.
A. Geiger, P. Lenz, R. Urtasun. Are we ready for autonomous driving? The KITTI vision benchmark suite. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Providence, USA, pp. 3354–3361, 2012. DOI: https://doi.org/10.1109/CVPR.2012.6248074.
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. H. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, L. Fei-Fei. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015. DOI: https://doi.org/10.1007/s11263-015-0816-y.
E. Real, J. Shlens, S. Mazzocchi, X. Pan, V. Vanhoucke. YouTube-boundingBoxes: A large high-precision human-annotated data set for object detection in video. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5296–5305, 2017. DOI: https://doi.org/10.1109/CVPR.2017.789.
M. Kristan, J. Matas, A. Leonardis, M. Felsberg, L. Cehovin, G. Fernandez, T. Vojir, G. Hager, G. Nebehay, R. Pflugfelder, A. Gupta, A. Bibi, A. Lukezic, A. Garcia-Martin, A. Saffari, A. Petrosino, A. S. Montero. The visual object tracking VOT2015 challenge results. In Proceedings of IEEE International Conference on Computer Vision Workshop, IEEE, Santiago, Chile, pp. 1–23, 2015. DOI: https://doi.org/10.1109/ICCVW.2015.79.
S. Hadfield, R. Bowden, K. Lebeda. The visual object tracking VOT2016 challenge results. Lecture Notes in Computer Science, vol. 9914, pp. 777–823, 2016.
M. Kristan, A. Leonardis, J. Matas, et al. The sixth visual object tracking VOT2018 challenge results. In Proceedings of the European Conference on Computer Vision, Springer, Munich, Germany, pp. 3–53, 2018. DOI: https://doi.org/10.1007/978-3-030-11009-3_1.
G. A. Miller. WordNet: An Electronic Lexical Database. Cambridge, USA: MIT Press, 1998.
X. Li, C. Ma, B. Y. Wu, Z. Y. He, M. H. Yang. Target-aware deep tracking. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 1369–1378, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00146.
T. Y. Yang, A. B. Chan. Learning dynamic memory networks for object tracking. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 152–167, 2018. DOI: https://doi.org/10.1007/978-3-030-01240-3_10.
X. P. Dong, J. B. Shen. Triplet loss in Siamese network for object tracking. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 459–474, 2018. DOI: https://doi.org/10.1007/978-3-030-01261-8_28.
Y. Sui, Y. F. Tang, L. Zhang, G. H. Wang. Visual tracking via subspace learning: A discriminative approach. International Journal of Computer Vision, vol. 126, no. 5, pp. 515–536, 2018. DOI: https://doi.org/10.1007/s11263-017-1049-z.
C. Ma, J. B. Huang, X. K. Yang, M. H. Yang. Robust visual tracking via hierarchical convolutional features. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 11, pp. 2709–2723, 2019. DOI: https://doi.org/10.1109/TPAMI.2018.2865311.
N. Wang, Y. B. Song, C. Ma, W. G. Zhou, W. Liu, H. Q. Li. Unsupervised deep tracking. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 1308–1317, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00140.
P. Voigtlaender, J. Luiten, P. H. S. Torr, B. Leibe. Siam R-CNN: Visual tracking by re-detection. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 6578–6588, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00661.
Z. Zhu, Q. Wang, B. Li, W. Wu, J. J. Yan, W. M. Hu. Distractor-aware Siamese networks for visual object tracking. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 101–117, 2018. DOI: https://doi.org/10.1007/978-3-030-01240-3_7.
L. L. Ren, X. Yuan, J. W. Lu, M. Yang, J. Zhou. Deep reinforcement learning with iterative shift for visual tracking. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 684–700, 2018. DOI: https://doi.org/10.1007/978-3-030-01240-3_42.
H. Fan, H. B. Ling. Parallel tracking and verifying. IEEE Transactions on Image Processing, vol. 28, no. 8, pp. 4130–4144, 2019. DOI: https://doi.org/10.1109/TIP.2019.2904789.
G. Bhat, M. Danelljan, L. Van Gool, R. Timofte. Learning discriminative model prediction for tracking. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 6182–6191, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00628.
K. N. Dai, Y. H. Zhang, D. Wang, J. H. Li, H. C. Lu, X. Y. Yang. High-performance long-term tracking with meta-updater. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 6298–6307, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00633.
L. H. Huang, X. Zhao, K. Q. Huang. GlobalTrack: A simple and strong baseline for long-term tracking. [Online], Available: https://arxiv.org/abs/1912.08531, 2019.
M. Danelljan, G. Bhat, F. S. Khan, M. Felsberg. ATOM: Accurate tracking by overlap maximization. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 4660–4669, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00479.
J. F. Han, P. Luo, X. G. Wang. Deep self-learning from noisy labels. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 5138–5147, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00524.
F. Q. Liu, Z. Y. Wang. Automatic “ground truth” annotation and industrial workpiece dataset generation for deep learning. International Journal of Automation and Computing, vol. 17, no. 4, pp. 539–550, 2020. DOI: https://doi.org/10.1007/s11633-020-1221-8.
Q. Fu, X. Y. Chen, W. He. A survey on 3D visual tracking of multicopters. International Journal of Automation and Computing, vol. 16, no. 6, pp. 707–719, 2019. DOI: https://doi.org/10.1007/s11633-019-1199-2.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 61922064 and U2033210), the Zhejiang Provincial Natural Science Foundation (Nos. LR17F030001 and LQ19F020005), and the Project of Science and Technology Plans of Wenzhou City (Nos. C20170008 and ZG2017016).
Author information
Additional information
Recommended by Associate Editor Nazim Mir-Nasiri
Xiao-Qin Zhang received the B.Sc. degree in electronic information science and technology from Central South University, China in 2005, and the Ph.D. degree in pattern recognition and intelligent system from the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China in 2010. He is currently a professor at Wenzhou University, China. He has published more than 80 papers in international and national journals and international conferences, including IEEE T-PAMI, IJCV, IEEE T-IP, IEEE T-IE, IEEE T-C, ICCV, CVPR, NIPS, IJCAI, and AAAI.
His research interests include pattern recognition, computer vision and machine learning.
Run-Hua Jiang received the B.Sc. degree in network engineering from Department of Information Science, Tianjin University of Finance and Economics, China in 2017. He is currently a graduate student in computer software and theory at College of Computer Science and Artificial Intelligence, Wenzhou University, China.
His research interests include several computer vision tasks, such as image/video restoration, crowd counting, visual understanding, and video question answering.
Chen-Xiang Fan received the B.Sc. degree in information and computing science from Department of Information and Computing Science, Ludong University, China in 2020. He is currently a graduate student majoring in computer software and theory at College of Computer Science and Artificial Intelligence, Wenzhou University, China.
His research interests include machine learning, recommendation system and object tracking.
Tian-Yu Tong is currently an undergraduate student in data science and big data technology at College of Computer Science and Artificial Intelligence, Wenzhou University, China.
His research interests include big data technology, pattern recognition and machine learning.
Tao Wang received the B.Sc. degree in information and computing science from Hainan Normal University, China in 2018. He is currently a graduate student at College of Computer Science and Artificial Intelligence, Wenzhou University, China.
His research interests include several topics in computer vision, such as image/video quality restoration, adversarial learning, visual tracking, image-to-image translation, and reinforcement learning.
Peng-Cheng Huang received the B.Sc. degree in electrical engineering and automation from Department of Modern Science and Technology, China Metrology University, China in 2018. He is currently a graduate student in computer software and theory at College of Computer Science and Artificial Intelligence, Wenzhou University, China.
His research interests include image and video processing, pattern recognition and machine learning.
Rights and permissions
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhang, XQ., Jiang, RH., Fan, CX. et al. Advances in Deep Learning Methods for Visual Tracking: Literature Review and Fundamentals. Int. J. Autom. Comput. 18, 311–333 (2021). https://doi.org/10.1007/s11633-020-1274-8