research-article

PP-LinkNet: Improving Semantic Segmentation of High Resolution Satellite Imagery with Multi-stage Training

Authors:

Jagannadan Varadarajan,

Hannes Kruppa

SUMAC'20: Proceedings of the 2nd Workshop on Structuring and Understanding of Multimedia heritAge Contents

Pages 57 - 64

https://doi.org/10.1145/3423323.3423407

Published: 12 October 2020

Abstract

Road network and building footprint extraction is essential for many applications, such as updating maps, traffic regulation, city planning, ride-hailing, and disaster response. Mapping road networks is currently both expensive and labor-intensive. Recently, improvements in image segmentation through the application of deep neural networks have shown promising results in extracting road segments from large-scale, high-resolution satellite imagery. However, significant challenges remain due to the lack of sufficient labeled training data needed to build models for industry-grade applications. In this paper, we propose a two-stage transfer learning technique to improve the robustness of semantic segmentation for satellite images; it leverages noisy pseudo ground truth masks obtained automatically (without human labor) from crowd-sourced OpenStreetMap (OSM) data. We further propose Pyramid Pooling-LinkNet (PP-LinkNet), an improved deep neural network for segmentation that uses focal loss, a poly learning rate, and a context module. We demonstrate the strengths of our approach through evaluations on three popular datasets over two tasks, namely road extraction and building footprint detection. Specifically, we obtain 78.19% meanIoU on the SpaceNet building footprint dataset, and 67.03% and 77.11% on the road topology metric on the SpaceNet and DeepGlobe road extraction datasets, respectively.
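The abstract names focal loss and a poly learning rate as training ingredients but gives no formulas. The sketch below illustrates the commonly used definitions of both in plain Python. It is not the authors' implementation: the function names `binary_focal_loss` and `poly_lr` are hypothetical, and the defaults `gamma=2.0`, `alpha=0.25`, and `power=0.9` are values often seen in the literature, assumed here for illustration only.

```python
import math


def poly_lr(base_lr, step, max_steps, power=0.9):
    """Polynomial ("poly") decay: base_lr * (1 - step / max_steps) ** power.

    power=0.9 is an assumed default, not a value taken from the paper.
    """
    if not 0 <= step <= max_steps:
        raise ValueError("step must lie in [0, max_steps]")
    return base_lr * (1.0 - step / max_steps) ** power


def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over per-pixel foreground probabilities.

    probs   : iterable of predicted foreground probabilities in [0, 1]
    targets : iterable of 0/1 labels (e.g. road vs. background)
    gamma and alpha are assumed defaults, not values taken from the paper.
    """
    probs = list(probs)
    targets = list(targets)
    if len(probs) != len(targets) or not probs:
        raise ValueError("probs and targets must be non-empty and equal length")
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)          # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p           # probability of the true class
        a_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
        # (1 - p_t) ** gamma down-weights pixels that are already easy
        total += -a_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    # Learning rate at the start, middle, and end of a 100-step schedule.
    for step in (0, 50, 100):
        print(step, poly_lr(0.01, step, 100))
    # A confidently correct pixel contributes far less than an uncertain one.
    print(binary_focal_loss([0.95], [1]), binary_focal_loss([0.55], [1]))
```

With `gamma=0` and `alpha=0.5` the loss reduces to half the ordinary binary cross-entropy, which is a convenient sanity check when swapping it into an existing training loop.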

Index Terms

PP-LinkNet: Improving Semantic Segmentation of High Resolution Satellite Imagery with Multi-stage Training
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Machine learning
    1. Learning paradigms
      1. Multi-task learning
        1. Transfer learning
2. Human-centered computing
  1. Visualization
    1. Visualization application domains
      1. Geographic visualization
      2. Information visualization

Published In

SUMAC'20: Proceedings of the 2nd Workshop on Structuring and Understanding of Multimedia heritAge Contents

October 2020

70 pages

ISBN: 9781450381550

DOI: 10.1145/3423323

General Chairs:
Valerie Gouet-Brunet (Univ. Gustave Eiffel, IGN-ENSG/LASTIG, France)
Margarita Khokhlova (Univ. Gustave Eiffel, IGN-ENSG/LASTIG, France; Centrale Lyon/LIRIS, France)
Ronak Kosti (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany)
Liming Chen (Centrale Lyon/LIRIS, France)
Xu-Cheng Yin (University of Science and Technology Beijing, China)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM '20

Sponsor:

SIGMM

MM '20: The 28th ACM International Conference on Multimedia

October 12, 2020

Seattle, WA, USA

Acceptance Rates

Overall Acceptance Rate 5 of 6 submissions, 83%

Article Metrics

Total Citations: 10
Total Downloads: 260
Downloads (Last 12 months): 27
Downloads (Last 6 weeks): 2

Reflects downloads up to 24 Dec 2024

Cited By

Münster S, Maiwald F, di Lenardo I, Henriksson J, Isaac A, Graf M, Beck C, Oomen J (2024). Artificial Intelligence for Digital Heritage Innovation: Setting up a R&D Agenda for Europe. Heritage 7:2 (794-816). Online publication date: 6-Feb-2024. https://doi.org/10.3390/heritage7020038

Münster S, Maiwald F, Bruschke J, Kröber C, Sun Y, Dworak D, Komorowicz D, Munir I, Beck C, Münster D (2024). A Digital 4D Information System on the World Scale: Research Challenges, Approaches, and Preliminary Results. Applied Sciences 14:5 (1992). Online publication date: 28-Feb-2024. https://doi.org/10.3390/app14051992

Chen X, Yu A, Sun Q, Guo W, Xu Q, Wen B (2024). Updating road maps at city scale with remote sensed images and existing vector maps. IEEE Transactions on Geoscience and Remote Sensing (1-1). Online publication date: 2024. https://doi.org/10.1109/TGRS.2024.3375807

Xu Q, Long C, Yu L, Zhang C (2023). Road Extraction With Satellite Images and Partial Road Maps. IEEE Transactions on Geoscience and Remote Sensing 61 (1-14). Online publication date: 2023. https://doi.org/10.1109/TGRS.2023.3261332

N S, Priyanka, Lal S, Nalini J, Reddy C, Dell'Acqua F (2023). DPPNet: An Efficient and Robust Deep Learning Network for Land Cover Segmentation From High-Resolution Satellite Images. IEEE Transactions on Emerging Topics in Computational Intelligence 7:1 (128-139). Online publication date: Feb-2023. https://doi.org/10.1109/TETCI.2022.3182414

Chouai M, Dolezel P (2023). CSU-Net: Contour Semantic Segmentation Self-Enhancement for Human Head Detection. IEEE Access 11 (987-999). Online publication date: 2023. https://doi.org/10.1109/ACCESS.2022.3233419

Das P, Chand S, Singh N, Singh P (2023). Automated Road Extraction from Remotely Sensed Imagery using ConnectNet. Journal of the Indian Society of Remote Sensing 51:10 (2105-2120). Online publication date: 8-Sep-2023. https://doi.org/10.1007/s12524-023-01747-4

Jie Y, He H, Xing K, Yue A, Tan W, Yue C, Jiang C, Chen X (2022). MECA-Net: A MultiScale Feature Encoding and Long-Range Context-Aware Network for Road Extraction from Remote Sensing Images. Remote Sensing 14:21 (5342). Online publication date: 25-Oct-2022. https://doi.org/10.3390/rs14215342

Abhishek R, Chakravorty A, Chakraborty S (2022). Active learning based semantic segmentation for extraction of minute objects from multispectral satellite images. IGARSS 2022 - 2022 IEEE International Geoscience and Remote Sensing Symposium (7274-7277). Online publication date: 17-Jul-2022. https://doi.org/10.1109/IGARSS46834.2022.9884592

Zhang M, Singh H, Chok L, Chunara R (2022). Segmenting across places: The need for fair transfer learning with satellite imagery. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2915-2924). Online publication date: Jun-2022. https://doi.org/10.1109/CVPRW56347.2022.00329
