Heritage Iconographic Content Structuring: from Automatic Linking to Visual Validation

Published: 31 July 2024

Abstract

This article presents a global framework dedicated to the structuring of iconographic heritage collections. To alleviate the poor interlinking both between collections and between their contents, a first step of automatic linking based on content-based image retrieval is evaluated and adapted to the visual variability of such heritage contents. To support the understanding and analysis of the contents in a structured fashion, a 3D immersive web platform is also introduced, together with visual analysis tools. Finally, by combining automatic linking with manual interventions in the visualization platform, an iterative, semi-automatic structuring pipeline is proposed to resolve difficult cases missed by automatic structuring and thereby further improve the structuring. We demonstrate the potential of the proposal on the geographic iconographic heritage of Paris, with a dataset of 10k images that belong to several institutions and are therefore poorly connected and not organized globally.

Information

Published In

Journal on Computing and Cultural Heritage, Volume 17, Issue 3
September 2024
382 pages
EISSN: 1556-4711
DOI: 10.1145/3613582

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 31 July 2024
Online AM: 24 May 2024
Accepted: 20 May 2024
Revised: 23 April 2024
Received: 25 January 2024
Published in JOCCH Volume 17, Issue 3

Author Tags

  1. Image retrieval
  2. re-ranking
  3. data linking
  4. graph visualization
  5. geographical iconographic heritage

Qualifiers

  • Research-article
