NLP-Based Fusion Approach to Robust Image Captioning
IEEE Journal of Selected Topics in Applied Earth Observations and …, 2024•ieeexplore.ieee.org
Robustness in remote sensing image captioning is crucial for real-world applications.
However, most of the research focuses on improving the performance of single captioning
algorithms, either by introducing novel feature processing units or meta-tasks that indirectly
improve the captioning performance. Despite indisputable improvements in performance,
we argue that relying on the output of a single model can be critical, especially when data
scarcity limits the generalization capability of the trained algorithms. Focusing on the …
However, most of the research focuses on improving the performance of single captioning
algorithms, either by introducing novel feature processing units or meta-tasks that indirectly
improve the captioning performance. Despite indisputable improvements in performance,
we argue that relying on the output of a single model can be critical, especially when data
scarcity limits the generalization capability of the trained algorithms. Focusing on the …
Robustness in remote sensing image captioning is crucial for real-world applications. However, most of the research focuses on improving the performance of single captioning algorithms, either by introducing novel feature processing units or meta-tasks that indirectly improve the captioning performance. Despite indisputable improvements in performance, we argue that relying on the output of a single model can be critical, especially when data scarcity limits the generalization capability of the trained algorithms. Focusing on the advantages of ensembles for improving robustness, we propose different ways to select or generate a single most coherent caption from a set of predictions made by different captioning algorithms. The disjunction between the two phases of prediction and selection/generation provides high flexibility for inserting different captioning algorithms, each with its peculiarities and strengths. In this context, based on neural natural language processing tools, our approach can be considered as an additional fusion block that enables higher robustness with a contained complexity burden.
ieeexplore.ieee.org
Showing the best result for this search. See all results