Article

A Cloud Detection System for UAV Sense and Avoid: Analysis of a Monocular Approach in Simulation and Flight Tests

Institute of Flight Systems, University of the Bundeswehr Munich, 85577 Neubiberg, Germany
* Author to whom correspondence should be addressed.
Submission received: 6 December 2024 / Revised: 10 January 2025 / Accepted: 13 January 2025 / Published: 15 January 2025
(This article belongs to the Special Issue Flight Control and Collision Avoidance of UAVs)

Abstract

In order to contribute to the operation of unmanned aerial vehicles (UAVs) according to visual flight rules (VFR), this article proposes a monocular approach for cloud detection using an electro-optical sensor. Cloud avoidance is motivated by several factors, including improving visibility for collision prevention and reducing the risks of icing and turbulence. The described workflow is based on parallelized detection, tracking and triangulation of features with prior segmentation of clouds in the image. As output, the system generates a cloud occupancy grid of the aircraft’s vicinity, which can be used for cloud avoidance calculations afterwards. The proposed methodology was tested in simulation and flight experiments. With the aim of developing cloud segmentation methods, datasets were created, one of which was made publicly available and features 5488 labeled, augmented cloud images from a real flight experiment. The trained segmentation models based on the YOLOv8 framework are able to separate clouds from the background even under challenging environmental conditions. For a performance analysis of the subsequent cloud position estimation stage, calculated and actual cloud positions are compared and feature evaluation metrics are applied. The investigations demonstrate the functionality of the approach, even if challenges become apparent under real flight conditions.

1. Introduction

The fields of application for unmanned aerial vehicles (UAVs) are manifold, covering a wide spectrum from unmanned air cargo [1] to Search and Rescue (SAR) [2]; the inspection of critical infrastructure [3,4,5]; and intelligence, surveillance and reconnaissance (ISR) missions [6]. Many of these applications utilize lower airspace, where the main weather activity occurs and where air traffic operating according to visual flight rules (VFR) is expected. Regardless of the operational purpose of a UAV, avoiding mid-air collisions with other manned or unmanned air traffic participants is a top priority for operational flight safety. A key factor in reducing the risk of collision is ensuring the aircraft's own visibility, which is no longer guaranteed after entering clouds. Therefore, Sense and Avoid (SAA) systems are being developed and researched. In accordance with current air traffic regulations, VFR flights are permitted only if certain minimum distances to clouds can be maintained [7]. Under the assumption that UAVs will have to comply with visual flight rules, an on-board cloud detection and avoidance system is required. An additional reason motivating cloud detection and avoidance is the increased probability of icing in the vicinity of clouds. Moreover, updrafts and downdrafts are more severe in the vicinity of clouds, particularly near convective clouds, which can limit safe UAV operation. In reconnaissance scenarios with electro-optical (EO) sensors, the reconnaissance performance is degraded by clouds, and relocating the mission to cloud-free areas may be advantageous.
Motivated by the aforementioned reasons and the lack of relevant research in this area to date, the Institute of Flight Systems of the University of the Bundeswehr Munich (UniBwM) is investigating methods for the EO-sensor-based detection of clouds in order to contribute to the safe integration of UAVs into airspace. The scope of our research activities is limited to the detection of clouds. Accordingly, additional investigations of avoidance algorithms are necessary for the development of an operational cloud detection and avoidance system. The proposed monocular approach is based on a multi-stage chain of cloud segmentation, cloud position estimation and the generation of cloud occupancy grids. The algorithms are applied and tested in a simulation environment and in real flight tests. In the field of semantic segmentation, deep neural networks have made significant contributions in recent years. Supervised learning of these networks for the purpose of cloud segmentation requires annotated training data. To the authors' knowledge, only cloud datasets containing clouds from a ground-based [8] or satellite-based [9,10] perspective have been publicly available to date. Consequently, we have created a dataset containing 5488 annotated and augmented sensor images of clouds collected in flight. In order to promote research in the field of airborne, electro-optical cloud detection, the MissionLabAirborneDataset-Clouds (MLAD-C) dataset has been made publicly available for download (refer to the Data Availability Statement at the end of this article).
In conclusion, this article provides the following additional scientific contributions:
  • The extension of a monocular approach for cloud position estimation with 3D cloud occupancy grids.
  • The presentation of an evaluation concept resulting in two experimental setups:
    • A simulation environment optimized in terms of cloud detection.
    • A flight test bed as UAV surrogate based on a very light aircraft.
  • Quantitative analysis of the proposed approach by comparing the calculated cloud position with the actual cloud position and analyzing feature metrics.
  • The provision of an annotated and augmented cloud dataset from the aerial perspective.

1.1. Related Work

In the context of sensor-based cloud detection, the focus of previous research has mainly been on ground-based and satellite-based applications. Numerous studies on cloud segmentation exist, for example, with the aim of determining the degree of cloud cover. In contrast, there are only a few scientific studies concerning cloud position estimation. Studies estimating cloud position from an airborne perspective are found in [11,12]; hence, these are of particular relevance.

1.1.1. Cloud Segmentation

In the domain of cloud segmentation, the approaches can be subdivided into classic image processing methods based on features such as color values, intensities or texture; machine learning methods; and, as a subset of the latter, deep learning methods. Early studies used the ratio of red and blue components to distinguish clouds in ground-based images by thresholding [13]. Since clouds scatter red and blue light almost evenly, the red-to-blue ratio is small for a cloud-free sky but larger near the sun or the horizon [14]. Expanding on the findings of [14], ref. [15] formed a normalized blue-to-red image and classified the images as unimodal or bimodal based on a threshold on the standard deviation. In the unimodal case, a fixed threshold value was applied, whereas in the bimodal case an adaptive minimum cross entropy threshold algorithm was used [15]. Further classical image processing methods for cloud segmentation of aerial images were described in [16,17]. In [16], a significance map was created from the hue and intensity values, to whose histogram the Otsu threshold method [18] was applied, with certain value ranges for hue and intensity having to be fulfilled to classify a pixel as a cloud pixel. Since clouds show significantly less detail than the complex ground structure, a detail map was additionally created, used for classification and combined with the previously generated significance map to obtain the segmentation result [16]. Ref. [17] built on the methodology of [16], with the main differences lying in the formation of the detail map and the handling of semi-transparent cloud edge areas.
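As a brief illustration of the classic red-to-blue ratio idea described above, the following minimal sketch thresholds the per-pixel ratio of the red and blue channels; the threshold value of 0.95 and the function name are illustrative assumptions, not parameters taken from the cited studies.

```python
import cv2
import numpy as np

def red_blue_ratio_mask(img_bgr, threshold=0.95):
    """Label pixels as cloud where the red-to-blue ratio approaches 1 (achromatic clouds);
    clear sky scatters blue more strongly, so its ratio stays well below the threshold."""
    img = img_bgr.astype(np.float32) + 1e-6      # avoid division by zero
    b, _, r = cv2.split(img)
    ratio = r / b
    return (ratio > threshold).astype(np.uint8) * 255
```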
Analogous to the cloud-segmenting algorithms from classical image processing, there are several methods in the field of machine learning that use color components and combinations to segment clouds, such as in [19] for a ground-based camera, where various red–blue combinations were also in focus and contributed to the segmentation result. Using a principal component analysis and the receiver operating characteristics, color components and combinations were prioritized for cloud segmentation and then classified using partial least squares regression [19]. For ground-based cloud segmentation at night, in [20], color components were first extracted, followed by a division of the image into superpixels using the Simple Linear Iterative Clustering (SLIC) algorithm [21]. Finally, a k-means clustering method was chosen to classify these superpixels. This clustering method was also used in [22] to distinguish clouds from the ground based on saturation and intensity values. Another work, which also used the k-means algorithm, first divided the image into subimages and extracted gray-scale, frequency and texture features within these subimages as input for the classifier [23]. With the aim of segmenting clouds in aerial images, [24] compared the application of three classification methods (AdaBoostM1, Multilayer Perceptron and Random Forest) using image moments.
In recent years, the segmentation of clouds using deep neural networks has gained significance. A deep neural network for ground-based cloud segmentation was described in [8]. Other deep networks intended for satellite-based cloud segmentation can be found in [25,26,27]. Additionally, a recent publication was dedicated to the detection of above-aircraft clouds using convolutional neural networks [28].

1.1.2. Cloud Position Estimation

In the field of airborne cloud position estimation, only a few studies have been performed to date. The bimodal procedure explained in [11] dealt with electro-optical cloud detection for High-Altitude Pseudo Satellites (HAPSs) with an almost perpendicular observation angle. The first mode was used to create a large-scale cloud coverage map by scanning a georeferenced grid with a gimbal; subsequently, the percentage of cloud cover in each grid cell was obtained by segmentation. Estimating the position and dimension of single cloud objects was the objective of the second mode, which was achieved by cloud segmentation and subsequent feature detection, tracking and two-view triangulation [11]. Another relevant research work concerning cloud position estimation was presented in [12]. A core element of this likewise monocular approach to determine cloud positions was the use of the Lucas–Kanade tracker and the subsequent triangulation of the detected and tracked features. In addition, a classification of image patches into clouds, sky or ground was performed based on intensity-related and texture parameters in order to reunite cloud parts of a single cloud that were separated by the preceding distance measurement. After a clustering method was applied, the 3D cloud points were represented as a cylinder, cuboid or convex hull in a final step [12].

1.1.3. Cloud Type Classification

Another field of image-based cloud detection is cloud type classification, for example, where the fractal distance is analyzed to differentiate between cirrus and cumulonimbus clouds as shown in [29]. Apart from the identification of a cumulonimbus, which should be avoided over a wider area, cloud type classification on board UAVs is less relevant for Sense and Avoid purposes.

1.1.4. Previous Work at University of the Bundeswehr Munich

Building on earlier UniBwM investigations of cloud detection from a nadir perspective at high altitudes, the approach described in [22] was adapted to flights at cloud altitude with a forward-looking perspective and continuously optimized. Our first investigations were limited to simulative studies and included color- and intensity-based segmentation methods, which were prone to mis-segmentation, especially in the horizon and ground image areas [30]. Other aspects of our research focused on determining the positions of clouds using two-view triangulation [31] or discussing promising algorithms for cloud detection [32]. Additionally, ref. [33] depicted the advantages of information fusion combining an electro-optical sensor with a weather radar sensor with regard to the detection of weather phenomena. Our first considerations on how the detection and avoidance of clouds can be coupled were addressed in [34]. More recent publications have dealt with the execution of flight experiments and the behavior of the cloud detection approach under real flight conditions [35,36,37].

1.2. Outline

The following section proposes a workflow for cloud detection based on application-related requirements and meteorological considerations. Subsequently, the evaluation concept, including two experimental setups for developing and testing the methodology, is presented (Section 2.4). Finally, a quantitative analysis of the approach in simulation is conducted and initial results from flight experiments are demonstrated (Section 3). This is followed by a summary of the findings and an outlook on future developments (Section 4).

2. Materials and Methods

The proposed workflow for the detection and position estimation of clouds (Section 2.3) is based on our earlier concepts [35], which have been refined through several flight tests. Moreover, the approach is derived from requirements and meteorological considerations, which are explained hereinafter.

2.1. Requirements

The requirements are derived from the demand of two mission scenarios for which an on-board cloud detection and avoidance functionality could bring significant advantages.
  • Cargo UAVs operating in lower, uncontrolled airspace with high VFR traffic volumes.
  • Tactical ISR UAVs operating in areas with increased thunderstorm activity.
In the first case, the detection of clouds is primarily motivated by regulatory requirements with regard to collision avoidance. However, in the second case, the focus is on increasing mission efficiency, as the electro-optical reconnaissance performance benefits from operating in cloud-free areas. Both use cases impose similar requirements on the cloud detection functionality, as summarized in Table 1.

2.1.1. Performance Requirements

Of central importance is the requirement to provide a reliable cloud position estimate as long as an avoidance maneuver is still physically possible. This requirement is formulated in the following using a simplified example, assuming a UAV approaches the center of a stationary cumulus cloud head-on at a constant airspeed and altitude with a maximum bank angle of 30°. A constant speed of 70 knots is assumed, which is a common value for platforms in both application scenarios based on publicly available specifications from manufacturers [38,39]. The horizontal diameter of the cumulus is assumed to be 1.5 km, since investigations presented in [40] determined that the majority of cumuli have a mean diameter of between 1 and 2 km. In [41], the critical relative distance is specified at which an avoidance maneuver must be initiated immediately. To calculate this critical distance between the UAV and the cloud center d_uc under the above assumptions, the equations presented in [41] are taken and adapted. First, the minimum turn radius ρ_min is calculated (Equation (1)) using the UAV velocity v, gravitational acceleration g and maximum bank angle ϕ_max. Subsequently, the critical distance from the UAV to the cloud center d_uc is computed from ρ_min and the cloud radius r_cloud (Equation (2)).
$$\rho_{\min} = \frac{v^2}{g \tan(\phi_{\max})} \tag{1}$$

$$d_{uc} = \rho_{\min} + r_{cloud} \tag{2}$$
Taking the above assumptions into account, this results in a critical distance of 980.22 m in airspaces without a legal minimum horizontal cloud distance and of 2480.22 m in airspaces with a legal minimum horizontal cloud distance of 1.5 km. This implies that a robust cloud position estimate must be available before these critical distances are reached. Estimating cloud positions at very large distances (>15 km) is not necessary, as the cloud situation can change fundamentally by the time these distances are covered at the assumed velocity.
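As a minimal numerical sketch of Equations (1) and (2), the following Python snippet reproduces the critical-distance example above (70 kt, 30° bank, 1.5 km cumulus diameter); the function name and unit conversion are illustrative, and small rounding differences with respect to the stated 980.22 m are to be expected.

```python
import math

def critical_distance(v_kt, bank_deg, cloud_diameter_m, min_horizontal_sep_m=0.0, g=9.81):
    """Critical UAV-to-cloud-center distance following Equations (1) and (2)."""
    v = v_kt * 1852.0 / 3600.0                                      # knots -> m/s
    rho_min = v ** 2 / (g * math.tan(math.radians(bank_deg)))       # Equation (1)
    return rho_min + cloud_diameter_m / 2.0 + min_horizontal_sep_m  # Equation (2)

print(critical_distance(70, 30, 1500))        # ~980 m (no legal minimum separation)
print(critical_distance(70, 30, 1500, 1500))  # ~2480 m (1.5 km legal minimum separation)
```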

2.1.2. System Requirements

Another design-characteristic requirement concerns the UAV payload capacities. These are limited, which is why the hardware components required for cloud detection should be small in size, weight and power consumption at a low cost (SWaP-C principle), favored by the use of a passive electro-optical sensor. In addition, the cloud Sense and Avoid functionality should be realized in a platform-independent and adaptable manner, meaning that the approach should also be able to cope with different resolutions or with single and multi-channel images. Furthermore, the option of extending the approach to use an infrared sensor shall remain open in order to increase the availability of the system even in difficult lighting conditions. The use of monocular rather than stereoscopic approaches for cloud position estimation is preferable for several reasons. Multiple sensor units at different mounting points contradict the SWaP-C principle. In addition, the functional ranges of over 2.5 km require large baselines for stereoscopic methodologies, which are limited by the wingspan of the UAVs. Finally, the output of the cloud detection system must map the surrounding cloud situation in real time, allowing avoidance decisions to be calculated and cloud dynamics (formation, dissipation, displacement) to be taken into account.

2.2. Meteorological Aspects

Since clouds have a number of special attributes that are relevant for detection using image-processing algorithms, these meteorological aspects are summarized below and incorporated into the proposed workflow. When comparing the detection of clouds to other objects, such as pedestrians or cars, several differences emerge. Clouds can adopt semi-transparent states, which are frequent especially in cloud edge areas [16], and form or dissipate in a short period of time at a specific location. Additionally, it is difficult to find similar, recurring cloud characteristics, as would be the case for other objects (pedestrians: eyes, arms; cars: tires, windshield). Although, according to [42], there are ten main cloud groups (genera) which are subdivided into species, there exists an infinite variety of forms. Table 2 provides an overview of cloud properties that are relevant to image processing and should be taken into account.
A factor that has a major influence on the determination of cloud distance is cloud dynamics. Several clouds usually occur within one altitude layer, and their horizontal movement is correlated. In addition, clouds of convective origin in particular can have vertical velocity components due to updrafts, downdrafts or vortices. If there are several clouds at different altitudes, the direction and speed of movement between the layers can be uncorrelated, meaning that wind shear is present.
In terms of visual appearance, clouds show an achromatic appearance [19], with a red-to-blue ratio that differs significantly from the sky [14]. Moreover, cloud pixels have high intensity values due to their high reflectivity compared to other objects [16]. From a textural point of view, clouds show fewer details compared to the ground and the homogeneity increases the further a cloud area is located from the edge of the cloud [12,16].
Ultimately, with the appropriate image resolution, clouds do not appear as individual scattered pixels, but always in neighboring pixel groups [16]. In this respect, it can be stated that very small or semi-transparent clouds are not significant for UAV Sense and Avoid.

2.3. Proposed Approach

Essentially, the proposed approach (illustrated in Figure 1) is founded on cloud segmentation and subsequent cloud position estimation using feature detection, tracking and triangulation. In addition, various filtering steps are conducted to remove two-dimensional features and three-dimensional, triangulated cloud points that do not fulfill certain filter criteria. The system architecture is designed modularly and is built on the open-source middleware ROS2 Humble Hawksbill, with the individual nodes implemented in C++ and Python. The update rate of the approach depends on the UAV airspeed and the selected triangulation baseline configuration, but typical values range between 0.13 Hz and 3 Hz.
In the following, Section 2.3.1, Section 2.3.2, Section 2.3.3, Section 2.3.4, Section 2.3.5, Section 2.3.6, Section 2.3.7 and Section 2.3.8, the eight individual process steps are described in detail. The procedure is based on previous work [35] and has been optimized through experiments, so that the additional scientific contribution of this article is the extension of the approach with the generation of cloud occupancy grids (Section 2.3.8).

2.3.1. Image and Metadata Acquisition

Image sources are provided by either a simulation environment or the sensor stream from flight experiments. Both experimental setups are described in detail in Section 2.4. The subscribed ROS2 sensor stream consists of the sensor image and the sensor metadata. In turn, the metadata contain essential information for the cloud detection workflow, such as the intrinsic camera parameters, distortion parameters, sensor position, gimbal orientation, platform orientation and a timestamp. The position and orientation (referred to as pose) of the sensor are initially provided in different coordinate systems. Therefore, they are merged and transformed into the camera coordinate system for further processing. Consequently, a separate transform broadcaster node based on the tf2 library is implemented, which is capable of transforming poses between geodetic, earth-centered-earth-fixed (ECEF), north-east-down (NED), vehicle-carried NED, body, gimbal and camera coordinate systems.
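To illustrate the kind of coordinate transformations handled by the transform broadcaster, the following sketch converts a geodetic position into ECEF and then into a local NED frame using standard WGS-84 relations; it is a simplified stand-in, not the tf2-based ROS2 node used in the actual system, and the function names are illustrative.

```python
import numpy as np

A = 6378137.0               # WGS-84 semi-major axis [m]
E2 = 6.69437999014e-3       # WGS-84 first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, h_m):
    """Convert geodetic latitude/longitude/height to an ECEF position vector."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    n = A / np.sqrt(1.0 - E2 * np.sin(lat) ** 2)    # prime vertical radius of curvature
    return np.array([(n + h_m) * np.cos(lat) * np.cos(lon),
                     (n + h_m) * np.cos(lat) * np.sin(lon),
                     (n * (1.0 - E2) + h_m) * np.sin(lat)])

def ecef_to_ned(p_ecef, ref_lat_deg, ref_lon_deg, ref_h_m):
    """Express an ECEF position in the NED frame attached to a geodetic reference point."""
    lat, lon = np.radians(ref_lat_deg), np.radians(ref_lon_deg)
    r_ned_ecef = np.array([
        [-np.sin(lat) * np.cos(lon), -np.sin(lat) * np.sin(lon),  np.cos(lat)],
        [-np.sin(lon),                np.cos(lon),                0.0],
        [-np.cos(lat) * np.cos(lon), -np.cos(lat) * np.sin(lon), -np.sin(lat)]])
    return r_ned_ecef @ (p_ecef - geodetic_to_ecef(ref_lat_deg, ref_lon_deg, ref_h_m))
```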

2.3.2. Cloud Segmentation

The purpose of the cloud segmentation step is to separate the clouds from the image background and to identify image areas in which feature detection and tracking are subsequently applied. Furthermore, this step clarifies whether clouds are present in the sensor's field of view at all. Depending on the experimental setup, a deep learning segmentation model based on YOLOv8l-seg is used, as this recent framework is capable of performing in real time, has achieved promising results in segmentation tasks and supports training with custom datasets, along with providing comprehensive documentation [43,44]. The models were developed using training data either from the simulation or from flight tests; their results can be found in Section 3.1.3. Segmentation in advance of feature detection is considered essential to confine the feature detectors and trackers to image areas that potentially belong to a cloud. Otherwise, features are mainly detected on the ground, regardless of the detector deployed, as Figure 2 demonstrates.
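A minimal inference sketch with the Ultralytics YOLOv8 API is given below; it merges all predicted instance masks into one binary cloud mask. The weight file name is a placeholder for the trained model, and the merging and resizing details are assumptions rather than the exact implementation used in the workflow.

```python
import numpy as np
from ultralytics import YOLO

model = YOLO("cloud_seg_yolov8l.pt")   # placeholder path to the trained segmentation weights

def cloud_mask(frame_bgr):
    """Return a binary cloud mask for one sensor frame, or None if no cloud is segmented."""
    result = model.predict(frame_bgr, verbose=False)[0]
    if result.masks is None:
        return None
    instance_masks = result.masks.data.cpu().numpy()          # (n_instances, h, w)
    merged = (instance_masks.sum(axis=0) > 0).astype(np.uint8) * 255
    # Note: the mask resolution may differ from the input frame and may need resizing.
    return merged
```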

2.3.3. Ensemble Feature Detection and Tracking

Following the segmentation, prominent cloud features are identified within the segmented areas using feature detectors. The more cloud features can be recognized, the more three-dimensional cloud point positions are available after triangulation. For this reason, the feature detectors are applied to an image ensemble consisting of unfiltered and filtered (contrast-enhancing, blurring filters) sensor recordings. In addition, several detectors are used in combination, with a combination of Shi–Tomasi and ORB detectors producing numerous cloud features. The feature detectors were parameterized on the basis of a feature detection benchmark described in [35]. Feature detectors and trackers are not applied to every frame. Instead, individual sample frames are extracted as soon as a certain minimum distance (referred to as the triangulation baseline) has been covered since the last sample frame. There are several aspects to consider in the choice of this triangulation baseline. If the triangulation baseline selected is too large, the time interval between the sample frames increases, so that the cloud dynamics and the wind-induced cloud displacement have a greater impact on the position estimation [45]. Moreover, the update rate at which the system outputs cloud position information is reduced. If the baseline is set too low, changes from one sample frame to the next are very small and triangulation becomes less accurate. A short triangulation baseline, in contrast, results in higher update rates, but these are limited by the computational resources available, which means that the real-time capability is lost if one iteration of the cloud detection process takes longer than the time interval between the sample frames. Another aspect is related to the velocity of the UAV, as high velocities cause motion blur, which in turn can have a negative effect on feature detection. The parallelized detection and tracking of cloud features is shown schematically in Figure 1 in the third process step. A new feature track is initiated at the time step of each sample frame, and features are detected in the current sample frame. If feature tracks with sample frames from the past exist, the system tries to recognize their features in the current sample frame using the Lucas–Kanade tracker. A configuration file defines the maximum number of feature tracks and the maximum number of sample frames (perspectives) per feature track. Feature tracks are deleted if the maximum number of sample frames is reached or if the feature track no longer contains any remaining features, for example, because the tracking step was unable to localize them or because the remaining features were eliminated by the subsequent filter steps. This approach offers the advantage that feature tracks coexist that either contain few perspectives but many features or many perspectives but fewer features.
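The following OpenCV sketch illustrates the core of this step under simplifying assumptions: Shi–Tomasi and ORB detections restricted to the segmented cloud mask, and pyramidal Lucas–Kanade tracking between two sample frames. Parameter values and function names are illustrative and not taken from the benchmark-derived configuration in [35].

```python
import cv2
import numpy as np

def detect_cloud_features(gray, cloud_mask, max_features=500):
    """Detect Shi-Tomasi corners and ORB keypoints, both restricted to the cloud mask."""
    shi = cv2.goodFeaturesToTrack(gray, maxCorners=max_features,
                                  qualityLevel=0.01, minDistance=7, mask=cloud_mask)
    shi = shi.reshape(-1, 2) if shi is not None else np.empty((0, 2), np.float32)
    orb_kps = cv2.ORB_create(nfeatures=max_features).detect(gray, cloud_mask)
    orb = np.array([kp.pt for kp in orb_kps], dtype=np.float32).reshape(-1, 2)
    return np.vstack([shi, orb]).astype(np.float32)

def track_features(prev_gray, cur_gray, prev_pts):
    """Re-locate previously detected features with the pyramidal Lucas-Kanade tracker."""
    cur_pts, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, cur_gray, prev_pts.reshape(-1, 1, 2), None,
        winSize=(21, 21), maxLevel=3)
    ok = status.ravel() == 1
    return prev_pts[ok], cur_pts.reshape(-1, 2)[ok]
```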

2.3.4. Two-Dimensional Feature Filtering in Image Plane

In order to reduce the impact of false-positive segmentation, in each triangulation step it is checked whether the tracked features are still within the most recently segmented cloud mask; if not, these features are eliminated. After that, the cloud features are undistorted using the distortion parameters obtained by sensor calibration. As a final step prior to triangulation, the homography between the current and the last sample frame is computed in order to discard mis-matched features by the Random Sample Consensus (RANSAC) algorithm.
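A compact sketch of this 2D filtering stage is shown below: a mask-membership check, undistortion with the calibration parameters, and homography-based RANSAC outlier rejection between two sample frames. Threshold values and function names are illustrative assumptions.

```python
import cv2
import numpy as np

def filter_features_2d(prev_pts, cur_pts, cloud_mask, K, dist_coeffs, ransac_thresh_px=10.0):
    """Keep only feature pairs that stay on the cloud mask and fit the inter-frame homography."""
    h, w = cloud_mask.shape[:2]
    inside = np.array([0 <= int(y) < h and 0 <= int(x) < w and cloud_mask[int(y), int(x)] > 0
                       for x, y in cur_pts], dtype=bool)
    prev_pts, cur_pts = prev_pts[inside], cur_pts[inside]
    if len(cur_pts) < 4:                       # homography needs at least four correspondences
        return prev_pts, cur_pts
    # Undistort while keeping pixel coordinates (P=K re-projects with the original intrinsics)
    prev_u = cv2.undistortPoints(prev_pts.reshape(-1, 1, 2), K, dist_coeffs, P=K).reshape(-1, 2)
    cur_u = cv2.undistortPoints(cur_pts.reshape(-1, 1, 2), K, dist_coeffs, P=K).reshape(-1, 2)
    _, inliers = cv2.findHomography(prev_u, cur_u, cv2.RANSAC, ransac_thresh_px)
    if inliers is None:
        return prev_u, cur_u
    keep = inliers.ravel().astype(bool)
    return prev_u[keep], cur_u[keep]
```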

2.3.5. Triangulation

The triangulation step is executed track-by-track and the projection matrices and the two-dimensional cloud features of each sample frame are provided as input. The sensor pose is given in the camera coordinate system relative to an initial reference pose in each sample frame, from which the extrinsic matrices can be calculated. To compute the projection matrices, the intrinsic camera parameters, which are obtained by sensor calibration, are also required. Ultimately, the features are triangulated using the Structure from Motion module of the OpenCV library [46].
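For orientation, the sketch below shows a two-view variant of this step using cv2.triangulatePoints; the actual workflow uses the multi-view Structure from Motion module of OpenCV [46], so this is a simplified illustration with assumed pose and intrinsics inputs.

```python
import cv2
import numpy as np

def triangulate_two_view(K, R1, t1, R2, t2, pts1, pts2):
    """Triangulate matched 2D cloud features from two sample frames into 3D points."""
    P1 = K @ np.hstack([R1, t1.reshape(3, 1)])      # 3x4 projection matrix, frame 1
    P2 = K @ np.hstack([R2, t2.reshape(3, 1)])      # 3x4 projection matrix, frame 2
    pts_h = cv2.triangulatePoints(P1, P2, pts1.T.astype(float), pts2.T.astype(float))
    return (pts_h[:3] / pts_h[3]).T                 # Nx3 Euclidean cloud points
```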

2.3.6. Plausibility Check

Subsequent to triangulation, the triangulated three-dimensional (3D) cloud points are validated for their plausibility. Therefore, the 3D points are projected onto the image plane and the reprojection errors are determined. If the reprojection error of a cloud feature is greater than 10 pixels, the triangulated feature is eliminated. Moreover, if the reprojection results in 2D coordinates outside the sensor image, these cloud points are considered implausible and filtered out as well. The same applies to positions that are farther away than 20 km from the sensor’s current position. The latter is performed for two reasons: first, because the inaccuracy of triangulation increases with distance, and second, because, due to cloud dynamics, the cloud conditions are likely to have changed significantly by the time those distant clouds are reached.
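A sketch of the plausibility check, using the thresholds stated above (10-pixel reprojection error, 20 km maximum range), is given below; the helper names and the exact handling of the camera pose are assumptions.

```python
import cv2
import numpy as np

def plausibility_filter(points_3d, rvec, tvec, K, dist_coeffs, img_shape,
                        observed_px, max_reproj_px=10.0, max_range_m=20_000.0):
    """Discard 3D points with large reprojection error, outside the image or beyond 20 km."""
    h, w = img_shape[:2]
    reproj, _ = cv2.projectPoints(points_3d, rvec, tvec, K, dist_coeffs)
    reproj = reproj.reshape(-1, 2)
    err = np.linalg.norm(reproj - observed_px, axis=1)
    in_image = (reproj[:, 0] >= 0) & (reproj[:, 0] < w) & (reproj[:, 1] >= 0) & (reproj[:, 1] < h)
    cam_center = -cv2.Rodrigues(rvec)[0].T @ tvec.reshape(3)   # sensor position in world frame
    in_range = np.linalg.norm(points_3d - cam_center, axis=1) <= max_range_m
    return points_3d[(err <= max_reproj_px) & in_image & in_range]
```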

2.3.7. Cluster Analysis

With the aim of removing individual outliers, a cluster analysis of the identified 3D cloud positions is conducted. Given that the number of position clusters is unknown beforehand, the DBSCAN algorithm [47] is a suitable choice for this task.
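A minimal outlier-removal sketch with scikit-learn's DBSCAN is shown below; the neighborhood radius and minimum sample count are illustrative assumptions, not the parameterization used in the system.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def remove_isolated_points(points_3d, eps_m=300.0, min_samples=5):
    """Drop triangulated cloud points that DBSCAN labels as noise (label -1)."""
    if len(points_3d) == 0:
        return points_3d
    labels = DBSCAN(eps=eps_m, min_samples=min_samples).fit_predict(points_3d)
    return points_3d[labels != -1]
```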

2.3.8. Cloud Occupancy Grid

As a final step in the workflow, probability-based cloud occupancy grids are generated using the filtered results of the triangulation. The objective of these occupancy grids is to create a real-time representation of the cloud situation, which subsequently provides the decision-making basis for an avoidance unit. Due to the implementation, which is described below, the historical progression of the cloud position estimates in the occupancy grids is considered as well and the effects of individual triangulations are reduced.
A two-dimensional and a three-dimensional occupancy grid are populated using the same approach. In either case, during startup of a ROS2 node a north-east/north-east-down grid is created, with every cell measuring 300 m on each side. Upon receiving updated triangulations, the triangulated cloud positions located inside a grid cell are accumulated. Cloud evidence is assumed to be given whenever a cell reaches 20 triangulated cloud positions. If the number of accumulated cloud positions n_cloud_pos is below this threshold, the current cloud occupancy probability p_new(n,e,d) scales linearly with the count for a cell with north-east-down index (n,e,d), as shown in Equation (3).
$$p_{\mathrm{new}}(n,e,d) = \begin{cases} \dfrac{n_{\mathrm{cloud\_pos}}}{20}, & \text{if } \dfrac{n_{\mathrm{cloud\_pos}}}{20} \le 1 \\[4pt] 1, & \text{if } \dfrac{n_{\mathrm{cloud\_pos}}}{20} > 1 \end{cases} \tag{3}$$
In order to reduce the weight of individual measurements, the current cloud occupancy probability is multiplied by a factor of 0.25 and then added to the old probability p_old(n,e,d) if a detection has occurred within this cell. If no detection has occurred in a grid cell, a decay factor of 0.75 is applied to the old cloud probability value, resulting in the updated cloud probability p_updated(n,e,d). This relationship is described by Equation (4).
$$p_{\mathrm{updated}}(n,e,d) = \begin{cases} p_{\mathrm{old}}(n,e,d) + p_{\mathrm{new}}(n,e,d) \cdot 0.25, & \text{if } p_{\mathrm{new}}(n,e,d) > 0 \\[2pt] p_{\mathrm{old}}(n,e,d) \cdot 0.75, & \text{if } p_{\mathrm{new}}(n,e,d) = 0 \end{cases} \tag{4}$$
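The following sketch implements the update rule of Equations (3) and (4) on a simple NumPy grid; the grid origin handling, the clipping of probabilities to [0, 1] and the class interface are assumptions made for illustration and differ from the OctoMap-based implementation described below.

```python
import numpy as np

class CloudOccupancyGrid:
    """Minimal NED cloud occupancy grid following Equations (3) and (4)."""

    def __init__(self, shape=(100, 100, 20), cell_size_m=300.0,
                 evidence_count=20, gain=0.25, decay=0.75):
        self.p = np.zeros(shape)          # occupancy probability per (n, e, d) cell
        self.cell = cell_size_m
        self.k = evidence_count
        self.gain, self.decay = gain, decay

    def update(self, cloud_positions_ned):
        """Apply one triangulation update from an array of 3D cloud positions [m, NED]."""
        counts = np.zeros_like(self.p)
        idx = np.floor(np.asarray(cloud_positions_ned) / self.cell).astype(int)
        for n, e, d in idx:
            if all(0 <= i < s for i, s in zip((n, e, d), self.p.shape)):
                counts[n, e, d] += 1
        p_new = np.minimum(counts / self.k, 1.0)            # Equation (3)
        detected = p_new > 0
        self.p[detected] += p_new[detected] * self.gain     # Equation (4), detection branch
        self.p[~detected] *= self.decay                     # Equation (4), decay branch
        self.p = np.clip(self.p, 0.0, 1.0)                  # clipping assumed, not stated
```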
Due to this manner of implementation, both the number of measured positions and the historical progression influence the cloud probability within a cell. The 2D occupancy grid is mainly used to visualize the cloud situation and can be interpreted by UAV operators, for example. Plots generated using Matplotlib [48] appear similar to a weather radar display, using a polar, airplane-fixed coordinate system. Cloud-free cells are white, shaded cells or cells outside the field of view (FOV) are orange and cloud-covered cells are marked in shades of blue, depending on the probability, as shown in Figure 3a. In order to create the 3D occupancy grid, the OctoMap library [49] is used, whose OcTree structure containing the 3D occupancy grid can be visualized using the ROS2 visualization tool RViz, as illustrated in Figure 3b. Grid cells are shown in green if the cloud probability exceeds 50% and the cloud ground truth volumes (CGTVs) are marked in red (Figure 3b). The 3D occupancy grid is published as a ROS message, on the basis of which avoidance strategies can be derived and respective trajectories calculated.
Ultimately, the proposed workflow is completed by the calculation of evaluation parameters, such as the feature density or feature losses caused by the previously described stages. A detailed explanation of the parameters used to evaluate cloud detection is given in Section 2.4.1.

2.4. Experimental Design

The experimental design provides for cloud segmentation and cloud position estimation to be analyzed separately. First, the evaluation concept is presented, outlining the methods to evaluate segmentation and position estimation. On this basis, the test procedures and the experimental setups are derived.

2.4.1. Evaluation Concept

The segmentation performance is evaluated on the basis of datasets that contain a wide variety of cloud formations and environmental conditions. The goal of the segmentation analysis is to enable conclusions to be drawn concerning the applicability of the developed models for a robust cloud segmentation. A comparison of the segmentation model predictions with the cloud ground truth mask (CGTM) is required, involving annotated validation datasets. By comparing the predictions with the CGTM, the confusion matrix can be constructed, allowing standard evaluation metrics for semantic segmentation such as precision, recall or mean Average Precision (mAP) to be derived.
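For reference, a small sketch of pixel-level precision, recall and IoU computed from a predicted mask and the CGTM is given below; mAP additionally requires instance-level matching across IoU thresholds and is therefore not reproduced here. The function name is illustrative.

```python
import numpy as np

def pixel_metrics(pred_mask, gt_mask):
    """Pixel-level precision, recall and IoU from a predicted and a ground-truth cloud mask."""
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    tp = np.count_nonzero(pred & gt)      # cloud pixels predicted as cloud
    fp = np.count_nonzero(pred & ~gt)     # background pixels predicted as cloud
    fn = np.count_nonzero(~pred & gt)     # cloud pixels missed by the model
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, iou
```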
In contrast, the analysis of cloud position estimation should allow statements about the accuracy of the monocular approach depending on the distance to a cloud. In doing so, the influence of different baselines between the sample frames and of different velocities is analyzed. On the one hand, the quality of feature detection, tracking and triangulation is to be evaluated. On the other hand, the estimated cloud positions are to be compared with the true cloud positions. In order to combine the aforementioned analysis criteria, a test procedure is defined in which clouds are approached head-on with their true cloud positions known. Metrics for evaluating feature quality include the number of features and the feature density per segmented cloud pixel during a cloud approach procedure. In addition, the feature losses per process step of the cloud position estimation can provide information about the robustness of the workflow. To assess the accuracy of the position estimate, the cells of the 3D cloud occupancy grid classified as occupied are compared with the cloud ground truth volume (CGTV) by analyzing the number of occupied cells inside and outside the CGTV. Additionally, the distances between the outlying cells and the nearest CGTV are investigated. A tabular overview of the evaluation concept is given below (Table 3).
The previously described evaluation concept defines how the monocular cloud detection approach can be validated, whereupon suitable experimental setups have to be selected and implemented. A simulative investigation of the approach offers the advantage that cloud approaches can be carried out in a reproducible and cost-effective manner. For example, the same cloud scene can be simulated several times in order to investigate the influence of airspeed on the cloud detection procedure. In addition, a large variety of different environmental influences (lighting conditions, terrain, visibility) can be generated in a very short time, which simplifies a quantitative analysis of the cloud segmentation module as well as of feature detection, tracking and triangulation. Nevertheless, a simulation environment cannot fully represent the diversity of cloud attributes and environmental influences, which is why the method shall also be tested in flight experiments. A comprehensive comparison between estimated cloud positions and cloud ground truth volume is not feasible under real flight conditions, as the CGTV is not available during flight experiments. Therefore, a setup capable of providing a CGTV is essential. Following these considerations, the experiments require both a simulation setup and a flight experiment setup.

2.4.2. Experimental Setup: Simulation Environment

With the aim of using a simulation that is optimized for the validation of cloud detection algorithms, a simulation environment was developed that enables the extraction of cloud ground truth volumes. The simulation environment is based on VegaPrime from Presagis, which uses a SilverLining software development kit (SDK) to generate cloud objects. The simulation render loop can be accessed at runtime via an application programming interface (API). Different cloud scenarios can be generated by adjusting the cloud type or the degree of cloud cover. In addition to the positioning of a cloud at a defined location, both the cloud center position and the cloud ground truth volume can be extracted to compare the estimated cloud positions of the proposed method with the actual cloud position, which offers an enormous advantage over the limited possibilities during flight tests. Even the influence of varying environmental conditions such as time of day, haze, terrain or wind can be investigated using this setup.

2.4.3. Experimental Setup: Flight Test Bed

Our experimental platform for investigating cloud-detecting algorithms under real flight conditions is based on a Zlin Savage, an aircraft of the very light aircraft (VLA) class, which is shown in Figure 4a. A pod is mounted under each wing. Underneath the right wing is the power supply pod, which supplies the sensor pod (Figure 4b) on the opposite side. A core element of the sensor pod is a gimbal sensor unit Epsilon 180 equipped with an electro-optical sensor, a Mid-Wave Infrared (MWIR) sensor and a laser range finder. Furthermore, an integrated Inertial Navigation System (INS) provides the orientation of the platform and the gimbal as well as the position. In order to process the sensor stream, an embedded Nvidia Jetson Orin is installed as mission computer, which converts the image and metadata into ROS2 messages, stores them locally and forwards the stream additionally to the C-band Silvus data link. The software components running on the mission computer are containerized, modular and based on the ROS2 middleware. Currently, the proposed cloud detection methodology is operated and evaluated in post-processing rather than on the embedded computer. Migrating this functionality to the Nvidia Jetson is planned for future work. The data link enables real-time transmission of the sensor stream and control of the sensor over several kilometers to a ground control station. In addition, a flight test engineer on board the aircraft can control the sensor and subscribe to the sensor stream using Wi-Fi. A more detailed description of the hardware and software architecture of the research platform is given in [37].
The experimental design of the flight tests provides for frontal cloud approaches, whereby the Global Navigation Satellite System (GNSS) position of the sensor pod is logged when the cloud base is reached and when it is left, in order to obtain approximate cloud ground truth information. Flight tests were conducted in the Upper Bavaria region in German airspace class Golf, so that the experimental procedure could be performed as close to the clouds as possible.

3. Results

The following results are categorized into cloud segmentation and cloud position estimation and include investigations conducted both in simulation and in flight experiments, with the latter analyzed in post-processing. The hardware used includes an AMD Ryzen 9 5950X Central Processing Unit (CPU) with 16 cores, 128 GB of Random Access Memory (RAM) and an Nvidia GeForce RTX 3090 Graphics Processing Unit (GPU).

3.1. Cloud Segmentation

3.1.1. MissionLabSimulationDataset-Clouds (MLSD-C)

The dataset developed to train a segmentation model for simulated cloud scenarios consists of 22,424 images with associated cloud ground truth masks. For this purpose, the simulation environment was extended by a training mode that automatically creates cloud scenarios and the corresponding cloud masks. In each iteration step, the sensor is randomly placed within the map section. The simulated cloud type varies between cumulus congestus, cumulus mediocris, stratocumulus cloud particles, cirrocumulus and cirrus fibratus. In addition, the degree of cloud cover as well as the number of active cloud layers and their elevation are adjusted each time. Moreover, environmental conditions such as visibility and time of day are varied. The training mode is executed on different maps to increase the terrain diversity in the dataset. The extraction of the cloud ground truth mask is performed by blanking out all other simulation objects and applying an intensity threshold in each iteration step.

3.1.2. MissionLabAirborneDataset-Clouds (MLAD-C)

Based on the recordings of flight tests carried out on 12 October 2023, the MLAD-C dataset was created for the development of neural networks with the purpose of cloud segmentation. The annotated, augmented dataset contains 5488 aerial images of various cloud formations with their associated cloud masks. After the machine learning backend of Label Studio combined with the Segment Anything Model [50] was used to annotate the sensor recordings, they were augmented with the help of the albumentations library [51]. The MLAD-C dataset is publicly available (see Data Availability Statement).

3.1.3. Segmentation Results

With the aim of converting the datasets into a suitable format for training Ultralytics' YOLOv8l-seg [43], the resolution of the images was adjusted and a common 70:30 split between training and test data was subsequently selected. Precision, recall and the mean Average Precision (mAP) for various threshold values of the Intersection over Union (IoU) were used as parameters to evaluate the models, as stated in Table 4. Precision is the ratio of correctly predicted cloud pixels to all pixels predicted as cloud, whereas recall is the ratio of correctly predicted cloud pixels to all actual cloud pixels [52]. Remarkably, the recall value is significantly lower than the precision value for both models. Consequently, it can be stated that image areas classified as clouds by the models are also very likely to actually be clouds. However, in some cases actual cloud areas remain undetected.
Even though the metrics in Table 4 are commonly used to evaluate segmentation models, their informative value in relation to the application of image-based cloud detection is somewhat limited. Due to the wide variety of cloud properties, it is difficult to assess the relevance of a cloud for SAA if it is semi-transparent, for example. The highest priority of the segmentation step is to focus the feature detectors and trackers on potential cloud areas and to avoid false segmentation on the ground. The models meet this requirement, as demonstrated in Figure 5 and Figure 6, where simulated (Figure 5a–c) and real flight (Figure 6a–c) recordings and the associated segmentation results are illustrated (Figure 5d–f and Figure 6d–f).
For the model trained on the MLSD-C, three examples with different terrains, visibilities and day times are demonstrated. In principle, the simulated clouds are reliably segmented, even if smaller clouds and cloud edge areas are sometimes not correctly detected (Figure 5d). Even urban areas or fields with light-colored surfaces do not lead to incorrect segmentation on the ground. However, the complexity and variety of real cloud objects can only be generated to a limited extent in the simulation.
The segmentation model, which was developed on the basis of MLAD-C, has the purpose of separating clouds from the background in real flight images. In order to increase the diversity of cloud scenarios, examples from three flight test days are presented in Figure 6. Figure 6a contains a classic cumulus scenario in which the cloud outlines are almost accurately segmented by the model (Figure 6d). Likewise, the hazy horizon image area does not lead to any detections (Figure 6d). In contrast, Figure 6b was taken over the Bavarian Alps at dusk and contains shadowed cumulus clouds at flight altitude as well as cirrus clouds above. The shadowed clouds of convective origin are recognized by the model, whereas smaller clouds and the cloud edge areas are not recognized accurately, and the cirrus clouds remain undetected (Figure 6e). In the last sensor image (Figure 6c), there is a cumulus cloud over Lake Starnberg in hazy conditions with an overcast layer above. In this situation, the model is able to separate the individual cloud in the foreground, while the closed cloud cover is not segmented (Figure 6f).
From the perspective of the Sense and Avoid task, it can be assumed that individual clouds in the immediate vicinity at flight altitude are more important than small semi-transparent clouds at a greater distance. With this assumption and the actual goal of applying feature detectors and trackers only to cloud image areas and not to the entire sensor image, both segmentation models are sufficient for the requirements. On average, the segmentation model for the simulation requires 21.3 ms to process a single frame on the simulated sensor stream, while the model for flight experiments requires 23.99 ms per frame on the sensor stream of the gimbal unit.

3.2. Cloud Position Estimation

As introduced in the evaluation concept (Section 2.4.1), the performance of cloud position estimation is evaluated based on feature quality measures and comparisons between cloud position estimates and cloud ground truth data. The following results show the evaluation metrics during a simulated cloud approach scenario under different speed configurations (70 kt, 250 kt) and triangulation baseline configurations (50 m, 200 m). After the quantitative analysis in the simulation, results from a flight experiment are shown.

3.2.1. Simulated Scenario

The analyzed simulated scenario contains two cumulus clouds above (Figure 7), whose cloud center positions are about 15 km apart at the beginning. The clouds are approached head-on until they disappear from the sensor's FOV. In order to simulate cloud behavior in a realistic way, the simulated clouds were subjected to a voxel spin so that the cloud particles moved visibly. The simulation environment enables the definition of cloud ground truth volumes in which the cloud is located. The dimensions of the CGTV of the larger cloud are North: 4000 m, East: 4000 m and Down: 1000 m; those of the smaller one are North: 1000 m, East: 1000 m and Down: 1300 m. The evaluation was carried out for two baseline configurations, one with a 50 m and the other with a 200 m distance between the sample frames, as presented together with other configuration parameters in Table 5. In order to evaluate the influence of velocity, constant cloud approach speeds of 70 kt and 250 kt were simulated. Here, 70 kt corresponds to the cruise speed of UAVs operating in unmanned cargo or reconnaissance missions [38,39], and 250 kt is the speed limit for several airspace classes below 10,000 ft according to [53].
An analysis of the update rates of the methodology shows that real-time capability for the 50 m baseline configuration cannot be maintained at 250 kt, so the baseline for this configuration corresponds to an average of around 150 m instead of 50 m. Apart from this combination of baseline and velocity, the update rates of the other combinations correspond to the target values and real-time capability is maintained.
A robust cloud position estimation is based on a high number of 2D cloud features that are ideally evenly distributed across the segmented clouds. Figure 8 depicts the number of detected and tracked features as well as the feature density in features per segmented cloud pixel over the entire cloud approach scenario. Both metrics are illustrated for velocities of 70 kt and 250 kt, showing curves for triangulation baseline configurations of 50 m (blue) and 200 m (red). At the beginning, there are still few parallel feature tracks, but the more flight distance is covered, the more feature tracks are generated, which is why the number and density of features increase fairly steadily. In the initial phase, it is clearly evident that both parameters increase more rapidly with a baseline of 50 m due to the higher update rate (blue curves). Apart from that, the curves for different speeds and baselines show similarities. In principle, a variety of 2D cloud features can be detected and tracked, which is also reflected in the feature density, although feature duplicates across parallel feature tracks are not excluded. After about 170 s at 70 kt and 44 s at 250 kt, parts of the larger cloud begin to move out of the sensor's FOV (dashed line). Consequently, the number of 2D features decreases. As the segmented areas also become smaller, the feature density initially remains unaffected. However, in the period in which the last segmented cloud areas leave the sensor image, the feature densities show large fluctuations.
Other metrics that provide information on the success of feature detection, tracking and triangulation are the feature losses caused by the individual filter stages of the proposed methodology. High feature losses indicate a lack of robustness in the process. Figure 9 illustrates the progression of feature losses over time for a baseline of 200 m. The loss curves for the 50 m baseline show very similar behavior, which is why they are not illustrated additionally. Up to the point at which the first cloud areas move out of the FOV, the total feature losses (blue) are low, with slightly more feature losses occurring at the cloud approach speed of 70 kt. After that event, most contributions come from untrackable features (red). Segmented areas then also disappear, resulting in losses due to the contour filter (yellow), as features can no longer be assigned to a segmented contour. Sporadically, features are filtered out by the reprojection error threshold (orange) and by implausible locations outside the FOV or at great distance (magenta). Only a few features are eliminated by applying the RANSAC filter based on the homography between the sample frames (green). Individual outliers did not have to be removed by DBSCAN clustering (cyan). Finally, once no clouds are segmented at all, 100% of the features are consequently lost.
The cloud position estimate is evaluated by comparing the cloud occupancy grid with the cloud ground truth volumes. If all or most of the cloud-covered grid cells lie within the CGTV, while only a few are outside, the cloud position estimation is considered reliable. However, many cells located outside the CGTV indicate unreliable results. Figure 10 compares the determined occupied cloud grid cells inside the CGTV (continuous curves) with those outside the CGTV (dashed curves) for the two baseline and speed configurations. The distance of the UAV to the cloud centers is represented on the x-axis, while the y-axis represents the number of grid cells declared as occupied.
At the beginning, it again becomes apparent that the 200 m baseline configuration (red) shows a slightly delayed increase in cloud-covered cells compared to the 50 m baseline configuration due to the lower update rate. The cloud approach at 70 kt (Figure 10a) shows more outliers than inliers beyond distances of 11 km for the 50 m baseline, whereas for the 200 m baseline, nearly equal numbers of inliers and outliers occur. Below 11 km, the 200 m baseline configuration consistently shows fewer outliers than inliers, with hardly any outliers below 7 km and almost none below 5 km. The number of inliers for the 50 m baseline remains higher than that for the 200 m baseline throughout almost the entire observation period. However, the number of outliers is also higher, with a noticeable increase occurring below 5 km distance to the cloud centers. In the analysis at 250 kt, some differences become apparent (Figure 10b). The curves of the inliers and outliers for both baseline configurations follow a similar trend. However, it is important to note that the actual baseline for the 50 m baseline configuration exceeds 50 m and varies due to limitations in computing resources. Except for a brief moment, the 200 m baseline consistently has fewer outliers than the 50 m baseline. For long distances (>12 km), the number of outliers is about half as high as the number of inliers for both configurations. As the distance decreases, the outliers steadily decline, with almost none observed below 7 km distance to the cloud centers. However, the 50 m baseline curve shows a striking rise in outliers below approximately 5 km.
Based on the above observations, the following consequences can be derived for the cloud approach under investigation:
  • A larger triangulation baseline leads to more accurate cloud position estimates;
  • A higher speed has a positive effect on the accuracy of the position estimates;
  • Depending on the computational resources of the hardware used, real-time capability is no longer guaranteed if the baseline is too small in combination with a high velocity;
  • In line with expectations, the accuracy of cloud position estimation increases with decreasing distance to the clouds.
In addition to the ratio of cells inside and outside the CGTV, it is also relevant how the outlier cells are distributed and how far they are located from the nearest CGTV. Figure 11 visualizes snapshots of the 3D occupancy grid at 12 km (Figure 11a,c) and 8 km (Figure 11b,d) distance from the cloud centers for the 200 m baseline configuration. The cloud-occupied cells are marked in green, while the CGTVs are marked in red. Notably, at the greater distance the outlier grid cells are scattered in front of and behind the CGTVs along the optical axis, which indicates inaccurate triangulations.
Figure 12 illustrates the distance of the outliers to the nearest CGTV for all examined combinations over the observation period. The position estimates at 70 kt speed and 50 m baseline are the most error-prone (Figure 12a). Here, the errors are often over 2 km, and sometimes even over 5 km, at distances of more than 6 km to the cloud centers. For the other combinations, the largest proportion of deviations above a distance of 8 km from the cloud centers lies in the ranges 251–500 m and 501–1000 m (Figure 12b–d).

3.2.2. Flight Experiment Scenario

In the following, initial cloud position estimates from a flight experiment are presented using the previously described research platform (Section 2.4.3). Compared to simulation investigations, obtaining cloud ground truth data is more difficult during flight tests. One option is to use satellite images, although the flight test and the recording time of the satellite image must be coordinated [32]. Another, more practical option is to log the GNSS position of the flight test carrier when reaching and leaving the cloud base. However, the logged positions can only serve as reference points; the exact cloud dimensions and the cloud ground truth volume are not available. Table 6 summarizes the fundamental configuration parameters of the investigated cloud approach.
A sensor recording of the frontal cloud approach is visualized in Figure 13. During the experiment, four positions were logged as ground truth, two each for the front left cloud and for the cloud in the right half of the image. Thus, there is no information about the vertical extent of the clouds or the horizontal extent orthogonal to the flight path. The cloud mask determined by the segmentation model is visualized in orange, and the features that remain after all filter stages were applied are shown in red (Figure 13).
With an average ground speed of 58 knots and a baseline of 200 m between sample frames, the cloud occupancy grid is updated approximately every 7 s. The distribution of the 2D features across the segmented image areas is highly uneven, so that strong concentrations are present in certain cloud areas, while other cloud regions lack persistent detections (Figure 13). Over the course of the observation period, the number and density of features decrease until all features are eliminated by the filter stages (Figure 14). This is due to significant feature losses, which result mainly from the constantly high level of homography outliers and from reprojection errors accumulating over time (Figure 15). To avoid losing all features after only a few triangulation steps, the threshold of the homography RANSAC filter was increased from 10 to 20 pixels and that of the triangulation reprojection error filter from 10 to 100 pixels. Both the change in cloud structure and the wind-induced displacement between the sample frames can increase feature losses. Moreover, the error-prone measurement of the sensor pose provided by the INS, as well as errors in the intrinsic and distortion parameters derived from sensor calibration, contribute to higher reprojection errors.
Since only four marked cloud positions were available for validation, a comprehensive evaluation of the 3D occupancy grid as in the simulation case (Section 3.2.1) was not conducted. Nevertheless, the cloud occupancy grid can be compared with the logged cloud positions. Figure 16 displays snapshots of the cloud occupancy grid at two moments of the cloud approach. The logged positions of the first cloud are shown in red and those of the second cloud in blue, while the cloud-occupied cells are shown in shades of green according to the determined occupancy probability. Both clouds were passed at the cloud base, the front cloud at its right edge and the rear cloud at its left edge.
The first cloud-occupied cells of the rear cloud are displayed after the second measurement (Figure 16a,b). In the same area, constantly occupied cells are identified over the rest of the sequence. In the vicinity of the front-left cloud (between the red logged positions), occupied cells are displayed in only three out of ten measurements. With respect to the flown trajectory, the logged cloud positions and the cloud areas in the image, the position estimates are plausible, since the detections of the rear cloud occur to the right-hand side of and above the aircraft (Figure 16). Additionally, the position estimates of the front cloud are located to the left-hand side, close to the logged ground truth positions (Figure 16c,d).
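To make the probability shading in Figure 16 more tangible, the following minimal sketch shows an OctoMap-style log-odds update [49] over a coarse voxel grid: each triangulated cloud point raises the log-odds of its cell, so repeatedly confirmed cells approach an occupancy probability of 1, while unconfirmed cells remain at the prior of 0.5. The cell size, the hit probability and the helper names are illustrative assumptions, not the configuration used in the experiments.

```python
import math
from collections import defaultdict

CELL_SIZE_M = 250.0               # assumed grid resolution in metres
L_HIT = math.log(0.7 / 0.3)       # log-odds increment per cloud observation (assumed p_hit = 0.7)

log_odds = defaultdict(float)     # voxel index -> accumulated log-odds (0.0 corresponds to the 0.5 prior)

def voxel_index(p_ned):
    """Map a NED position (metres) to its voxel index."""
    return tuple(int(math.floor(c / CELL_SIZE_M)) for c in p_ned)

def integrate(cloud_points_ned):
    """Raise the occupancy log-odds of every voxel hit by a triangulated cloud point."""
    for p in cloud_points_ned:
        log_odds[voxel_index(p)] += L_HIT

def occupancy_probability(idx):
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds[idx]))

# Three consistent measurements in the same voxel push its probability from 0.5 to about 0.93,
# which would be rendered as a lighter shade of green in the visualization.
integrate([(1200.0, -300.0, -900.0)] * 3)
print(occupancy_probability(voxel_index((1200.0, -300.0, -900.0))))
```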

4. Discussion

The approach presented in Section 2.3, which involves cloud segmentation with subsequent monocular cloud position estimation using feature triangulation, was tested in simulation and in real flight experiments. A direct comparison between the simulation and flight test results is limited due to the differing configurations. In particular, the intrinsic camera parameters differ because the simulation environment does not support Full HD resolution for the simulated sensor. Nevertheless, the results of both experimental setups show tendencies that are discussed in the following.

4.1. Segmentation Performance

The presented cloud segmentation models (Section 3.1.3) fulfill the task of reliably separating clouds from the background, thereby confining the feature detectors and trackers to potential cloud areas without frequently generating false segmentations on the ground. This conclusion is supported by common segmentation evaluation metrics, such as precision. However, the developed segmentation models tend to miss small or distant clouds, which is reflected in the lower recall values. Clouds in the immediate vicinity of the UAV are of particular relevance for Sense and Avoid. With this in mind, the presented segmentation models meet the requirement of reliably segmenting clouds in the foreground.
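As an illustration of how the segmentation stage confines the downstream feature detection, the sketch below shows one plausible way of deriving a binary cloud mask from a YOLOv8 segmentation model [43,44] via the Ultralytics API. The weight file name is a placeholder for the trained model and the confidence threshold is an assumed value, not the setting used for the reported results.

```python
import cv2
import numpy as np
from ultralytics import YOLO

model = YOLO("cloud_seg_yolov8l.pt")   # placeholder path for the trained segmentation weights

def cloud_mask(frame_bgr, conf=0.25):
    """Return a binary mask (uint8, same height/width as the frame) of predicted cloud pixels."""
    result = model.predict(frame_bgr, conf=conf, verbose=False)[0]
    if result.masks is None:                          # no cloud instances segmented
        return np.zeros(frame_bgr.shape[:2], dtype=np.uint8)
    instance_masks = result.masks.data.cpu().numpy()  # (N, h, w) masks at inference resolution
    merged = (instance_masks.sum(axis=0) > 0).astype(np.uint8) * 255
    return cv2.resize(merged, frame_bgr.shape[1::-1], interpolation=cv2.INTER_NEAREST)

# Feature detection can then be restricted to cloud areas, e.g.:
# corners = cv2.goodFeaturesToTrack(gray_frame, maxCorners=500, qualityLevel=0.01,
#                                   minDistance=10, mask=cloud_mask(frame_bgr))
```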

4.2. Quantitative Analysis of Cloud Position Estimation in Simulation

In order to assess the quality of the cloud position estimation, feature-related parameters, such as the number of features, the feature density per segmented pixel and the feature losses per iteration step, were evaluated. In addition, the grid cells of the 3D cloud occupancy grid identified as cloud-occupied were compared with the actual cloud positions. The simulated cloud approach scenario was investigated for two baselines (50 m and 200 m) at two constant velocities (70 kt and 250 kt). The key findings of the simulated cloud approach indicate that the combination of a high velocity and a large baseline allows for a more precise cloud position estimation. These findings are in line with theoretical considerations and expectations. First, larger baselines result in larger triangulation angles and, therefore, more robust triangulation results. Second, the change in cloud shape (voxel spin) or cloud position (wind shift) is smaller at higher velocities, as the time between frames is shorter for a constant baseline. However, at constant speed, larger baselines lead to lower update rates. In addition, the investigation highlighted that excessive speeds, combined with small baselines between sample frames, can compromise the real-time capability of the method, depending on the available computing resources.
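The interplay of ground speed, baseline and update rate discussed above can be illustrated with a short back-of-the-envelope sketch; it does not reproduce Equations (1) and (2), and the parallax-angle formula is a small-angle approximation introduced here purely for illustration.

```python
import math

KT_TO_MS = 0.514444  # knots to metres per second

def update_interval_s(baseline_m, ground_speed_kt):
    """Time between sample frames for a fixed triangulation baseline."""
    return baseline_m / (ground_speed_kt * KT_TO_MS)

def parallax_angle_deg(baseline_m, range_m):
    """Approximate angle under which the baseline is seen from a point at the given range."""
    return math.degrees(2.0 * math.atan(baseline_m / (2.0 * range_m)))

for speed_kt in (70, 250):
    for baseline_m in (50, 200):
        print(f"{speed_kt:3d} kt, {baseline_m:3d} m baseline: "
              f"grid update every {update_interval_s(baseline_m, speed_kt):4.1f} s, "
              f"parallax at 8 km distance ~ {parallax_angle_deg(baseline_m, 8000.0):.2f} deg")
```

At 70 kt, for instance, enlarging the baseline from 50 m to 200 m roughly quadruples the parallax angle at a given range but also stretches the update interval from about 1.4 s to 5.6 s, which reflects the trade-off described above.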
The investigations in the simulation show that the requirement to provide a reliable cloud position estimate at 70 kt at least 2.5 km before reaching the cloud is met for the simulated cloud scenario. With a baseline of 200 m at 70 kt, only a few outliers occur below 7 km from the cloud centers and almost none below 5 km (Figure 10a). At the investigated cloud approach speed of 250 kt and a 200 m baseline, there are likewise almost no outliers in the cloud occupancy grid below a distance of 7.5 km from the cloud centers (Figure 10b). Applying Equations (1) and (2) for 250 kt with the assumptions made in Section 2.1.1 results in a critical relative distance of 5171.89 m, at which the detection result must be available in sufficient quality at the latest. Accordingly, this requirement is also fulfilled for the investigated scenario at 250 kt.
Apart from speed and baseline, various factors influence the success of the proposed methodology, for example, the cloud structure, the cloud position in the image and the sensor resolution, to mention just a few. Thus, a more comprehensive evaluation with additional cloud approaches and greater parameter variation is necessary in order to make generally valid statements regarding the functional range.
In general, the proposed method for cloud position estimation has limitations related to its operating principle. Since the triangulated cloud features are always located on the part of the cloud surface visible from the UAV, the rear side of the cloud and the full cloud extent remain undetected. Likewise, clouds that are occluded by other clouds cannot be detected by the proposed approach. Moreover, clouds located near the focus of expansion are problematic, as the image positions of features in this area hardly change between sample frames, which leads to inaccurate triangulations. Additionally, clouds with high homogeneity are disadvantageous, as feature detection and tracking become more difficult in such cases.

4.3. Findings for Cloud Position Estimation in Flight Experiments

In contrast, the evaluation of the cloud approach from a flight test on 14 May 2024 reveals the challenges associated with electro-optical cloud detection under real flight conditions. High feature losses occur and significantly fewer features are detected and tracked, which is reflected in low feature densities per segmented pixel. An analysis of the reprojection error shows that it accumulates over time, so that all 2D features are eventually eliminated, depending on the threshold value set for the reprojection error filter. A major factor contributing to the increasing reprojection errors is that, in the current implementation, all poses used for triangulation refer to the sensor pose of the initial frame in the camera coordinate system. In contrast to the simulation, the position of the sensor and the orientation of the gimbal and sensor pod are measured by an integrated INS, which introduces measurement inaccuracies. In addition, there is the aspect of time synchronization between metadata and sensor recording: although these data are output by the sensor unit together at 10 Hz, minor deviations cannot be ruled out. Therefore, re-initializing the reference camera pose whenever a new feature track is initialized could reduce the reprojection errors of the triangulation in the future. Further causes of the increased reprojection errors observed in the flight experiments could lie in the intrinsic matrix and the distortion parameters; although these were determined by a sensor calibration, the calibration itself is also subject to inaccuracies. Another challenge is related to the update rate resulting from the speed of the flight test carrier and the triangulation baseline. The resulting update interval of about 7 s allows for changes in cloud appearance and a wind-induced displacement of the cloud position between the sample frames, which hampers feature tracking and negatively affects triangulation. Other challenges faced by the proposed methodology under real flight conditions include the complex dynamics of clouds and the countless variations in cloud formations. In addition, environmental factors such as lens reflections, overexposure and blooming can limit detection performance. Nevertheless, even with high feature losses, cloud-occupied grid cells could be determined whose estimated positions are plausible when compared with the logged cloud ground truth positions of the flight test.
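The suggested re-initialization of the reference camera pose could, for instance, take the following form; the pose convention (4 × 4 camera-to-world transforms assembled from the INS measurements) and the helper names are assumptions made for illustration and do not describe the current implementation.

```python
import numpy as np

def projection_pair(K, T_wc_ref, T_wc_curr):
    """K: 3x3 intrinsic matrix; T_wc_ref/T_wc_curr: 4x4 camera-to-world transforms.
    Returns projection matrices for triangulation with the reference camera as local origin."""
    # Maps points expressed in the reference camera frame into the current camera frame.
    T_curr_from_ref = np.linalg.inv(T_wc_curr) @ T_wc_ref
    P_ref = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P_curr = K @ T_curr_from_ref[:3, :]
    return P_ref, P_curr

class FeatureTrack:
    """Each track carries its own reference pose, reset at track initialization,
    so INS errors only accumulate over one baseline instead of the whole approach."""
    def __init__(self, T_wc_at_start):
        self.T_wc_ref = T_wc_at_start

    def triangulation_matrices(self, K, T_wc_current):
        return projection_pair(K, self.T_wc_ref, T_wc_current)
```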

4.4. Conclusion and Future Work

In summary, the functionality of the proposed approach could be demonstrated both in simulation and in flight tests, even if the latter, as expected, pose particular challenges due to environmental influences. Apart from a more comprehensive evaluation of further cloud approaches in simulation and flight experiments, current development focuses on reducing reprojection errors on the one hand and feature losses on the other. Future research activities in the field of electro-optical cloud detection at the Institute of Flight Systems at UniBwM will address the analysis of environmental influences; for this reason, the influence of different solar incidence angles on feature detection, tracking and triangulation is currently being investigated. In addition, the integration of external weather information is conceivable in order to estimate the position of moving clouds; in this context, knowledge of the wind direction and speed at flight altitude is particularly relevant. Finally, a transfer of the methodology to infrared sensor recordings is envisioned in order to guarantee functionality even under difficult lighting conditions.

Author Contributions

Conceptualization, A.D.; methodology, A.D.; software, A.D.; validation, A.D.; formal analysis, A.D.; investigation, A.D.; resources, P.S.; data curation, A.D.; writing—original draft preparation, A.D.; writing—review and editing, A.D. and P.S.; visualization, A.D.; supervision, A.D. and P.S.; project administration, A.D. and P.S.; funding acquisition, P.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by dtec.bw—Digitalization and Technology Research Center of the Bundeswehr—which we gratefully acknowledge. dtec.bw is funded by the European Union—NextGenerationEU. The APC was funded by the University of the Bundeswehr Munich (UniBwM).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

MissionLab Airborne Dataset—Clouds (MLAD-C) is publicly available on Zenodo at https://zenodo.org/records/14267123 (DOI: 10.5281/zenodo.14267123). Additionally, MLAD-C can be accessed via GitHub at https://github.com/Adrian-UniBwM/MLAD-C (both accessed on 4 December 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
API      Application Programming Interface
CGTV     Cloud Ground Truth Volume
CGTM     Cloud Ground Truth Mask
CPU      Central Processing Unit
ECEF     Earth-Centered-Earth-Fixed
EO       Electro-Optical
FOV      Field of View
GNSS     Global Navigation Satellite System
GPU      Graphics Processing Unit
INS      Inertial Navigation System
IoU      Intersection over Union
ISR      Intelligence, Surveillance, Reconnaissance
mAP      Mean Average Precision
MLAD-C   MissionLab Airborne Dataset—Clouds
MLSD-C   MissionLab Simulation Dataset—Clouds
MWIR     Mid-Wave Infrared
NED      North-East-Down
RAM      Random Access Memory
RANSAC   Random Sample Consensus
ROS2     Robot Operating System 2
SAA      Sense and Avoid
SDK      Software Development Kit
SWaP-C   Size, Weight and Power-Cost
UAV      Unmanned Aerial Vehicle
UniBwM   University of the Bundeswehr Munich
VLA      Very Light Aircraft
VFR      Visual Flight Rules

References

  1. Dauer, J.C. (Ed.) Automated Low-Altitude Air Delivery: Towards Autonomous Cargo Transportation with Drones; Research Topics in Aerospace; Springer International Publishing: Cham, Switzerland, 2022. [Google Scholar] [CrossRef]
  2. Lyu, M.; Zhao, Y.; Huang, C.; Huang, H. Unmanned Aerial Vehicles for Search and Rescue: A Survey. Remote Sens. 2023, 15, 3266. [Google Scholar] [CrossRef]
  3. Liu, C.A.; Dong, R.; Wu, H.; Yang, G.T.; Lin, W. A 3D Laboratory Test-Platform for Overhead Power Line Inspection. Int. J. Adv. Robot. Syst. 2016, 13, 72. [Google Scholar] [CrossRef]
  4. Gillins, M.N.; Gillins, D.T.; Parrish, C. Cost-Effective Bridge Safety Inspections Using Unmanned Aircraft Systems (UAS). In Proceedings of the Geotechnical and Structural Engineering Congress 2016, Phoenix, AZ, USA, 14–17 February 2016; pp. 1931–1940. [Google Scholar] [CrossRef]
  5. Máthé, K.; Buşoniu, L. Vision and Control for UAVs: A Survey of General Methods and of Inexpensive Platforms for Infrastructure Inspection. Sensors 2015, 15, 14887–14916. [Google Scholar] [CrossRef] [PubMed]
  6. Department of Defense. Unmanned Aircraft Systems Roadmap 2005–2030; Technical Report; Department of Defense: Arlington County, VA, USA, 2005.
  7. International Civil Aviation Organization. Annex 2 to the Convention on International Civil Aviation—Rules of the Air. In The Convention on International Civil Aviation—Annexes 1 to 18; International Civil Aviation Organization: Montreal, QC, Canada, 2018. [Google Scholar]
  8. Dev, S.; Nautiyal, A.; Lee, Y.H.; Winkler, S. CloudSegNet: A Deep Network for Nychthemeron Cloud Image Segmentation. IEEE Geosci. Remote. Sens. Lett. 2019, 16, 1814–1818. [Google Scholar] [CrossRef]
  9. Mohajerani, S.; Saeedi, P. Cloud-Net: An End-to-End Cloud Detection Algorithm for Landsat 8 Imagery. In Proceedings of the IGARSS 2019—2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 1029–1032. [Google Scholar] [CrossRef]
  10. Mohajerani, S.; Krammer, T.A.; Saeedi, P. A Cloud Detection Algorithm for Remote Sensing Images Using Fully Convolutional Neural Networks. In Proceedings of the 2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP), Vancouver, BC, Canada, 29–31 August 2018; pp. 1–5. [Google Scholar] [CrossRef]
  11. Funk, F.; Stuetz, P. A Passive Cloud Detection System for UAV: System Functions and Validation. In Proceedings of the AIAA Scitech 2019 Forum, San Diego, CA, USA, 7–11 January 2019. [Google Scholar] [CrossRef]
  12. Nguyen, H.; Yadegar, J.; Utt, J.; Schwartz, B.; Ramu, P.; Ganguli, A.; Porway, J. EO/IR Due Regard Capability for UAS Based on Intelligent Cloud Detection and Avoidance. In Proceedings of the AIAA Infotech@Aerospace 2010, Atlanta, GA, USA, 20–22 April 2010. [Google Scholar] [CrossRef]
  13. Koehler, T.L.; Johnson, R.W.; Shields, J. Status of the Whole Sky Imager Database. In Proceedings of the Cloud Impacts on DOD Operations and Systems, 1991 Conference, El Segundo, CA, USA, 9–12 July 1991; pp. 77–80. [Google Scholar]
  14. Long, C.N.; Sabburg, J.M.; Calbó, J.; Pagès, D. Retrieving Cloud Characteristics from Ground-Based Daytime Color All-Sky Images. J. Atmos. Ocean. Technol. 2006, 23, 633–652. [Google Scholar] [CrossRef]
  15. Li, Q.; Lu, W.; Yang, J. A Hybrid Thresholding Algorithm for Cloud Detection on Ground-Based Color Images. J. Atmos. Ocean. Technol. 2011, 28, 1286–1296. [Google Scholar] [CrossRef]
  16. Zhang, Q.; Xiao, C. Cloud Detection of RGB Color Aerial Photographs by Progressive Refinement Scheme. IEEE Trans. Geosci. Remote. Sens. 2014, 52, 7264–7275. [Google Scholar] [CrossRef]
  17. Yi, W.; Jing, Z.; Shuang, G. Hue–Saturation–Intensity and Texture Feature-Based Cloud Detection Algorithm for Unmanned Aerial Vehicle Images. Int. J. Adv. Robot. Syst. 2020, 17, 172988142090353. [Google Scholar] [CrossRef]
  18. Otsu, N. A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man, Cybern. 1979, 9, 62–66. [Google Scholar] [CrossRef]
  19. Dev, S.; Lee, Y.H.; Winkler, S. Color-Based Segmentation of Sky/Cloud Images From Ground-Based Cameras. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 2017, 10, 231–242. [Google Scholar] [CrossRef]
  20. Dev, S.; Savoy, F.M.; Lee, Y.H.; Winkler, S. Nighttime Sky/Cloud Image Segmentation. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 345–349. [Google Scholar] [CrossRef]
  21. Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC Superpixels Compared to State-of-the-Art Superpixel Methods. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 2274–2282. [Google Scholar] [CrossRef]
  22. Funk, F. Passive Cloud Detection for High Altitude Pseudo-Satellites. Ph.D. Thesis, Universität der Bundeswehr München, Neubiberg, Germany, 2020. [Google Scholar]
  23. Changhui, Y.; Yuan, Y.; Minjing, M.; Menglu, Z. Cloud detection method based on feature extraction in remote sensing images. Int. Arch. Photogramm. Remote. Sens. Spat. Inf. Sci. 2013, XL-2/W1, 173–177. [Google Scholar] [CrossRef]
  24. Tulpan, D.; Bouchard, C.; Ellis, K.; Minwalla, C. Detection of Clouds in Sky/Cloud and Aerial Images Using Moment Based Texture Segmentation. In Proceedings of the 2017 International Conference on Unmanned Aircraft Systems (ICUAS), Miami, FL, USA, 13–16 June 2017; pp. 1124–1133. [Google Scholar] [CrossRef]
  25. Shi, M.; Xie, F.; Zi, Y.; Yin, J. Cloud Detection of Remote Sensing Images by Deep Learning. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 701–704. [Google Scholar] [CrossRef]
  26. Mohajerani, S.; Saeedi, P. Cloud-Net+: A Cloud Segmentation CNN for Landsat 8 Remote Sensing Imagery Optimized with Filtered Jaccard Loss Function. arXiv 2020, arXiv:2001.08768. [Google Scholar]
  27. Kanu, S.; Khoja, R.; Lal, S.; Raghavendra, B.; Cs, A. CloudX-net: A Robust Encoder-Decoder Architecture for Cloud Detection from Satellite Remote Sensing Images. Remote. Sens. Appl. Soc. Environ. 2020, 20, 100417. [Google Scholar] [CrossRef]
  28. Nied, J.; Jones, M.; Seaman, S.; Shingler, T.; Hair, J.; Cairns, B.; Gilst, D.V.; Bucholtz, A.; Schmidt, S.; Chellappan, S.; et al. A Cloud Detection Neural Network for Above-Aircraft Clouds Using Airborne Cameras. Front. Remote. Sens. 2023, 4, 1118745. [Google Scholar] [CrossRef]
  29. Batista-Tomás, A.R.; Díaz, O.; Batista-Leyva, A.; Altshuler, E. Classification and Dynamics of Tropical Clouds by Their Fractal Dimension. Q. J. R. Meteorol. Soc. 2016, 142, 983–988. [Google Scholar] [CrossRef]
  30. Dudek, A.; Funk, F.; Russ, M.; Stütz, P. Cloud Detection System for UAV Sense and Avoid: First Results of Cloud Segmentation in a Simulation Environment. In Proceedings of the 2019 IEEE 5th International Workshop on Metrology for AeroSpace (MetroAeroSpace), Turin, Italy, 19–21 June 2019; pp. 533–538. [Google Scholar] [CrossRef]
  31. Dudek, A.; Stütz, P. Cloud Detection System for UAV Sense and Avoid: Cloud Distance Estimation Using Triangulation. In Proceedings of the 2020 AIAA/IEEE 39th Digital Avionics Systems Conference (DASC), San Antonio, TX, USA, 11–15 October 2020; pp. 1–5. [Google Scholar] [CrossRef]
  32. Dudek, A.; Stütz, P. Cloud Detection System for UAV Sense and Avoid: Discussion of Suitable Algorithms. In Proceedings of the 2021 IEEE Aerospace Conference (50100), Big Sky, MT, USA, 6–13 March 2021; pp. 1–7. [Google Scholar] [CrossRef]
  33. Dudek, A.; Kunstmann, F.; Stütz, P.; Hennig, J. Detect and Avoid of Weather Phenomena On-Board UAV: Increasing Detection Capabilities by Information Fusion. In Proceedings of the 2021 IEEE/AIAA 40th Digital Avionics Systems Conference (DASC), San Antonio, TX, USA, 3–7 October 2021; pp. 1–7. [Google Scholar] [CrossRef]
  34. Bertoncini, J.; Dudek, A.; Russ, M.; Gerdts, M.; Stütz, P. Fixed-Wing UAV Path Planning and Collision Avoidance Using Nonlinear Model Predictive Control and Sensor-based Cloud Detection. In Proceedings of the 2023 IEEE/AIAA 42nd Digital Avionics Systems Conference (DASC), Barcelona, Spain, 1–5 October 2023; pp. 1–10. [Google Scholar] [CrossRef]
  35. Dudek, A.; Behret, V.; Stütz, P. Cloud Detection System for UAV Sense and Avoid: Challenges and Findings in Flight Experiments. In Proceedings of the 2023 IEEE Aerospace Conference, Big Sky, MT, USA, 4–11 March 2023; pp. 1–11. [Google Scholar] [CrossRef]
  36. Dudek, A.; Stütz, P. A Cloud Detection System for UAV Sense and Avoid: Flight Experiments to Analyze the Impact of Varying Environmental Conditions. In Proceedings of the AIAA SCITECH 2024 Forum, Orlando, FL, USA, 8–12 January 2024. [Google Scholar] [CrossRef]
  37. Ostler, J.; Dudek, A.; Bertoncini, J.; Russ, M.; Stütz, P. MissionLab: A Next Generation Mission Technology Research Platform Based on a Very Light Aircraft. In Proceedings of the AIAA Scitech 2024 Forum, Orlando, FL, USA, 8–12 January 2024. [Google Scholar] [CrossRef]
  38. Rheinmetall Technical Publications GmbH. Luna Ng: Airborne Reconnaissance System. 2022. Available online: https://www.rheinmetall.com/Rheinmetall%20Group/Karriere/Rheinmetall%20als%20Arbeitgeber/Menschen-Projekte/penzberg/B328e0522_RTP_LUNA_NG_A5_quer_ES_LR.pdf (accessed on 11 November 2024).
  39. Pyka Inc. Pelican Cargo. 2024. Available online: https://www.flypyka.com/pelican-cargo (accessed on 5 December 2024).
  40. Hozumi, K.; Harimaya, T.; Magono, C. The Size Distribution of Cumulus Clouds as a Function of Cloud Amount. J. Meteorol. Soc. Jpn. Ser. II 1982, 60, 691–699. [Google Scholar] [CrossRef]
  41. Lin, Z.; Castano, L.; Mortimer, E.; Xu, H. Fast 3D Collision Avoidance Algorithm for Fixed Wing UAS. J. Intell. Robot. Syst. 2020, 97, 577–604. [Google Scholar] [CrossRef]
  42. World Meteorological Organization. International Cloud Atlas; World Meteorological Organization: Geneva, Switzerland, 1956; Volume 1. [Google Scholar]
  43. Jocher, G.; Chaurasia, A.; Qiu, J. YOLO by Ultralytics; Ultralytics: Frederick, MD, USA, 2023; Available online: https://github.com/ultralytics/ultralytics (accessed on 2 January 2025).
  44. Ultralytics Inc. YOLOv8—Ultralytics Yolo Docs. 2024. Available online: https://docs.ultralytics.com/models/yolov8/ (accessed on 2 January 2025).
  45. Funk, F.; Stütz, P. A Passive Cloud Detection System for UAV: Concept and First Results. In Proceedings of the International Symposium on Enhanced Solutions for Aircraft and Vehicle Surveillance Applications (ESAVS), Berlin, Germany, 7–8 April 2016. [Google Scholar]
  46. Bradski, G. The Opencv Library. Dobb’s J. Softw. Tools 2000, 25, 120–125. [Google Scholar]
  47. Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA, 2–4 August 1996; Volume 96, pp. 226–231. [Google Scholar]
  48. Hunter, J.D. Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 2007, 9, 90–95. [Google Scholar] [CrossRef]
  49. Hornung, A.; Wurm, K.M.; Bennewitz, M.; Stachniss, C.; Burgard, W. OctoMap: An Efficient Probabilistic 3D Mapping Framework Based on Octrees. Auton. Robot. 2013, 34, 189–206. [Google Scholar] [CrossRef]
  50. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; et al. Segment Anything. arXiv 2023. [Google Scholar] [CrossRef]
  51. Buslaev, A.; Iglovikov, V.I.; Khvedchenya, E.; Parinov, A.; Druzhinin, M.; Kalinin, A.A. Albumentations: Fast and Flexible Image Augmentations. Information 2020, 11, 125. [Google Scholar] [CrossRef]
  52. Powers, D. Evaluation: From Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation. J. Mach. Learn. Technol. 2011, 2, 37–63. [Google Scholar] [CrossRef]
  53. International Civil Aviation Organization. Annex 11 to the Convention on International Civil Aviation—Air Traffic Services. In The Convention on International Civil Aviation—Annexes 1 to 18, 15th ed.; International Civil Aviation Organization: Montreal, QC, Canada, 2018; p. APP 4–1. [Google Scholar]
Figure 1. Core elements of cloud detection workflow. Processing steps are executed repeatedly in clockwise direction.
Figure 2. Benefit of cloud segmentation. Different cloud scenes captured during flight experiments on 12 October 2023 and 14 May 2024 in the Upper Bavaria region of Germany, showing detected features without (a–c) and with (d–f) prior cloud segmentation. Predicted cloud masks are shown as orange contours (d–f). ORB features are marked in red and Shi–Tomasi features are drawn in green.
Figure 3. Two-dimensional (a) and three-dimensional (b) occupancy grids. (a) shows occluded and out-of-FOV cells (orange), cloud-occupied cells (blue) and cloud-free cells (white), while (b) shows cells with p_updated(n, e, d) > 0.5 in green and the true cloud position with its dimensions in red. In addition, the flight test carrier, described in Section 2.4.3, is represented as a 3D model in order to visualize the pose (b).
Figure 4. Zlin Savage VLA research platform with pods underneath the wings (a). Additionally shown is the sensor pod with the gimbal reconnaissance sensor during a flight test (b).
Figure 5. Simulated cloud scenes (a–c) and corresponding segmentation masks below (d–f).
Figure 6. Flight recordings covering different cloud scenes from experiments on 14 May 2024 (a), 12 October 2023 (b) and 17 July 2024 (c) with the corresponding cloud mask predictions below (d–f).
Figure 7. Simulated cloud approach scenario with segmented contours (orange) and detected and tracked cloud features (red).
Figure 8. Feature amount (a,b) and feature density (c,d) during simulated cloud approaches. Cloud approach speed is constant at 70 knots (a,c) and 250 knots (b,d). Blue curves show 50 m baseline configuration and red curves show 200 m baseline configuration.
Figure 9. Total feature losses (blue) and contributions from separate filter stages (other colors) during cloud approaches at constant speeds of 70 knots (a) and 250 knots (b), with a baseline of 200 m between sample frames.
Figure 10. Comparison between cloud-occupied grid cells inside (continuous lines) and outside (dashed lines) the CGTV for the 50 m baseline (blue) and the 200 m baseline (red) at constant speeds of 70 kt (a) and 250 kt (b).
Figure 11. RViz display showing the 3D occupancy grid cell distribution for cloud approaches at 70 kt (a,b) and 250 kt (c,d). Snapshots of cloud occupancy are visualized at 12 km (a,c) and 8 km (b,d) distance between the UAV and the cloud centers with a triangulation baseline of 200 m. Cloud-occupied cells are marked in green and the cloud ground truth volumes are drawn in red.
Figure 12. Total number of cloud-occupied grid cells outside of the CGTV (dashed blue curves) and number of outside cells within certain CGTV vicinity ranges, indicated by color. Displayed are the speed–baseline combinations of 70 kt–50 m (a), 250 kt–50 m (b), 70 kt–200 m (c) and 250 kt–200 m (d).
Figure 13. Approached cloud formation with segmented cloud areas (orange) and detected and tracked features (red). Flight test was conducted on 14 May 2024 in the Upper Bavaria region of Germany.
Figure 14. Number of detected and tracked features (a) and feature density (b) during the cloud approach.
Figure 15. Total feature losses (blue) and contributions from the separate filter stages (other colors) during the cloud approach.
Figure 16. RViz snapshots showing the top view and third-person view of the 3D cloud occupancy grid after 7.2 s (a,b) and 21.35 s (c,d). Cloud log positions are illustrated for the front-left cloud (red) and rear-right cloud (blue) from Figure 13. Cells with an occupancy probability of over 50% are marked in green, with lighter shades indicating a higher cloud probability.
Table 1. Summary of the requirements for the cloud detection system.
Attributes | Requirements
Minimum range for reliable estimates | 2.5 km at 70 knots
Sensor type | EO, IR expandable
System design | Monocular, platform-independent
System output | Provision of real-time 3D cloud situation
Table 2. Selection of cloud properties relevant for electro-optical cloud detection.
Clouds Have … | Clouds Move …
achromatic appearance [19] | correlated within cloud layer
rather even scattering of red and blue light [14] | uncorrelated between layers (windshear)
high intensity due to large reflectivity [16] | vertically (in case of convective clouds)
fewer details compared to ground [16] | with negligible acceleration
high homogeneity in cloud middle [12] | –
Table 3. Overview of evaluation concept showing analytical focus and evaluation metrics.
Analytical Focus | Validation of … | Evaluation Metric
Cloud Segmentation | Model Prediction vs. CGTM | Precision; Recall; mAP
Cloud Position Estimation | Feature Detection, Tracking and Triangulation | Feature Amount; Feature Density; Feature Losses
Cloud Position Estimation | Position Accuracy | Inside vs. Outside CGTV; Outlier Distances to CGTV
Table 4. Evaluation metrics of the developed cloud segmentation models based on YOLOv8l-seg.
Dataset | Precision | Recall | mAP50 | mAP50-95
MLSD-C | 0.811 | 0.446 | 0.526 | 0.334
MLAD-C | 0.889 | 0.547 | 0.608 | 0.457
Table 5. Configuration parameters of the simulated cloud approach scenario.
Frame Rate | Resolution | FOV | Baseline Between Sample Frames | Ground Speed
10 Hz | 800 × 600 | 60° × 45° | 50 m, 200 m | 70 kt, 250 kt
Table 6. Configuration parameters of the cloud approach scenario during the flight test on 14 May 2024.
Frame Rate | Resolution | FOV | Baseline Between Sample Frames | Ground Speed
10 Hz | 1920 × 1080 | 37.6° × 21.8° | 200 m | 58 kt
