1. Introduction
Advancements in low-altitude remote sensing and image analysis techniques have revolutionized the digitizing of real-world objects, initially represented by point clouds [
1]. Over the past decade, there has been a notable increase in the number of studies examining the utilization of unmanned aerial vehicle (UAV) image technology for surveying and inspection, which has been extensively documented in the literature [
2,
3,
4,
5]. UAV photogrammetry exhibits immense potential for built environment inspections and surveys thanks to multisource data acquisition, efficient data collection, rapid observation, relatively low costs, and multidimensional data representation. However, a major challenge lies in the noise introduced during data capture and 3D reconstruction [
6]. The transformation of a noisy point cloud into its unknown noise-free state is an inherently ill-posed problem. This noise significantly affects the accuracy and usability of UAV images, hindering their effectiveness in real-world applications. Over the past decade, the use of photogrammetry for digital 3D recording has expanded significantly. Advances in computer vision and modern computing technologies have addressed photogrammetry’s long-standing limitations by accelerating processing times and enabling automation. The adoption of automatic structure from motion (SfM) technology has gradually shifted the focus from using laser scanner technology for 3D measurement in scientific applications to a growing reliance on photogrammetry. Despite numerous research efforts, point cloud denoising remains challenging [
7,
8,
9]. With the integration of computer vision techniques, point cloud processing has become faster and more efficient, addressing many limitations of traditional photogrammetry.
Through the application of computer vision techniques, point cloud technology for 3D recording has made significant advancements and has become a key tool in surveying and structural monitoring. Various applications are highlighted in the literature, such as structural monitoring of historical buildings, generating 3D models for volume calculations, and creating metric maps for use in mining estimation [
10,
11,
12,
13,
14]. Denoising point clouds is a crucial step in many applications like object recognition and autonomous navigation. While significant progress has been made in utilizing artificial intelligence for these tasks, challenges remain. Several studies have explored point cloud denoising, employing advanced computer vision techniques and deep learning architectures. For instance, Bai et al. [
15] introduced SM-HFEGCN, a graph convolutional network designed to enhance point cloud understanding by incorporating scale measurement and high-frequency enhancement. While their approach effectively captures local geometric relationships and addresses limitations in representing the overall spatial scale of local graphs, it is primarily focused on point cloud classification and segmentation tasks. The method emphasizes the integration of spatial scale features and high-frequency information to capture node variations, which improves the representation of differences and similarities between nodes. However, despite its contributions, SM-HFEGCN does not directly address the challenge of noise reduction in point cloud data, particularly in the context of enhancing 3D reconstruction. Wu et al. [
16] proposed the Plant-Denoising-Net, a deep learning-based approach designed to address the specific challenges of plant point clouds, such as uneven density, incompleteness, and diverse noise types. Plant-Denoising-Net utilizes a density gradient learning approach and incorporates three key modules: the Point Density Feature extraction module, the Umbrella Operator Feature computation module, and the density gradient estimation module. While Plant-Denoising-Net achieves state-of-the-art performance in denoising plant point clouds, with improvements of 7.6–19.3% under Gaussian noise and notable computational efficiency, its application is tailored to plant-phenotyping scenarios. Consequently, its generalizability to other domains, such as built environment point clouds or UAV-based 3D reconstructions, remains unproven and unlikely. This highlights the need for approaches capable of addressing noise in more diverse and geometrically complex datasets. Sohail et al. [
17] reviewed the application of deep transfer learning and domain adaptation in addressing these issues, particularly for tasks such as denoising, object detection, semantic labeling, and classification. While these approaches have effectively mitigated noise and enhanced point cloud data quality, they often rely on pre-trained models and fine-tuning strategies that do not generalize to complex or large-scale datasets. Moreover, their performance can degrade in scenarios with partial overlap or outliers, as seen in sensor-acquired point clouds. Although combining these approaches with traditional machine learning methods has shown promise in addressing these limitations, existing frameworks still struggle with computational inefficiency and inconsistent results in complex applications. These challenges underscore the need for more robust and scalable solutions to improve point cloud quality, particularly in geometrically complex and noisy datasets like those encountered in cultural heritage preservation. Zhang et al. [
18] conducted a comprehensive survey of point cloud completion methods, categorizing them into four primary approaches: point-based, convolution-based, GAN-based, and geometry-based methods. While these techniques have significantly improved with advancements in deep learning, challenges remain in enhancing their robustness, computational efficiency, and ability to capture intricate geometric details. This study highlighted the current methods’ limitations, such as noise sensitivity and high computational complexity, that hinder their effectiveness in practical applications. Despite these advancements, existing approaches often fall short in addressing complex scenarios, necessitating further exploration of novel architectures and techniques to better meet real-world demands. These limitations emphasize the importance of developing more accurate and efficient point cloud completion methods, particularly in domains requiring precise geometric reconstructions. Zhu et al. [
19] conducted the first comprehensive survey of point cloud data augmentation methods, categorizing them into a taxonomy framework comprising basic and specialized approaches. These methods are essential for addressing challenges such as overfitting and limited diversity in training datasets, which are common in point cloud processing tasks. Despite their wide application, the study identified several limitations, including the lack of standardization in augmentation techniques and their varying effectiveness across different tasks. The research highlights the importance of selecting appropriate augmentation methods tailored to specific applications and suggests future directions to improve their robustness and scalability. These findings underscore the necessity of advancing augmentation techniques to support the growing demands of deep learning in point cloud analysis.
It is important to note that the accuracy required for data collection and processing in photogrammetry depends significantly on the intended purpose. For instance, when generating 3D models for applications such as augmented reality or basic web visualization in non-scientific contexts, achieving high levels of accuracy may not be essential. However, for applications where precise data are critical, such as condition assessment or structural analysis, optimizing the dataset through advanced processing techniques, including 3D mesh decimation, becomes a necessary step to ensure reliability. In the field of cultural heritage (CH), photogrammetry has a wide range of applications [
20,
21,
22]. Its speed of acquisition and the portability of the equipment make it a highly versatile technology, suitable for various uses. For the condition assessment of CH, it is essential to accurately compare the current state of a structure with its previous condition. Since revisiting and surveying CH as it existed in the past is impossible, reducing noise to generate the most accurate 3D model from available periodic survey data becomes essential. In cases where damage is identified, an accurate model of the structure's past state, with minimal noise, is crucial for understanding the extent of the damage, its severity, and the rate of progression. This highlights the importance of improving the accuracy of available point cloud data for CH [
23,
24,
25].
The accuracy of the model is influenced by specific photogrammetric constraints. One of the most significant factors impacting output accuracy in several studies is the angle formed between homologous rays captured by different cameras [
26,
27,
28]. In general, a larger angle (within a certain range) results in higher achievable accuracy. Kraus’s research demonstrates a direct proportional relationship between the Base/Height ratio and accuracy [
29]. While numerous studies have investigated models to improve point cloud accuracy [
30,
31], they often overlook the specific challenges of condition assessment. These studies primarily focus on optimizing the ideal datasets for accurately and efficiently reconstructing 3D models, without accounting for the practical limitations of condition assessment. In such scenarios, having the most accurate datasets takes precedence, even if creating a precise 3D model with the available data is not feasible. This paper aims to fill this gap by introducing a novel approach based on deep learning clustering models to optimize various SfM parameters, enhancing 3D reconstruction accuracy specifically for CH reconstruction and monitoring applications. Unlike traditional methods that focus on a single accuracy-related parameter, this approach simultaneously considers several calculated parameters within the latent space of a variational autoencoder model. This makes it possible to minimize the influence of outlier data or noise while uncovering the most significant patterns and structures in the data. Noise reduction is the process of eliminating random variations or irrelevant data points that do not contribute to the accurate representation of the object or scene in the data. In this approach, several AI models, which are typically used for outlier detection, are specifically employed to identify data points that deviate significantly from the general pattern or distribution of the dataset, thereby reducing noise.
To do so, first, different accuracy-related parameters are analyzed separately to demonstrate that relying on a single parameter is insufficient. Then, the proposed methodology, which applies four different clustering models in the latent space of a variational autoencoder (VAE), is implemented to enhance the accuracy of point cloud data and to identify the most effective clustering algorithm for accuracy enhancement. A case study is used to showcase the robustness of the new method.
The methodology presented in this study, combining VAE with clustering algorithms for improving the accuracy of point cloud data, has broad applicability across various fields. Accurate point clouds improve the accuracy of existing digital models of historical structures or infrastructures such as bridges, aiding in structural integrity assessments, conservation planning, and restoration efforts. More precise models guide restoration work, ensuring that interventions align with historical accuracy and preserve the integrity of built environments [
32,
33,
34]. Enhanced point clouds are a pivotal tool in geotechnical engineering and environmental monitoring, facilitating the analysis of slope stability, landslides, and other geological phenomena. Their application extends to tracking environmental changes, such as forest canopy dynamics and shoreline erosion. The increased precision in terrain and environmental modeling enhances safety protocols, supports the development of preventive measures, and aids in the sustainable management of natural resources and climate change mitigation efforts [
35,
36,
37,
38]. In disaster management and recovery, enhanced point clouds enable high-resolution damage assessments of infrastructure, including buildings and transportation networks, post-natural disasters. These assessments allow for the efficient prioritization of recovery operations and resource allocation, significantly reducing the time required for disaster response and rehabilitation planning [
39]. Enhanced point clouds are integral to object detection, environmental mapping, and navigation systems. They provide the high-fidelity spatial data necessary to improve situational awareness, reliability, and the overall safety of autonomous systems, ensuring optimal performance under real-world conditions [
40]. For 3D printing and additive manufacturing, enhanced point clouds provide the detailed geometric data required to fabricate electronic components such as antennas, sensors, and circuit boards. Their higher accuracy ensures that printed components adhere to precise design specifications, resulting in improved performance and quality in additive manufacturing processes [
41]. In component design and reverse engineering, point cloud data support the creation of detailed 3D models of electronic components, including connectors, enclosures, and housings. The precision afforded by enhanced point clouds accelerates the prototyping process, enables optimized design workflows, and facilitates the reverse engineering of existing products. By replicating intricate geometries with high fidelity, they allow for the comprehensive analysis and reproduction of original designs [
42].
This article is organized as follows: In
Section 2, a brief explanation of the various accuracy parameters studied is provided, along with the presentation of the new methodology for optimizing point cloud data.
Section 3 introduces a case study, where different parameters are analyzed separately to demonstrate their limitations in analyzing the model, and the robustness and the accuracy of the new method are presented. In
Section 4, the robustness and accuracy of the new method are discussed across different clustering algorithms. Finally, the conclusions are drawn in
Section 5.
2. Materials and Methods
The new method for optimizing the point cloud utilizes several accuracy parameters applied during both the acquisition phase and the image processing phase. The data related to these accuracy parameters are then analyzed using deep learning models, which cluster the optimized datasets. The dataset is obtained through photogrammetric processing in Agisoft Metashape [43], an SfM software package that processes digital image sets and produces numerous outputs such as point clouds, 3D models, orthophotos, contour lines, DEMs, and more. Some of the parameters used are geometric parameters related to the acquisition phase (intersection angle and number of images), while the remaining ones are numerical parameters extracted from the SfM processing and are, therefore, potentially dependent on the software used (reprojection error and projection accuracy).
2.1. Accuracy Parameters
2.1.1. Reprojection Error
The first parameter calculated is the reprojection error, a geometric error that represents the image distance between a projected point and its corresponding measured point. This error is used to evaluate how accurately a 3D point estimate replicates the true projection of the point. To compute the 3D coordinates of the tie point, the camera’s internal and external orientation parameters, along with the image coordinates of the point, are utilized. The reprojection error estimation can be seen in
Figure 1.
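As an illustration, the reprojection error can be sketched with a simple pinhole camera model: the estimated 3D tie point is projected back into an image using the camera's interior and exterior orientation, and the pixel distance to the detected key point is measured. This is a minimal example with illustrative values, not the actual SfM implementation, and it ignores lens distortion.

```python
import numpy as np

def reproject(X, K, R, t):
    """Project a 3D point X (world frame) into pixel coordinates using a
    pinhole camera with intrinsics K, rotation R and translation t."""
    x_cam = R @ X + t              # world -> camera coordinates
    x_img = K @ x_cam              # camera -> homogeneous image coordinates
    return x_img[:2] / x_img[2]    # perspective division -> pixel coordinates

def reprojection_error(X, K, R, t, p_measured):
    """Euclidean image distance between the projected and the measured point."""
    return float(np.linalg.norm(reproject(X, K, R, t) - p_measured))

# Illustrative camera and tie point (placeholder values only)
K = np.array([[2400.0, 0.0, 960.0],
              [0.0, 2400.0, 640.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                        # camera aligned with the world axes
t = np.array([0.0, 0.0, 5.0])        # camera 5 m away from the origin
X = np.array([0.4, -0.2, 0.0])       # estimated 3D tie point
p_meas = np.array([1152.3, 543.8])   # detected key point (pixels)
print(f"reprojection error: {reprojection_error(X, K, R, t, p_meas):.2f} px")
```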
2.1.2. The Angle Between Homologous Points
In this work, the Base/Height ratio is analyzed by estimating the angle between the two lines of sight that generate a 3D point, referred to as the intersection angle or the angle between homologous points, for the k-th tie point seen from two images i and i + 1 (see
Figure 2).
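A minimal sketch of this computation is shown below: the intersection angle is the angle between the two rays joining each camera's projection center to the tie point. The camera positions and the point are illustrative placeholders; for tie points seen in more than two images, the pairwise angles can be averaged.

```python
import numpy as np

def intersection_angle(cam_center_i, cam_center_j, tie_point):
    """Angle (degrees) between the two viewing rays that generate a tie point,
    i.e. the rays from each camera's projection center to the 3D point."""
    r1 = np.asarray(tie_point) - np.asarray(cam_center_i)
    r2 = np.asarray(tie_point) - np.asarray(cam_center_j)
    cos_a = np.dot(r1, r2) / (np.linalg.norm(r1) * np.linalg.norm(r2))
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

# Illustrative example: two cameras 4 m apart observing a point about 10 m away
c_i = np.array([0.0, 0.0, 10.0])
c_j = np.array([4.0, 0.0, 10.0])
p_k = np.array([2.0, 1.0, 0.0])
print(f"intersection angle: {intersection_angle(c_i, c_j, p_k):.1f} deg")
```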
2.1.3. Number of Images
Another estimated parameter is the number of images, which is the number of photogrammetric shots of the scene that have contributed to the reconstruction of the tie point in object space. This parameter is expressed as $n_{j}^{TP_i}$, the number of cameras used for the reconstruction of the $i$-th tie point.
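For illustration, this count can be obtained from a generic list of (tie point, camera) observations exported from the SfM project; the hypothetical observation list below is a placeholder, and the counting itself is independent of the software used.

```python
from collections import defaultdict

# Hypothetical observation pairs (tie_point_id, camera_id) exported from the project
observations = [(0, "IMG_001"), (0, "IMG_002"), (0, "IMG_005"),
                (1, "IMG_002"), (1, "IMG_003")]

cameras_per_tp = defaultdict(set)
for tp_id, cam_id in observations:
    cameras_per_tp[tp_id].add(cam_id)

# Number of cameras contributing to the reconstruction of each tie point
n_tp = {tp_id: len(cams) for tp_id, cams in cameras_per_tp.items()}
print(n_tp)   # {0: 3, 1: 2}
```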
2.1.4. Projection Accuracy
Another estimated parameter is projection accuracy, which allows us to recognize less reliable tie points. The projection accuracy parameter in Agisoft Metashape measures how precisely a tie point is positioned relative to its neighboring points within the point cloud. This precision is influenced by the scale at which the points were identified during processing. Metashape leverages scale information to adjust the weighting of reprojection errors for tie points, assigning higher or lower importance depending on the detail level at which the point is detected. The Sigma (σ) parameter determines the scale of key points, which represents the degree of Gaussian blur applied at a specific level of the scale pyramid. This parameter incorporates the local context of each point, affecting the treatment of reprojection errors and improving the robustness of the 3D reconstruction.
In essence, the projection accuracy parameter enhances the quality of the 3D model by balancing errors according to the resolution and scale at which the tie points are identified. This provides essential insights into the spatial consistency of the point cloud.
While the exact mathematical formula for projection accuracy in Metashape is proprietary, it aligns with the principles of photogrammetry and computer vision. The relationship can be summarized as follows:
$\mathrm{Error}_{proj} = w \cdot \lVert P_{meas} - P_{proj} \rVert$
where $\mathrm{Error}_{proj}$ represents the weighted reprojection error, $P_{meas}$ denotes the position of the detected point in the image (measured point), and $P_{proj}$ is the projected point's position calculated from the 3D model. The symbol $\lVert \cdot \rVert$ indicates the Euclidean distance between the measured and projected points. The parameter $w$ is the weight assigned to the reprojection error, determined by the scale (σ) of the SIFT level at which the tie point is detected.
In Metashape, the weight is proportional to the scale of the key point, which corresponds to the scale pyramid level where the point was identified. Points detected at higher pyramid levels (more detailed scales) contribute more significantly to the model’s computation. This approach ensures that points identified with greater local precision have a more substantial impact on projection and model optimization than those identified at coarser scales. By incorporating these principles, Metashape refines the 3D reconstruction process, emphasizing the spatial consistency and accuracy of the resulting model.
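Since the exact formula is proprietary, the following sketch only illustrates the idea of scale-dependent weighting; the inverse-scale weight w = 1/σ is an assumption for demonstration, not Metashape's actual formulation.

```python
import numpy as np

def weighted_reprojection_error(p_measured, p_projected, sigma):
    """Hypothetical weighting scheme: errors of key points detected at coarser
    scales (larger sigma) are down-weighted, mimicking the idea that finer-scale
    detections are more reliable. The 1/sigma weight is an assumption."""
    w = 1.0 / sigma
    return w * float(np.linalg.norm(np.asarray(p_measured) - np.asarray(p_projected)))

# For the same 2-pixel raw error, a key point found at a fine pyramid level
# (sigma = 1) counts more than one found at a coarse level (sigma = 4).
print(weighted_reprojection_error((100.0, 50.0), (102.0, 50.0), sigma=1.0))  # 2.0
print(weighted_reprojection_error((100.0, 50.0), (102.0, 50.0), sigma=4.0))  # 0.5
```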
2.1.5. Camera Distance—Tie Point
The last value taken into account in the analysis is the camera–tie point distance, which refers to the distance between the projection center of the i-th camera and the j-th tie point observed in the i-th image.
Except for the reprojection error and projection accuracy, the other accuracy parameters depend heavily on the image acquisition phase, causing their values to vary significantly between projects.
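For completeness, a minimal sketch of the camera–tie point distance (the coordinates are illustrative placeholders):

```python
import numpy as np

def camera_tie_point_distance(camera_center, tie_point):
    """Euclidean distance between the i-th camera's projection center
    and the j-th tie point it observes."""
    return float(np.linalg.norm(np.asarray(tie_point) - np.asarray(camera_center)))

print(camera_tie_point_distance([0.0, 0.0, 12.0], [2.5, -1.0, 0.0]))  # ~12.3 m
```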
2.2. Methodology
This article introduces a novel noise reduction method to optimize the 3D reconstruction models of CH and enhance the accuracy of damage detection models based on point cloud data. Rather than relying on a single accuracy-related parameter, the method simultaneously evaluates all calculated parameters. It develops datasets of the most accurate 3D points, considering the availability of nodes in the point cloud.
First, the new model uses a VAE to reduce the data dimensionality from five accuracy parameters to two synthetic features.
VAE is a type of neural network used for dimensionality reduction, feature extraction, and generative modeling. Like a traditional autoencoder, it consists of two parts: an encoder that maps the input data to a probabilistic latent space by learning the parameters of a probability distribution (typically a Gaussian) and a decoder that reconstructs the original input from a sampled latent representation. The VAE aims to learn an efficient representation of the data and ensure that the latent space follows a predefined probabilistic structure, enabling meaningful sampling and interpolation. The encoder and the decoder are defined as multilayer perceptrons (MLPs). A layer of the MLP encoder $E_F$ is
$E_F(X) = \sigma(WX + B)$
where σ is an element-wise activation function, W is a weight matrix, and B is a bias vector. The analyzed features for each data point (X) in the input dataset of the MLP model consist of five elements, representing the accuracy parameters detailed in the previous section. Each row corresponds to the geometry of a 3D point within the point cloud. In the latent space of the proposed model, the feature dimensions are reduced from the original five input columns to two features. Reducing the feature dimensions and leveraging the probabilistic nature of a VAE offers several advantages and enhances the applicability of the method, as outlined below:
- By compressing the data into a probabilistic latent space, the VAE not only reduces computational requirements but also facilitates sampling from the latent space, making it suitable for big data applications such as point cloud processing, which is the primary focus of this study. This improvement increases the model's scalability and versatility.
- The VAE transforms complex, diverse features from various factors into a smaller, cohesive set of probabilistic latent representations, improving interpretability and usability and enabling meaningful interpolations between data points.
- The latent space representation generated by the VAE simplifies the data, removes noise, and provides a structured probabilistic foundation, enhancing the performance of downstream tasks such as clustering and anomaly detection.
- The VAE's latent space enables the detection of meaningful patterns, including nonlinear and probabilistic relationships, that may not be apparent in the original dimensions. This feature allows for more insightful analysis and the generation of new synthetic data samples.
- Unlike traditional autoencoders, the VAE provides generative capabilities, enabling the creation of realistic new data samples from the latent space. This feature is particularly useful for augmenting datasets or exploring variations in the data. While dataset augmentation is not applied in this study, it represents a potential future direction for the authors' research.
- The VAE can be trained to handle missing data by learning the distribution of the data and reconstructing missing values. While this is not the focus of the current research, it represents a promising avenue for future work.
These factors collectively make the VAE an effective tool for enhancing the optimization of point cloud data for damage detection.
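A minimal sketch of such a VAE is shown below. It assumes PyTorch (this study reports only Python and scikit-learn and does not name the deep learning framework), and the hidden width, dropout rate, learning rate, and epoch count are placeholder values; the 5→2 compression, ReLU activations, batch normalization, dropout, and MSE + KL loss follow the architecture described here and in Section 3.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    """Minimal VAE: five accuracy parameters -> 2-D probabilistic latent space."""
    def __init__(self, in_dim=5, hidden=32, latent=2, dropout=0.1):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.BatchNorm1d(hidden),
            nn.ReLU(), nn.Dropout(dropout))
        self.mu = nn.Linear(hidden, latent)       # mean of q(z|x)
        self.logvar = nn.Linear(hidden, latent)   # log-variance of q(z|x)
        self.dec = nn.Sequential(
            nn.Linear(latent, hidden), nn.BatchNorm1d(hidden),
            nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(hidden, in_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    """Reconstruction loss (MSE) plus Kullback-Leibler divergence to N(0, I)."""
    rec = F.mse_loss(x_hat, x, reduction="mean")
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

# Training sketch on the standardized five accuracy parameters of the tie points
features = torch.randn(1024, 5)          # placeholder for the real feature matrix
model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(50):
    x_hat, mu, logvar = model(features)
    loss = vae_loss(features, x_hat, mu, logvar)
    opt.zero_grad(); loss.backward(); opt.step()

model.eval()
with torch.no_grad():                    # 2-D latent features used for clustering
    latent = model.mu(model.enc(features)).numpy()
```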
In this research, after applying the VAE, four main clustering machine-learning algorithms are employed in its latent space to compare and observe their robustness. The first algorithm is k-means clustering, which partitions data based on similarity. It operates by assigning each data point to the nearest cluster centroid and iteratively updating the centroids until convergence. To minimize the within-cluster variance, the objective is to find
$\arg\min_{S} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^{2}$
where $S$ represents the set of clusters, $k$ is the number of clusters, $\mu_i$ is the mean point of the $i$-th cluster, and $x$ denotes the data points. The k-means algorithm is suitable for applications where the number of clusters is optimized, making it ideal for point cloud optimization when performing full 3D reconstruction of an entire structure. In the context of the VAE's latent space, k-means can be effective for global damage detection, where the data are relatively well separated and the cluster centroids represent general patterns. However, k-means clustering assumes that clusters are spherical and of similar size, which may limit its effectiveness in more complex, non-linear data distributions often present in real-world datasets.
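Continuing the sketch above, the k-means step on the 2-D latent features can be reproduced with scikit-learn roughly as follows; the configuration mirrors the settings reported in Section 3, and `latent` is a random placeholder standing in for the encoder output.

```python
import numpy as np
from sklearn.cluster import KMeans

latent = np.random.rand(10000, 2)   # placeholder for the VAE's 2-D latent features

kmeans = KMeans(n_clusters=10, n_init=10, tol=1e-4, random_state=0)
labels_km = kmeans.fit_predict(latent)   # hard cluster assignment per tie point
centroids = kmeans.cluster_centers_      # general patterns in the latent space
```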
The Gaussian mixture model (GMM) allows for the creation of an optimized point cloud not only useful for full 3D reconstruction but also for detecting specific local damages. Moreover, it enables the assessment of global damage using a smaller, highly accurate subset of tie points. GMM is a probabilistic model that assumes the data are generated from a mixture of several Gaussian distributions. This method is particularly useful for data that may have overlapping clusters or complex distributions, as it allows for soft clustering where data points can belong to multiple clusters with varying probabilities. In the latent space of the VAE, GMM can be beneficial for detecting subtle variations in the data, but it may not perform as well as k-means or agglomerative clustering when the data are imbalanced or when the model is not well tuned. Despite this, GMM can still offer valuable insights for applications where the relationships between data points are more probabilistic and less deterministic. GMM with the formulation of the posterior distribution is given by
$p(z_n = j \mid x_n) = \dfrac{\phi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}{\sum_{l=1}^{k} \phi_l \, \mathcal{N}(x_n \mid \mu_l, \Sigma_l)}, \qquad n = 1, \ldots, N$
where ϕ and Σ are the mixture weights and covariance matrices, N is the number of observations, and k is the number of clusters.
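A corresponding GMM sketch, again with placeholder settings mirroring those reported in Section 3 (10 components, full covariance matrices); `predict_proba` exposes the soft assignments discussed above.

```python
from sklearn.mixture import GaussianMixture

gmm = GaussianMixture(n_components=10, covariance_type="full",
                      n_init=10, tol=1e-4, random_state=0)
labels_gmm = gmm.fit_predict(latent)   # most probable component per tie point
resp = gmm.predict_proba(latent)       # soft assignments (posterior probabilities)
```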
Spectral clustering is another method employed in this research to analyze the latent space. This algorithm is a graph-based clustering technique that uses the eigenvalues of a similarity matrix to reduce the dimensionality of the data before applying a clustering algorithm like k-means. This method is particularly effective for identifying non-linear relationships in the data, making it well-suited for complex datasets where clusters are not necessarily spherical. In the VAE’s latent space, Spectral clustering can capture more intricate patterns and relationships, especially when the data exhibit non-convex shapes or varying densities. It is particularly useful when the underlying structure of the data is complex, and traditional methods like k-means may fail to capture the nuances of the distribution. However, Spectral clustering can be computationally expensive, especially for large datasets, and requires careful selection of the similarity measure and the number of clusters. Its ability to leverage graph theory makes it particularly useful in point cloud processing when the relationships between points are non-linear or when identifying regions of interest within a complex structure. In addition, as it is able to perform soft clustering, it is a good choice for local damage detection.
Agglomerative hierarchical clustering is the fourth method considered in this research. This bottom-up approach starts by treating each data point as its own cluster and iteratively merges the closest clusters based on a chosen linkage criterion until a desired number of clusters is achieved or all points are merged into a single cluster. Agglomerative hierarchical clustering is particularly suited for datasets where the relationships between data points vary at different scales, such as point cloud data. In the VAE's latent space, agglomerative clustering can provide valuable insights into damage detection, especially when the clusters exhibit hierarchical or nested structures. However, its computational complexity increases with the size of the dataset, which can be a limitation for large-scale applications.
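The two remaining algorithms can be sketched in the same way. The nearest-neighbors affinity and the Ward linkage are assumptions chosen for illustration (this study does not report these choices), and spectral clustering is run on a subsample because its affinity matrix grows quadratically with the number of tie points.

```python
from sklearn.cluster import SpectralClustering, AgglomerativeClustering

# Spectral clustering: graph-based, able to capture non-convex cluster shapes
spectral = SpectralClustering(n_clusters=10, affinity="nearest_neighbors",
                              n_neighbors=30, assign_labels="kmeans",
                              random_state=0)
labels_sp = spectral.fit_predict(latent[:5000])   # subsample for tractability

# Agglomerative (bottom-up) hierarchical clustering with Ward linkage
agglo = AgglomerativeClustering(n_clusters=10, linkage="ward")
labels_ag = agglo.fit_predict(latent[:5000])
```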
The clustered data are then compared using evaluation metrics to analyze their robustness. Since there are no ground truth or labeled data available due to the nature of this study, external validation metrics cannot be applied. Therefore, three internal evaluation metrics are considered in this research.
The Silhouette Score is a measure of how similar each data point is to its cluster compared to other clusters. It combines both cohesion and separation. A higher Silhouette Score indicates better-defined clusters.
The Calinski–Harabasz Index measures the ratio of the sum of between-cluster dispersion to within-cluster dispersion. A higher value indicates better-defined clusters, with more separation between them.
The Davies–Bouldin Index evaluates the average similarity between each cluster and its most similar counterpart. A lower Davies–Bouldin score indicates better clustering, as it reflects smaller intra-cluster distances and larger inter-cluster distances.
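Continuing the sketches above, the three internal metrics are available directly in scikit-learn and can be computed for each labeling of the latent features (the variables `latent`, `labels_km`, and `labels_gmm` are the placeholders introduced earlier).

```python
from sklearn.metrics import (silhouette_score, calinski_harabasz_score,
                             davies_bouldin_score)

def internal_validation(X, labels):
    """Internal clustering metrics: higher Silhouette and Calinski-Harabasz
    scores and a lower Davies-Bouldin score indicate better-defined clusters."""
    return {"silhouette": silhouette_score(X, labels),
            "calinski_harabasz": calinski_harabasz_score(X, labels),
            "davies_bouldin": davies_bouldin_score(X, labels)}

for name, labels in {"k-means": labels_km, "GMM": labels_gmm}.items():
    print(name, internal_validation(latent, labels))
```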
For implementing the algorithm, the Python programming language (version 3.11.1) and the Scikit-learn library were used. A summary of the proposed methodology can be seen in
Figure 3.
3. Results
The case study for this work is the Temple of Neptune, a Greek temple located in Paestum, Campania, Italy. Constructed in the fifth century B.C.E., the temple features six front columns and fourteen side columns. As one of the three best-preserved temples in the Greek world, it was surveyed using aerial photogrammetry by UAV in 2017 (see
Figure 4).
The complex spatial articulation of the geometries makes the Temple of Neptune an ideal subject for evaluating the robustness of the new methodology. The UAV utilized for the survey was a hexacopter equipped with a three-axis gimbal and an Alpha 6500 camera (Sony Corporation, Tokyo, Japan), capturing a total of 908 photogrammetric images. A GNSS network with 11 Ground Control Points was incorporated to estimate the internal orientation parameters in Agisoft Metashape through a self-calibrating bundle adjustment. For the analysis, a standard section was selected, highlighted in red in
Figure 5.
Gujski et al. [
44] analyzed accuracy parameters independently to demonstrate that relying on a single parameter is ineffective for noise reduction in point cloud data. In their study, they considered the 90th percentiles for reprojection errors (see
Figure 6a), angles greater than 10° for the average intersection angle (see
Figure 6b), and the use of more than 10 cameras for reconstructing each 3D point (see
Figure 6c). They identified an optimal threshold for noise reduction at a projection accuracy of 10 (see
Figure 6d). While increasing this threshold further reduces noise, it comes at the cost of losing valuable data and compromising the overall data integrity. This leads to reduced cloud density, negatively impacting the reconstructed object’s descriptive quality. A visualization of the point cloud corresponding to single-parameter analysis is shown in
Figure 6.
To implement the new model, the data are first reduced to two dimensions using a VAE model. In the latent space of the encoder, clustering algorithms are applied. The hyperparameters of the VAE model used in this study are detailed in
Table 1. The VAE architecture incorporates a probabilistic framework to map input data to a latent space, enabling both dimensionality reduction and generative capabilities. The encoder and decoder networks are designed with intermediate layers that utilize the ReLU (Rectified Linear Unit) activation function. ReLU introduces non-linearity by outputting the input directly if it is positive and zero otherwise. This choice of activation function is computationally efficient and helps mitigate the vanishing gradient problem, ensuring effective training of the deep learning model. Batch normalization and dropout regularization are applied to improve generalization and prevent overfitting. The encoder compresses the input data into a two-dimensional latent space, optimized for visualization and clustering tasks. The loss function combines reconstruction loss (mean squared error) with the Kullback–Leibler divergence, which ensures that the learned latent space approximates a standard normal distribution. This probabilistic framework allows the VAE to generate meaningful representations and handle noise effectively.
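Written out, the loss takes the standard VAE form (the relative weighting of the two terms used in this study is not reported, so a unit weight is assumed here):
$\mathcal{L}(x) = \underbrace{\lVert x - \hat{x} \rVert^{2}}_{\text{reconstruction (MSE)}} + \underbrace{D_{\mathrm{KL}}\!\left(q(z \mid x)\,\Vert\,\mathcal{N}(0, I)\right)}_{\text{KL regularization}}, \qquad D_{\mathrm{KL}} = -\tfrac{1}{2}\sum_{d=1}^{2}\left(1 + \log\sigma_d^{2} - \mu_d^{2} - \sigma_d^{2}\right)$
where $\hat{x}$ is the reconstructed input and $\mu_d$ and $\sigma_d^{2}$ are the mean and variance produced by the encoder for the $d$-th latent dimension.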
It is important to note that all clustering algorithms are configured with identical parameters. Specifically, the number of clusters is set to 10, with 10 initializations and a tolerance of 1 × 10⁻⁴, which determines the stopping criterion for the algorithm. A lower tolerance value indicates a stricter convergence requirement. The covariance type is set to "full," meaning that each cluster is modeled with its full covariance matrix, offering greater flexibility in capturing the data's shape and fitting it to the assigned clusters. Additionally, the initial weight settings are defined to provide reasonable initial estimates for cluster assignments and distribution parameters, which contribute to the overall performance and stability of the clustering process. The hyperparameters for each clustering model are selected through a combination of random search and experimental analysis. The experimental analysis is conducted by systematically testing various hyperparameter configurations and selecting the ones that yield the best results based on clustering evaluation metrics. This approach ensures that the chosen hyperparameters are optimal for each model and dataset. The analysis of clustering algorithms reveals that the average values of the parameters differ significantly between the four methods and that the distribution of clusters produced by each algorithm varies when applied to the two features generated by the VAE model. The data points are clustered using four different clustering models, and the resulting clusters are depicted in
Figure 7, with each cluster represented by one of ten distinct color tones.
The comparison of cluster information is presented in
Table 2,
Table 3,
Table 4 and
Table 5. It shows that all clustering algorithms produce consistent results. The number of tie points identified in the clusters enhances the point cloud density while maintaining its quality, enabling a more detailed description of the object.
The results of clustering using GMM, k-means, agglomerative clustering, and Spectral clustering are evaluated across three key metrics, Silhouette Score, Calinski–Harabasz Index, and Davies–Bouldin Index, and can be seen in
Table 6. The evaluation metrics provide a quantitative basis for comparing clustering algorithms.
Silhouette Score: k-means and agglomerative clustering achieved the best scores, suggesting that they are more effective at identifying well-separated clusters.
Calinski–Harabasz Index: k-means achieved the highest score, indicating excellent inter-cluster separation.
Davies–Bouldin Index: the low Davies–Bouldin scores of k-means and agglomerative clustering confirm their ability to produce compact and distinct clusters.
The results indicate that k-means clustering is the most robust and effective method for analyzing the VAE latent space, followed closely by agglomerative hierarchical clustering. Both hard clustering methods outperform GMM and Spectral clustering in terms of cluster cohesion, separation, and overall quality. Spectral clustering can serve as a secondary choice for local damage detection with additional optimization as it is able to perform soft clustering, while GMM may not be appropriate without substantial modifications to its parameters or assumptions.
5. Conclusions
Given the lack of prior structural information, creating an accurate 3D model from available data is essential for point-cloud-based monitoring and condition assessment methods. This work proposes a novel methodology for reducing noise in tie point clouds, which is particularly valuable for the condition assessment and 3D reconstruction of cultural heritage sites. The proposed methodology is crucial for generating precise digital documentation, enabling effective comparison with the current conditions of the analyzed object and facilitating the identification of any new damage.
The proposed method introduces an innovative approach by utilizing a combination of multiple accuracy parameters rather than relying on a single metric. Initially, a variational autoencoder model reduces the five accuracy parameters to only two latent features, and in this latent space, four clustering algorithms are applied. This analysis enables the simultaneous consideration of multiple accuracy parameters, improving the overall effectiveness of noise reduction in point clouds. Additionally, this study investigates the impact of these four widely used clustering algorithms through several evaluation metrics, aiming to establish the most robust methodology for noise reduction. To validate the robustness and applicability of the proposed approach, the Temple of Neptune is employed as a case study, demonstrating its potential to preserve the accuracy and integrity of 3D reconstructions for cultural heritage sites. K-means and agglomerative hierarchical clustering methods show comparable average accuracy values across features. Spectral clustering follows these methods but offers additional advantages, such as the ability to perform soft clustering by capturing complex relationships in the data and handling non-linear boundaries more effectively.
Future directions for this work include extending the model’s application across diverse disciplines, integrating additional data sources, and refining the algorithms to handle more complex damage scenarios. The model demonstrates significant potential beyond cultural heritage, with applicability in fields such as civil engineering, urban planning, environmental monitoring, and autonomous systems, where enhanced point cloud data can greatly improve accuracy and inform decision-making processes. By improving the accuracy of these digital models, we aim to contribute to the long-term preservation and protection of valuable assets across various fields, ensuring that structures, environments, and systems can be accurately assessed, maintained, and optimized for future generations. Furthermore, implementing a Siamese neural network is proposed for future research to enhance damage detection across various fields. This approach will allow for the comparison of point cloud datasets captured at different times from the same location, enabling the analysis of temporal changes to identify structural alterations, detect new damage, and monitor ongoing deterioration effectively in contexts such as cultural heritage, civil engineering, urban planning, and environmental monitoring.