1. Introduction
The Earth’s land surface is a dynamic canvas on which human beings and natural systems are always interacting [1]. Land-use–land-cover (LULC) classification and its dynamics, which partially result from land-surface processes, have considerable effects on biotic diversity, soil degradation, terrestrial ecosystems, and the ability of biological systems to support human needs. These changes also have consequences for the radiation budget, resulting in profound effects on regional and global climates [2,3,4]. Thus, land-cover classification and its dynamics is an important field in environmental-change research at different scales. The efficient assessment and monitoring of land-cover changes are indispensable to advance our understanding of the mechanisms of change and model the effects of these changes on the environment and associated ecosystems at different scales [5,6,7,8,9,10].
Remote-sensing techniques represent some of the most effective tools to obtain information on LULC classification and dynamics (i.e., temporal-spatial changes and the transformation of landscapes) [7,11,12,13]. Many methods can detect land-cover changes based on optical and radar imagery with different spatial and spectral resolutions [14,15,16,17,18,19,20,21,22,23,24]. Existing techniques for land-cover classification can be broadly grouped into three general types: (1) supervised classification algorithms, such as the maximum likelihood, minimum distance, spectral angle mapping, and support vector machine methods, which employ labeled training data or spectral measurements and ground-cover classes of interest; (2) unsupervised classification methods, such as iterative self-organizing data analysis (ISODATA) techniques and k-means, which classify land-cover types without prior knowledge of the ground-cover classes of interest; and (3) combinations of supervised and unsupervised classification algorithms, which account for the remaining methods. These methods share an important assumption: a pixel can be classified into only one category, so the relationship between a pixel and a type is strictly one-to-one.
In some Boolean classification methods, e.g., the artificial neural network (ANN) method, the number of output nodes corresponds to the number of pattern classes during training, and the output node that corresponds to the class of the training pattern vector is set to “1”, whereas all other output nodes are set to “0” [25]. In many hazardous situations, however, classes are often fuzzy or ill defined. Thus, most traditional classifiers fail to provide an adequate representation of the relationship between a pattern vector and its ‘belongingness’ to a particular class [26]. In this respect, an image pixel that corresponds to a ground entity often does not represent only one category. Instead, the pixel corresponds to a mixture of two or more categories because of the resolution of remote-sensing images and other factors. For example, a Thematic Mapper (TM) cell with dimensions of 30 m × 30 m that covers a portion of a residential area may include houses, meadows, and roads. If this cell is assigned a single type (house, grass, or road), the classification will contain significant errors. Classification algorithms that are based on fuzzy sets have been demonstrated to be more appropriate for land-cover change dynamics than most traditional Boolean classification algorithms [27,28,29].
A large amount of high-spatial-resolution and hyperspectral remotely sensed data is becoming available owing to the rapid development of satellite and sensor technology. The above-mentioned supervised and unsupervised classification methods can swiftly obtain clustering information from such data and thus play an important role in remote-sensing applications. Recently, for classification based on high-spatial-resolution and hyperspectral remote-sensing data, many machine-learning algorithms, such as neural networks (NNs), support vector machines (SVMs), and decision trees, have been applied to remotely sensed images [30,31,32]. However, most existing research follows the traditional paradigm of pattern recognition, in which image clustering consists of two steps: first, complex handcrafted features are extracted from the raw input data; second, the extracted features are used to learn classifiers. It is rarely known, however, which features are important for the classification process because of the high diversity of the depicted materials. Furthermore, for larger datasets and very large remotely sensed images with very high spectral and spatial resolution, deep learning methods or frameworks seem to address the classification problem more effectively. Recent deep-learning techniques, such as the convolutional neural network (CNN) and autoencoder (AE) methods, have shown promising results for the classification of hyperspectral data [33,34,35,36].
During image segmentation with fuzzy classification, a record of the degree to which each pixel belongs to a given cluster is retained [37]. Traditional clustering algorithms, such as fuzzy c-means (FCM), kernel FCM, and k-means, are all type-I classification methods. Most of these methods quantify the degree of similarity between the data points, and the corresponding membership degree, based on the Euclidean distance [29,38,39]. Spatial information has been incorporated into FCM methods to improve the segmentation of remote-sensing imagery in the presence of noise [40,41]. A novel semi-supervised fuzzy c-means (RSFCM) classification method was proposed to detect an increased proportion of changes and suppress noise through the synergistic exploitation of pseudo labels from difference images and spatial information [39]. An adaptive spatial-information-based fuzzy clustering method for image segmentation, which addresses sensitivity to noise and the lack of spatial information, has proven helpful in improving the robustness of traditional FCM methods [42].
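The Euclidean-distance-based membership computation described above can be illustrated with a minimal sketch of the standard type-I FCM update. It is an illustration under stated assumptions rather than the implementation used in this study: the function names `fcm_memberships` and `fcm_centers`, the default fuzzifier `m = 2.0`, and the small constant `eps` are all hypothetical choices.

```python
import numpy as np


def fcm_memberships(pixels, centers, m=2.0, eps=1e-12):
    """Standard type-I FCM membership update (illustrative sketch).

    pixels:  (n, bands) array of pixel spectra
    centers: (c, bands) array of cluster centers
    returns: (n, c) array of memberships; each row sums to 1
    """
    # Euclidean distance from every pixel to every center, shape (n, c).
    dist = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    dist = np.maximum(dist, eps)  # guard against a pixel sitting on a center
    # u_ik = d_ik^(-2/(m-1)) / sum_j d_ij^(-2/(m-1))
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fcm_centers(pixels, u, m=2.0):
    """Membership-weighted cluster centers for the next iteration."""
    w = u ** m  # (n, c) weights
    return (w.T @ pixels) / w.sum(axis=0)[:, None]
```

Alternating these two updates until the centers stop moving gives the usual FCM iteration; a pixel that is equidistant from two centers receives a membership of 0.5 in each.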
However, conventional type-I classification methods, including the FCM, kernel FCM, and k-means approaches, often display suboptimal performance when applied to data that exhibit complex geometry because they fail to handle and quantify uncertainty when determining their membership functions [43]. In contrast, the concept of a type-II fuzzy set (TIIFS) was first introduced by [43] as an extension of the concept of an ordinary fuzzy set (henceforth called a type-I fuzzy set (TIFS)). The membership degree of a TIFS is crisp, whereas a TIIFS is a “fuzzy-fuzzy set” because its membership degrees are themselves fuzzy. Therefore, TIIFSs are particularly useful when determining an exact membership function for a fuzzy set is difficult; hence, TIIFSs have unique advantages in characterizing the uncertainty in hyperspectral image data that arises from the sensors and other environmental factors, including the weather conditions. These sets are used for image classification via the interval type-II fuzzy c-means (IT2FCM) method [44,45,46,47].
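To make the idea of an interval-valued membership concrete, the following sketch uses one common construction from the interval type-II FCM literature: two fuzzifiers produce two type-I memberships, and their pointwise minimum and maximum are taken as the lower and upper membership degrees. This construction, the fuzzifier values `m1` and `m2`, and the function names are assumptions made for illustration; the text above does not specify how the IT2FCM variant in this study builds its intervals.

```python
import numpy as np


def type1_membership(dist, m):
    """Type-I FCM memberships from an (n, c) matrix of distances."""
    inv = np.maximum(dist, 1e-12) ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def interval_memberships(dist, m1=1.5, m2=3.0):
    """Lower and upper membership degrees from two fuzzifiers (assumed).

    Returns two (n, c) arrays with lower <= upper elementwise.
    """
    u1 = type1_membership(dist, m1)
    u2 = type1_membership(dist, m2)
    lower = np.minimum(u1, u2)
    upper = np.maximum(u1, u2)
    return lower, upper
```

The width `upper - lower` is zero for a pixel equidistant from all centers and grows where the choice of fuzzifier matters most, which is one way to read the "fuzzy-fuzzy" character of a type-II set.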
At present, very few studies of land-cover classification have employed FCM based on TIIFSs, especially with hyperspectral images. A fuzzy number is a generalization of a regular real number: rather than referring to a single value, it refers to a connected set of possible values. Such behavior is common in nature; in particular, the spectra of geographical features on the surface behave similarly. The spectrum of one geographic feature can be considered a connected set of possible, similar spectral curves, that is, a spectrum with a certain width, similar to a band. Existing FCM methods based on TIIFSs, e.g., interval type-II fuzzy c-means (IT2FCM), fail to consider the width of such bands. They rank only the average of the upper and lower membership degrees to determine whether the pixel under consideration belongs to a specific class, and they never consider possibility-based interval-number ranking.
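Possibility-based interval-number ranking can be sketched with a widely used possibility-degree formula for comparing two intervals. The specific formula, the function name `possibility_geq`, and the handling of zero-width intervals are assumptions for illustration only; the ranking rule actually adopted by the improved algorithm is defined in the methods section, which is not reproduced here.

```python
def possibility_geq(a, b):
    """Possibility degree that interval a = (lo, hi) is >= interval b.

    Uses p = min(1, max(0, (a_hi - b_lo) / (len(a) + len(b)))),
    an assumed, commonly cited form; returns a value in [0, 1].
    """
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    width = (a_hi - a_lo) + (b_hi - b_lo)
    if width == 0:  # both intervals are crisp numbers
        if a_lo > b_lo:
            return 1.0
        return 0.5 if a_lo == b_lo else 0.0
    return min(1.0, max(0.0, (a_hi - b_lo) / width))
```

For overlapping intervals the two directions are complementary, so `possibility_geq(a, b) + possibility_geq(b, a)` equals 1. Unlike a comparison of averaged membership degrees, the result is a graded degree that depends on the interval widths.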
Hence, based on the above analysis, this paper proposes an improved interval type-II fuzzy c-means algorithm, called IT2FCM*, which improves on IT2FCM by incorporating interval-number ranking methods, interval-number distances, and a water index to address the uncertainties in hyperspectral remote-sensing imagery clustering. This is the main objective of this study. To validate the separability of the IT2FCM* algorithm against two other fuzzy methods, FCM and IT2FCM, four clustering validity indexes are used: the partition coefficient index (PC), the Fukuyama and Sugeno index (FS), the Xie and Beni index (XB), and the partition entropy (PE). These validity indexes are calculated for the FCM, IT2FCM, and IT2FCM* algorithms on remotely sensed datasets with different spectral and spatial resolutions. As a second objective of this paper, the variation of the index values is analyzed comparatively to assess the performance of these three fuzzy clustering algorithms.
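The four validity indexes named above can be computed from a membership matrix and the cluster centers as in the sketch below. It follows the textbook definitions of PC, PE, XB, and FS; the function name `validity_indexes`, the natural logarithm in PE, and the fuzzifier `m = 2.0` are assumptions, and the exact variants used in this study may differ.

```python
import numpy as np


def validity_indexes(x, u, v, m=2.0):
    """Textbook PC, PE, XB, and FS for a fuzzy partition (sketch).

    x: (n, bands) data, u: (n, c) memberships, v: (c, bands) centers.
    """
    n = x.shape[0]
    # Partition coefficient: mean squared membership (higher is crisper).
    pc = np.sum(u ** 2) / n
    # Partition entropy with the natural log (lower is crisper).
    pe = -np.sum(u * np.log(np.clip(u, 1e-12, 1.0))) / n
    # Squared distances between pixels and centers, shape (n, c).
    d2 = np.sum((x[:, None, :] - v[None, :, :]) ** 2, axis=2)
    um = u ** m
    # Xie-Beni: compactness over n times the minimum center separation.
    center_d2 = np.sum((v[:, None, :] - v[None, :, :]) ** 2, axis=2)
    np.fill_diagonal(center_d2, np.inf)
    xb = np.sum(um * d2) / (n * center_d2.min())
    # Fukuyama-Sugeno: compactness minus separation from the grand mean.
    spread = np.sum((v - x.mean(axis=0)) ** 2, axis=1)
    fs = np.sum(um * (d2 - spread[None, :]))
    return {"PC": pc, "PE": pe, "XB": xb, "FS": fs}
```

Under these definitions a perfectly crisp partition gives PC = 1 and PE = 0, whereas smaller XB and FS values indicate more compact and better-separated clusters.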
3. Experimental Results
Because the improved IT2FCM* algorithm is proposed for clustering hyperspectral remotely sensed imagery, it is necessary to use well-known hyperspectral datasets such as the Pavia University dataset and the Washington HYDICE dataset. Both of these are airborne remote-sensing datasets. To test the separability of the proposed IT2FCM* algorithm on satellite hyperspectral data, the EO-1 Hyperion satellite dataset is also used in this study because it can be downloaded free of charge from the official NASA website. These datasets were used to test the classification accuracy of FCM, IT2FCM, and IT2FCM*. In this section, the membership values are first calculated with the IT2FCM* algorithm before the images are classified. Using the interval-number-ranking technique based on the fuzzy membership values of the different land-cover types, the results with the optimal fuzzy membership values are reported; the corresponding classification results for the three remotely sensed datasets were then produced with the FCM, IT2FCM, and IT2FCM* algorithms.
3.1. Images of Membership Values of Different Classes
In this part, membership-value maps are produced from the above three remotely sensed datasets with the IT2FCM* algorithm; here, the 191-band hyperspectral HYDICE dataset with a spatial resolution of 3 m is taken as the example, and its membership values are calculated. The maximum and minimum membership values of the different classes are shown in Figure 4. The maximum membership value maps (see Figure 4(a2,b2,c2,d2,e2,f2)) show that almost all of the classes are well classified. Even the minimum fuzzy membership values of these classes (see Figure 4(a1,b1,c1,d1,e1,f1)) are not close to 0. In fact, the maximum fuzzy membership values of all classes shown in Figure 4 are close to 1 in all maps. This result indicates that all of these land-cover types were clearly differentiated.
3.2. HYDICE Dataset and Classification Results
The Hyperspectral Digital Imagery Collection Experiment (HYDICE) hyperspectral dataset is a 191-band raw digital number hyperspectral image. The study area is the Washington D.C. Mall area in the U.S.A. This dataset was collected by the HYDICE sensor on 23 August 1995 (see Figure 5a). The HYDICE instrument is an airborne push-broom sensor system that operates within the spectral range from 400 to 2500 nm with 210 spectral bands. The spectral resolution of the HYDICE sensor is approximately 10 nm [66]. After several noisy bands were removed, the final image contained 191 spectral bands [40]. For more information, see [67]. The land cover within the study area was classified into six types by using the improved IT2FCM* algorithm. These types were sparse grassland, dense grassland, trees, bare soil and buildings, roads, and shadows (see Figure 5b). As mentioned in the preceding section, water is masked; during post-classification processing, water was therefore added back to the classification results, so the results for the study area were organized into seven classes: sparse grassland, dense grassland, trees, water, bare soil and buildings, roads, and shadows.
To validate the land-cover classification results based on the IT2FCM* algorithm, a ground truth dataset was collected; the testing image and reference data are provided in Figure 5b. The confusion matrix was calculated based on regions of interest (ROIs). The accuracy of the classification results (see Figure 5c–e) based on the FCM, IT2FCM, and IT2FCM* algorithms was calculated from these matrices (see Table 2). We also calculated the overall accuracy (OA) and kappa coefficient (KC) from the confusion matrix. The results showed that, when using the image with 191 bands, the overall accuracy was 96.2% and the kappa coefficient was 0.95. Moreover, we conducted a comparative analysis of the classification results with higher-spatial-resolution aerial image data to further show the performance of the improved IT2FCM* algorithm.
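For readers unfamiliar with these two measures, the sketch below computes OA and KC from a confusion matrix using their standard definitions. The function name `overall_accuracy_and_kappa` and the 2 × 2 matrix in the usage note are hypothetical and are not taken from Table 2.

```python
import numpy as np


def overall_accuracy_and_kappa(cm):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    cm[i, j] counts reference class i samples assigned to class j.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    # Observed agreement: fraction of samples on the diagonal.
    po = np.trace(cm) / total
    # Chance agreement from the row and column marginals.
    pe = np.sum(cm.sum(axis=0) * cm.sum(axis=1)) / total ** 2
    kappa = (po - pe) / (1.0 - pe)
    return po, kappa
```

For a made-up matrix `[[45, 5], [10, 40]]`, the observed agreement is 0.85 and the chance agreement is 0.5, giving a kappa of 0.7.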
A comparison of the three classification maps shows that the map based on the improved IT2FCM* algorithm is the finest. Within the region marked by the white circle, the FCM and IT2FCM results mistake some dense grassland for trees, whereas the IT2FCM* result better fits the actual situation. Within the regions marked by the white rectangle, the FCM and IT2FCM algorithms classify more pixels as shadows than the IT2FCM* algorithm does, because some parts of dark buildings are mistaken for shadows; typically, shadows are also overestimated along some parts of the roads by the FCM and IT2FCM methods. Within the regions marked by the white diamond, the FCM and IT2FCM algorithms mistake some parts of roads for buildings. Thus, in general, the classification results indicate that the IT2FCM* algorithm performs best on the hyperspectral HYDICE dataset.
3.3. Pavia University Dataset and the Classification Results
The Pavia University dataset was captured by the reflective optics spectrographic imaging system (ROSIS) airborne instrument over the city of Pavia (see Figure 6a). This instrument has 115 spectral channels covering the spectral region from 0.43 to 0.86 μm, and the spatial resolution is 1.3 m per pixel. Because of noise, 12 channels were removed; the remaining 103 bands were processed further, and atmospheric correction was applied [31,68]. This airborne dataset covers an area of the Engineering School of Pavia University and contains nine classes: asphalt, bitumen, metal sheet, gravel, bricks, soil, shadow, meadow, and trees.
The Pavia University hyperspectral remote-sensing dataset was classified with the three fuzzy clustering methods; the results are shown in Figure 6c–e. In Figure 6, in comparison with Figure 6d and 6e, the accuracy of the FCM method is much lower than that of the other two fuzzy algorithms. To evaluate the classification results, the ground truth dataset (see Figure 6b) was used, and the confusion matrix was calculated. Then, the accuracy was estimated based on the confusion matrices (see Table 3). Table 3 shows that the accuracy of the IT2FCM* algorithm is the highest among the three methods, and almost all of the per-class accuracy values are notably higher than those of the other two fuzzy methods. This result indicates that the IT2FCM* algorithm performs well on the Pavia University hyperspectral remote-sensing data.
3.4. Hyperion Dataset and Classification Results
This section introduces a satellite hyperspectral remote-sensing image, the Hyperion image, acquired by the Hyperion instrument on board the EO-1 satellite. Because the dataset is free of charge and easy to download, the Hyperion hyperspectral image of a study area in Tianjin, north China, is used. The Hyperion instrument is a high-resolution hyperspectral imager capable of resolving 242 spectral bands ranging from 0.4 to 2.5 μm with a 30 m resolution. This instrument images a 7.5 km × 100 km surface area [69]. During the processing of the Hyperion image, after the removal of bad spectral bands, calibration, and atmospheric and geometric correction, about 179 bands remained. The related algorithms for calibration and for atmospheric and geometric correction are provided in [70,71]. From the 179-band image, a further reduced set of ‘stable’ bands could be selected for further analysis. The basis for selecting these ‘stable’ Hyperion bands and the set of stable bands are provided by [72].
Figure 7a shows the study area. The main land-cover types of this study area are water, grassland, cropland, bare soil, and impervious surface. To evaluate the accuracy of the classification results, ground truth data were collected (see Figure 7b).
The landscape of the study area was classified into water, grassland, cropland, bare soil, and impervious surface from the Hyperion image by using the FCM, IT2FCM, and IT2FCM* algorithms; the results are shown in Figure 7c–e. To estimate the accuracy of the results from these three fuzzy methods, the confusion matrices were calculated, and the accuracy of each class, the OA, and the KC were measured based on these confusion matrices (see Table 4). The table shows that the accuracy of IT2FCM* is higher than that of the other two fuzzy methods and is a great improvement over the accuracy of the FCM method.
5. Conclusions
An improved IT2FCM* algorithm based on type-II fuzzy sets was developed in this paper. This algorithm is intended for remote-sensing image classification based on hyperspectral datasets. The improved type-II fuzzy approach considers the ranking of interval numbers and the handling of spectral uncertainty, which distinguishes it from other fuzzy methods, such as FCM and IT2FCM, and from traditional supervised classification methods. These features improve the separability and accuracy of the new method relative to traditional methods, as the results demonstrate. The membership values calculated with the IT2FCM* method show that this algorithm separates the different land-cover classes better. For the Washington HYDICE, Pavia University, and EO-1 Hyperion hyperspectral images, the classification accuracy of the IT2FCM* algorithm is higher than that of the FCM and IT2FCM methods. The results also show that the improved IT2FCM* algorithm performs best among these three fuzzy clustering methods because its separability yields a finer output image of the different land-cover types. To compare the performance of the FCM, IT2FCM, and improved IT2FCM* algorithms and to test their consistency on hyperspectral datasets with different spectral and spatial resolutions, four fuzzy validity indexes were introduced. In general, compared with the other two fuzzy methods, the values of PC, FS, and XB from the improved IT2FCM* algorithm improved significantly, and the value of PE changed slightly. This not only demonstrates good consistency between the IT2FCM* algorithm and the FCM and IT2FCM methods, but also shows that the improved IT2FCM* algorithm performs better in image clustering than the other two fuzzy methods. After all, IT2FCM* inherits from and extends IT2FCM.
Generally, the improved IT2FCM* classification approach showed better separability and accuracy than the traditional FCM and IT2FCM methods. The quantitative performance indexes and graphical outputs demonstrated that the improved IT2FCM* approach significantly outperformed the competing classifiers and is therefore a superior alternative for hyperspectral image classification in future research and corresponding applications. However, computational efficiency remains a problem to be resolved in the future. Traditional hard classification methods do not need to consider fuzzy membership at all, whereas the FCM method must consider the fuzzy membership during image clustering, and the IT2FCM and IT2FCM* algorithms, which are based on type-II fuzzy sets, must consider both the upper and lower membership degrees. Thus, the computational complexity of the IT2FCM methods is higher than that of ordinary FCM. Consequently, the IT2FCM and IT2FCM* algorithms are less computationally efficient than FCM, which in turn is less efficient than traditional hard classification methods. In addition to the unique advantages of using remote-sensing techniques and hyperspectral remotely sensed datasets for land-cover classification, we must be aware of the deficiencies and limitations of this method to make better use of satellite remote-sensing data. Although satellite-based remote sensing cannot provide information at the high level of detail that is possible in field surveys, this approach offers researchers an alternative for land-cover classification that can continuously monitor land-surface change dynamics over a large or local area and provide valuable, necessary, and complementary information.
In a future study, the IT2FCM* algorithm will be applied to more multi/hyperspectral remote-sensing datasets, and its performance with different datasets will be further evaluated. This approach will also be applied to long time series of satellite or aerial remotely sensed datasets to further test its separability and to provide an alternative method for spatiotemporal land-cover classification. In addition, other materials should be considered to make better use of multi/hyperspectral remote-sensing techniques for monitoring LULC dynamics based on image classification. Aerial photographs or more field-survey data for additional years should be collected to strengthen and evaluate the results, which is especially important for remote-sensing image clustering, although these data are rarely collected because of economic limitations and a lack of necessary equipment. Because of the computational complexity of interval type-II fuzzy sets, the computational complexity of the IT2FCM methods is higher than that of ordinary FCM. Improving their computational efficiency is very important, and we will study this problem in the next step. Recently, many researchers have used the nearest-neighbor method combined with spatial information to optimize the IT2FCM algorithm, but the scale effect has never been considered. Therefore, in future work, the scale effect of the surface will be considered, and spatial information will be utilized to further optimize the IT2FCM* algorithm.