Fusing multi-scale fuzzy information to detect outliers

Published: 04 March 2024

Abstract

Outlier detection aims to find objects that behave differently from the majority of the data. Existing unsupervised approaches often process data at a single scale, which may fail to capture the multi-scale nature of the data. In this paper, we propose a novel information fusion model based on multi-scale fuzzy granules and an unsupervised outlier detection algorithm based on fuzzy rough set theory. First, a multi-scale information fusion model is formulated based on fuzzy granules. Then, fuzzy approximations are employed to define the outlier factor of the multi-scale fuzzy granules centered at each data point. Finally, the outlier score is calculated by aggregating the outlier factors of a set of multi-scale fuzzy granules. Experimental results demonstrate that the proposed method is comparable with or better than leading outlier detection methods. The code and datasets are publicly available at https://github.com/ChenBaiyang/MFIOD.
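
The three steps described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration and is not the authors' implementation: the fuzzy relation (a linear similarity cut off at a scale radius), the per-granule factor (one minus the relative fuzzy cardinality, used here as a simple stand-in for the fuzzy-approximation-based factor), the averaging used for aggregation, and the names fuzzy_similarity, granule_outlier_factor, and outlier_scores are all hypothetical.

```python
from statistics import mean


def fuzzy_similarity(x, y, scale):
    """Assumed fuzzy relation: similarity falls linearly with the mean
    absolute difference and is cut to 0 once it exceeds the scale radius."""
    distance = mean(abs(a - b) for a, b in zip(x, y))
    return max(0.0, 1.0 - distance / scale)


def granule_outlier_factor(data, i, scale):
    """Assumed factor for the fuzzy granule centred at data[i]:
    1 minus its relative fuzzy cardinality, so sparse granules score high."""
    memberships = [fuzzy_similarity(data[i], y, scale) for y in data]
    return 1.0 - sum(memberships) / len(data)


def outlier_scores(data, scales=(0.2, 0.5, 1.0)):
    """Fuse the scales by averaging the per-scale factors of every point
    (averaging is an assumed aggregation, chosen for simplicity)."""
    return [
        mean(granule_outlier_factor(data, i, s) for s in scales)
        for i in range(len(data))
    ]


# Toy data in [0, 1]: four clustered points and one distant point.
data = [[0.10, 0.20], [0.12, 0.18], [0.11, 0.22], [0.09, 0.19], [0.95, 0.90]]
scores = outlier_scores(data)
```

On this toy data the distant fifth point receives the highest score, since its granule contains little besides itself at every scale. The published method and its experiments should be taken from the paper and the linked repository rather than from this sketch.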

Highlights

A novel information fusion model based on multi-scale fuzzy granules is formulated.
An unsupervised outlier detection algorithm that integrates multi-scale information is proposed.
Extensive experiments indicate the effectiveness and advantages of our method over several state-of-the-art methods.

Published In

Information Fusion, Volume 103, Issue C
Mar 2024
918 pages

Publisher

Elsevier Science Publishers B. V.
Netherlands

Author Tags

1. Outlier detection
2. Information fusion
3. Fuzzy rough sets
4. Multi-scale fuzzy granules
5. Fuzzy approximation
