research-article

Neurofuzzy semantic similarity measurement

Authors:

Jorge Martinez-Gil,

Abdelkader Hameurlain

Volume 145, Issue C

https://doi.org/10.1016/j.datak.2023.102155

Published: 01 May 2023

Abstract

Automatically identifying the degree of semantic similarity between two short pieces of text has grown in importance in recent years. Its impact on various computer-related domains, together with recent breakthroughs in neural computation, has increased the opportunities for developing better solutions. This work contributes a neurofuzzy approach to semantic textual similarity that combines neural networks and fuzzy logic. The idea is to pair the ability of deep neural models to work with text with the ability of fuzzy logic to aggregate numerical data. The results of our experiments suggest that such an approach can accurately determine semantic similarity.
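
As a rough illustration of the aggregation idea described in the abstract, the following sketch combines two similarity scores with a small zero-order Sugeno-style fuzzy rule base. It is not the authors' implementation: the linear membership functions, the four rules and their output values, and the names `memberships`, `RULES`, and `fuzzy_aggregate` are all assumptions made for this example. The two input scores are assumed to come from two neural text encoders (for instance, as cosine similarities in [0, 1]); the encoders themselves are not shown.

```python
def clamp01(x: float) -> float:
    """Clip a score into the [0, 1] interval."""
    return max(0.0, min(1.0, x))


def memberships(score: float) -> dict:
    """Degrees to which a score belongs to the fuzzy sets 'low' and 'high'.

    Assumption: simple linear membership functions; a learned system
    would tune their shapes from data.
    """
    s = clamp01(score)
    return {"low": 1.0 - s, "high": s}


# Hypothetical rule base: (label for score A, label for score B) -> crisp output.
RULES = {
    ("high", "high"): 1.0,  # both models agree the texts are similar
    ("high", "low"): 0.5,   # the models disagree
    ("low", "high"): 0.5,   # the models disagree
    ("low", "low"): 0.0,    # both models agree the texts are dissimilar
}


def fuzzy_aggregate(score_a: float, score_b: float) -> float:
    """Aggregate two similarity scores with a zero-order Sugeno-style system."""
    mu_a = memberships(score_a)
    mu_b = memberships(score_b)
    numerator = 0.0
    denominator = 0.0
    for (label_a, label_b), consequent in RULES.items():
        # Fuzzy AND is taken as the minimum of the two membership degrees.
        strength = min(mu_a[label_a], mu_b[label_b])
        numerator += strength * consequent
        denominator += strength
    # With complementary 'low'/'high' sets, at least one rule always fires,
    # so the denominator is strictly positive.
    return numerator / denominator


if __name__ == "__main__":
    # Example scores, assumed to come from two different neural encoders.
    print(fuzzy_aggregate(0.8, 0.6))
```

In a neurofuzzy setting, the membership functions and rule outputs would typically be fitted to human similarity judgments rather than fixed by hand as they are here.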

Cited By

Zhao B., Zhang R., Bai K. (2024), A Fuzzy Multigranularity Convolutional Neural Network With Double Attention Mechanisms for Measuring Semantic Textual Similarity, IEEE Transactions on Fuzzy Systems 32 (10), 5762-5776, doi: 10.1109/TFUZZ.2024.3427801. Online publication date: 1-Oct-2024
https://dl.acm.org/doi/10.1109/TFUZZ.2024.3427801

Index Terms

Neurofuzzy semantic similarity measurement
1. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
    2. Natural language processing
  2. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
        Cluster analysis
    2. Machine learning approaches
      1. Neural networks

Index terms have been assigned to the content through auto-classification.

Published In

Data & Knowledge Engineering, Volume 145, Issue C

May 2023

534 pages

ISSN: 0169-023X

Elsevier B.V.

Publisher

Elsevier Science Publishers B. V., Netherlands
