research-article

Neurofuzzy semantic similarity measurement

Published: 01 May 2023

Abstract

Automatically identifying the degree of semantic similarity between two short pieces of text has grown in importance in recent years. Its impact on various computer-related domains, together with recent breakthroughs in neural computation, has increased the opportunities for developing better solutions. This work contributes a neurofuzzy approach to semantic textual similarity that combines neural networks and fuzzy logic. The idea is to pair the ability of deep neural models to work with text with the ability of fuzzy logic to aggregate numerical data. The results of our experiments suggest that such an approach can accurately determine semantic similarity.
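The aggregation idea described in the abstract can be illustrated with a small sketch. This is not the authors' model: it assumes that two or more hypothetical neural encoders have already produced similarity scores in [0, 1] for the same text pair, and it combines them with a hand-written zero-order Takagi-Sugeno style rule base. The labels, the membership functions, and the rule consequents below are illustrative assumptions, and in a real neurofuzzy system they would typically be learned from data.

```python
from itertools import product

# Hypothetical linguistic labels: (membership function, representative value).
# On [0, 1] the three membership functions sum to 1 at every point.
LABELS = {
    "low": (lambda x: max(0.0, 1.0 - 2.0 * x), 0.0),
    "medium": (lambda x: max(0.0, 1.0 - abs(2.0 * x - 1.0)), 0.5),
    "high": (lambda x: max(0.0, 2.0 * x - 1.0), 1.0),
}


def fuzzify(score):
    """Map a crisp similarity score to a membership degree per label."""
    score = min(1.0, max(0.0, score))  # clamp to the universe of discourse
    return {name: mu(score) for name, (mu, _) in LABELS.items()}


def aggregate(scores):
    """Combine several similarity scores into one value in [0, 1].

    One rule exists per combination of labels. Its firing strength is the
    minimum of the antecedent memberships, and its (assumed) constant
    consequent is the mean of the labels' representative values. The result
    is the firing-strength-weighted average of the consequents.
    """
    if not scores:
        raise ValueError("at least one similarity score is required")
    memberships = [fuzzify(s) for s in scores]
    numerator = denominator = 0.0
    for combo in product(LABELS, repeat=len(scores)):
        strength = min(m[label] for m, label in zip(memberships, combo))
        consequent = sum(LABELS[label][1] for label in combo) / len(combo)
        numerator += strength * consequent
        denominator += strength
    return numerator / denominator


if __name__ == "__main__":
    # Scores that two hypothetical encoders might assign to one sentence pair.
    print(aggregate([0.9, 0.8]))
```

The rule base grows exponentially with the number of input scores, so this exhaustive enumeration is only practical for a handful of inputs.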


Information & Contributors

Information

Published In

Data & Knowledge Engineering, Volume 145, Issue C
May 2023
534 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. Knowledge engineering
  2. Similarity learning
  3. Semantic similarity measurement

Qualifiers

  • Research-article
