Mining for Topics to Suggest Knowledge Model Extensions

Published: 03 December 2016

Abstract

Electronic concept maps, interlinked with other concept maps and multimedia resources, can provide rich knowledge models to capture and share human knowledge. This article presents and evaluates methods to support experts as they extend existing knowledge models, by suggesting new context-relevant topics mined from Web search engines. The task of generating topics to support knowledge model extension raises two research questions: first, how to extract topic descriptors and discriminators from concept maps; and second, how to use these topic descriptors and discriminators to identify candidate topics on the Web with the right balance of novelty and relevance. To address these questions, this article first develops the theoretical framework required for a “topic suggester” to aid information search in the context of a knowledge model under construction. It then presents and evaluates algorithms based on this framework and applied in Extender, an implemented tool for topic suggestion. Extender has been developed and tested within CmapTools, a widely used system for supporting knowledge modeling using concept maps. However, the generality of the algorithms makes them applicable to a broad class of knowledge modeling systems, and to Web search in general.

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 11, Issue 2
May 2017
419 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3017677
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 December 2016
Accepted: 01 September 2016
Revised: 01 May 2016
Received: 01 April 2015
Published in TKDD Volume 11, Issue 2

Author Tags

  1. Concept mapping
  2. intelligent suggesters
  3. knowledge construction
  4. knowledge discovery
  5. web mining

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Universidad Nacional del Sur
  • NASA
  • CONICET
  • MinCyT
