SOLVENT: A Mixed Initiative System for Finding Analogies between Research Papers

Published: 01 November 2018

Abstract

Scientific discoveries are often driven by finding analogies in distant domains, but the growing number of papers makes it difficult to find relevant ideas in a single discipline, let alone distant analogies in other domains. To provide computational support for finding analogies across domains, we introduce SOLVENT, a mixed-initiative system where humans annotate aspects of research papers that denote their background (the high-level problems being addressed), purpose (the specific problems being addressed), mechanism (how they achieved their purpose), and findings (what they learned/achieved), and a computational model constructs a semantic representation from these annotations that can be used to find analogies among the research papers. We demonstrate that this system finds more analogies than baseline information-retrieval approaches; that annotators and annotations can generalize beyond domain; and that the resulting analogies found are useful to experts. These results demonstrate a novel path towards computationally supported knowledge sharing in research communities.

Supplementary Material

ZIP File (cscw031.zip)
Data and code for Study 1 and 3

Published In

Proceedings of the ACM on Human-Computer Interaction  Volume 2, Issue CSCW
November 2018
4104 pages
EISSN:2573-0142
DOI:10.1145/3290265
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. analogy
  2. computer-supported cooperative work
  3. crowdsourcing
  4. scientific discovery

Qualifiers

  • Research-article
