SOLVENT: A Mixed Initiative System for Finding Analogies between Research Papers

Authors: Joseph Chee Chang, Aniket Kittur

Proceedings of the ACM on Human-Computer Interaction, Volume 2, Issue CSCW

Article No.: 31, Pages 1 - 21

https://doi.org/10.1145/3274300

Published: 01 November 2018

Abstract

Scientific discoveries are often driven by finding analogies in distant domains, but the growing number of papers makes it difficult to find relevant ideas in a single discipline, let alone distant analogies in other domains. To provide computational support for finding analogies across domains, we introduce SOLVENT, a mixed-initiative system where humans annotate aspects of research papers that denote their background (the high-level problems being addressed), purpose (the specific problems being addressed), mechanism (how they achieved their purpose), and findings (what they learned/achieved), and a computational model constructs a semantic representation from these annotations that can be used to find analogies among the research papers. We demonstrate that this system finds more analogies than baseline information-retrieval approaches; that annotators and annotations can generalize beyond domain; and that the resulting analogies found are useful to experts. These results demonstrate a novel path towards computationally supported knowledge sharing in research communities.
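The aspect-based matching described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the computational model, so everything here is an assumption made for illustration: the `AnnotatedPaper` type, the bag-of-words representation, the cosine scoring, and the toy papers are all hypothetical and are not the authors' implementation. The sketch only shows the general idea that comparing papers on their purpose and mechanism annotations, rather than on the whole text, can surface a match from a different domain.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class AnnotatedPaper:
    """Hypothetical container for the four annotated aspects of one paper."""
    title: str
    background: str
    purpose: str
    mechanism: str
    findings: str


def bag_of_words(text: str) -> Counter:
    # Assumption: lowercase token counts stand in for a semantic representation.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def aspect_similarity(query, candidate, aspects=("purpose", "mechanism")) -> float:
    # Compare two papers using only the text of the selected aspects.
    q = bag_of_words(" ".join(getattr(query, a) for a in aspects))
    c = bag_of_words(" ".join(getattr(candidate, a) for a in aspects))
    return cosine(q, c)


def rank_candidates(query, corpus, aspects=("purpose", "mechanism")):
    # Return (score, title) pairs, highest score first.
    scored = [(aspect_similarity(query, p, aspects), p.title) for p in corpus]
    return sorted(scored, reverse=True)


# Toy data, invented for this sketch.
query = AnnotatedPaper(
    title="Insulating walls",
    background="building engineering",
    purpose="reduce heat loss in buildings",
    mechanism="layered insulating material",
    findings="lower energy use",
)
corpus = [
    AnnotatedPaper(
        title="Arctic fur",
        background="animal physiology",
        purpose="reduce heat loss in arctic animals",
        mechanism="layered fur traps air",
        findings="stable body temperature",
    ),
    AnnotatedPaper(
        title="Cost models",
        background="building engineering",
        purpose="estimate construction cost",
        mechanism="regression on past projects",
        findings="more accurate budgets",
    ),
]

by_purpose_mechanism = rank_candidates(query, corpus)
by_background = rank_candidates(query, corpus, aspects=("background",))
```

In this toy setup, ranking on purpose and mechanism places the paper from the distant domain first, while ranking on background alone favors the paper from the same field. A real system would need a far richer representation than token counts.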

Supplementary Material

cscw031.zip (ZIP file, 6.80 MB): data and code for Studies 1 and 3.

Index Terms

SOLVENT: A Mixed Initiative System for Finding Analogies between Research Papers
1. Human-centered computing
  1. Collaborative and social computing
    1. Collaborative and social computing theory, concepts and paradigms
      1. Computer supported cooperative work
2. Information systems
  1. Information retrieval
    1. Document representation
      1. Document topic models

Published In

Proceedings of the ACM on Human-Computer Interaction Volume 2, Issue CSCW

November 2018

4104 pages

EISSN:2573-0142

DOI:10.1145/3290265

Editors: Karrie Karahalios (University of Illinois & Adobe), Andrés Monroy-Hernández (Snap Inc.), Airi Lampinen (Stockholm University), Geraldine Fitzpatrick (Vienna University of Technology)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States
