1 Introduction
Analogical reasoning has been central to creative problem solving throughout the history of science and technology [32, 43, 50, 54, 60, 86]. Many important scientific discoveries were driven by analogies: the Greek philosopher Chrysippus made a connection between observable water waves and sound waves; an analogy between bacteria and slot machines helped Salvador Luria advance the theory of bacterial mutation; the pioneering chemist Joseph Priestley suggested that charges attract or repel each other with an inverse-square force by analogy to gravity.
Today the potential for finding analogies to accelerate innovation in science and engineering is greater than ever before. As of 2009, fifty million scientific articles had been published, and the number continues to grow rapidly [12, 28, 68, 85]. These articles represent a potential treasure trove of inspirations from distant domains for generating creative solutions to challenging problems.
However, searching for analogical inspirations in a large corpus of articles remains a longstanding challenge [34, 44, 83, 99]. Previous systems for retrieving analogies have largely focused on modeling analogical relations in non-scientific domains and/or in limited scopes (e.g., structure-mapping [36, 37, 38, 42, 106], multiconstraint-based [33, 59, 65], connectionist [57], and rule-based reasoning [3, 15, 16, 110] systems), and the prohibitive cost of creating highly structured representations prevented hand-crafted systems (e.g., DANE [65, 109]) from covering a broad range of topics or being deployed for realistic use. Conversely, scalable computational approaches such as keyword- or citation-based search engines have been limited by their dependence on surface or domain similarity. Such search engines aim to maximize similarity to the query, which is useful for learning what has already been done on a problem in the target domain but less useful for finding inspiration outside that domain. For example, for Salvador Luria's queries "how do bacteria mutate?" or "why are bacterial mutation rates so inconsistent?", a similarity-maximizing search engine might have found Luria and Delbrück's earlier work on E. coli [81] but would likely have failed to recognize more distant sources of inspiration, such as slot machines, as relevant.
Recently, a novel approach to analogical search was introduced [61]. In this approach, what would otherwise be a complex analogical relation between products is pared down to just two components: purpose (what problem does it solve?) and mechanism (how does it solve that problem?). Once many such purpose-mechanism pairs are identified, the system retrieves products that solve a problem similar to the query but with diverse mechanisms, helping searchers broaden their perspective on the problem and boosting their creativity in coming up with novel mechanism ideas. Anecdotal evidence suggests that this approach may also apply to the domain of scientific research. For example, while building lighter and more compact solar panel arrays has been a longstanding challenge for NASA scientists, recognizing how the ancient art form of origami may be applied to create folding structures led to an innovation that uses compliant mechanisms to build not just compact but also self-deployable solar arrays [27, 89, 118] (diagrammatically shown in Figure 1). The first remaining challenge for analogical search in the scholarly domain is how to represent scientific articles as purpose-mechanism pairs at scale and search for those that address similar purposes with different mechanisms. Recent advances in natural language processing suggest that neural networks using pre-trained embeddings to encode input text offer a promising technique for this. Pre-trained embeddings are real-valued vectors that represent tokens (tokenization breaks a piece of text into smaller units; tokens can be words, characters, sub-words, or n-grams) in a high-dimensional space (typically a few dozen to a few thousand dimensions) and have been shown to capture rich, multifaceted semantic relations between words [8, 100]. Leveraging them, neural networks may be trained to identify purposes and mechanisms in text [61, 62] to enable search-by-analogy (i.e., different mechanisms used for similar purposes). Once candidate articles are retrieved, searchers may use them to come up with novel classes of mechanisms or apply them directly to their own research problems to improve upon the current state. Prior studies in product ideation showed that users of analogical search systems could engage with the results to generate more novel and relevant ideas [21, 48, 74]. Here, we study the open questions of whether such findings generalize to scientific domains of innovation and how they may differ.
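To make the notions of tokens and embeddings above concrete, the following is a minimal, illustrative sketch (not our system's actual pipeline) of whitespace tokenization followed by embedding lookup; the toy vocabulary and random vectors stand in for pre-trained embeddings such as GloVe.

```python
import numpy as np

# Toy stand-in for a pre-trained embedding table (e.g., GloVe maps each
# token to a 300-dimensional vector; here we use 4 dimensions).
rng = np.random.default_rng(0)
vocab = {"how": 0, "do": 1, "bacteria": 2, "mutate": 3, "<unk>": 4}
embedding_table = rng.normal(size=(len(vocab), 4))

def tokenize(text: str) -> list[str]:
    """Simplest possible tokenizer: lowercase and split on whitespace.
    Real systems may instead use characters, sub-words, or n-grams."""
    return text.lower().split()

def embed(tokens: list[str]) -> np.ndarray:
    """Look up one embedding vector per token (<unk> for unseen words)."""
    ids = [vocab.get(t, vocab["<unk>"]) for t in tokens]
    return embedding_table[ids]  # shape: (num_tokens, dim)

vectors = embed(tokenize("How do bacteria mutate"))
print(vectors.shape)  # (4, 4): one vector per token
```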
In this article, we present a functioning prototype of an analogical search engine for scientific articles at scale and investigate how such a system can help users explore and adapt distant inspirations. In doing so, our system moves beyond manually curated approaches with limited data (e.g., crowdsourced annotations of \(\sim\)2,000 articles in [21]) and machine learning approaches limited to simple product descriptions [48, 61, 62]. Using this prototype, we explore how it enables scientists to interactively search for inspirations for their own research problems in a large (\(\sim\)1.7M) article corpus. We investigate whether scientists can recognize mappings of analogical relations between the results returned by our analogical search engine and their query problems, and use them to come up with novel ideas. The scale of our corpus allows us to probe realistic issues including noise, error, and scale, as well as how scientists react to a search engine that does not aim to provide only the most similar results to their query.
To accomplish these goals, we describe how we address several technical issues in the design of an interactive-speed analogical search engine: developing a machine learning model for extracting purposes and mechanisms from scientific text at token-level granularity, building a pipeline for constructing a similarity space of purpose embeddings, and enabling these embeddings to be queried at interactive speeds by end users through a search interface. We construct the similarity space by indexing semantically related purpose embeddings close to one another, so that related purposes can be retrieved at scale.
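As a concrete illustration of querying such a similarity space, the sketch below builds a nearest-neighbor index over purpose embeddings and retrieves the purposes closest to a query vector. This is a minimal sketch assuming FAISS as the index library and 512-dimensional purpose vectors (matching USE); our actual indexing pipeline may differ in its details.

```python
import numpy as np
import faiss  # assumed index library; any nearest-neighbor index works similarly

DIM = 512  # e.g., USE purpose embeddings are 512-dimensional

# Hypothetical file of precomputed purpose embeddings, one row per article.
# L2-normalizing makes inner product equal to cosine similarity.
purpose_vectors = np.load("purpose_embeddings.npy").astype("float32")
faiss.normalize_L2(purpose_vectors)

index = faiss.IndexFlatIP(DIM)  # exact inner-product index
index.add(purpose_vectors)

def search(query_vector: np.ndarray, k: int = 10):
    """Return (similarities, row ids) of the k most similar purposes."""
    q = query_vector.astype("float32").reshape(1, -1)
    faiss.normalize_L2(q)
    sims, ids = index.search(q, k)
    return sims[0], ids[0]
```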
In addition to the technical challenges, there are several important questions around the design of analogical search engines that we explore here. A core conceptual difference that distinguishes analogical search engines from other kinds is that the analogs they find for a search query need to maintain some kind of distance from the query, rather than simply maximizing similarity to it. However, only certain kinds of distance may support generative ideation, while others have a detrimental effect. Another open question is how much distance is appropriate when finding analogical inspirations in other domains. While landmark studies of analogical innovation suggest that highly distant domains can provide particularly novel or transformative innovations [46, 47, 55], recent work suggests the question may be more nuanced, and that intermediate levels of distance may be fruitful for finding ideas that are close enough to be relevant but sufficiently distant to be unfamiliar and spur creative adaptation [22, 39, 49]. To take a concrete example from one of our participants, who studied ways to facilitate heat transfer in semiconductors: a keyword search engine might find commonly used mechanisms appropriate for direct application (e.g., tweaking the composition of the material), while an analogical search engine might find similar problems in more distant domains that suggest mechanisms inspiring creative adaptation (e.g., nanoscale fins that absorb heat and convert it to mechanical energy). Though more distant conceptual combinations may not always lead to immediately feasible or useful ideas, they may yield outsized value after being iterated on [9, 23, 75].
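One simple way to operationalize "similar purpose, different mechanism" in scoring is sketched below. This formula is purely illustrative and not prescribed by our system: candidates are ranked by purpose similarity while mechanism similarity is penalized, with a hypothetical trade-off weight `lam` controlling how strongly mechanism distance is encouraged.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogical_score(query_purpose, query_mechanism,
                     doc_purpose, doc_mechanism, lam: float = 0.5) -> float:
    """Reward purpose similarity, penalize mechanism similarity.

    lam is a hypothetical knob: lam = 0 reduces to similarity-maximizing
    search; larger lam pushes results toward mechanism-distant analogs.
    """
    return cosine(query_purpose, doc_purpose) - lam * cosine(
        query_mechanism, doc_mechanism)
```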
In the following sections, we explore the technical and design challenges for an analogical search engine and how users interact with such a system. First, we describe the development of a human-in-the-loop search engine prototype, in which most elements of the system are functional but human screeners remove obvious noise from the end results, in order to maximize our ability to probe how users interact with potentially useful analogical inspirations. Using this prototype, we find that researchers searching for inspirations for their own problems gain the most benefit from articles that partially match their problem (i.e., match the high-level purpose but mismatch its lower-level specifications), and that the benefits are driven not by direct application of the ideas in an article but by creative adaptation of those ideas to the target domain. Subsequently, we describe improvements to the system that enable a fully automated, interactive-speed prototype, and case studies with researchers using the system in a realistic way, involving reformulation of their queries and self-driven attention to the results. We synthesize the findings of the two studies into design implications for next-generation analogical search engines.
Through extensive in-depth evaluations using an ideation think-aloud protocol [35, 107] with PhD-level researchers working on their own problems, we evaluate the degree to which inspirations spark creative adaptation ideas for scientists' own research problems in a realistic setting. Unlike previous work, which has often used undergraduate students in the classroom or lab [109] and evaluated systems on predetermined problems [40], this study design gives our evaluation a high degree of external validity and allows us to deeply understand the ways in which encountering our results can engender new ideas. Our final, automated search engine demonstrates how the human-in-the-loop filtering can be removed while achieving similar accuracy. We conclude with the benefits, design challenges, and opportunities for future analogical search engines drawn from case studies with several researchers. To encourage innovation in this domain, we release our corpus of purpose and mechanism embeddings.
A Reproducibility
Training and validation datasets. The original annotation dataset from [21] also includes Background and Findings annotations, which we exclude due to their relatively high confusion rates with the Purpose and Mechanism classes among annotators, and to balance the number of available training examples per annotation class.
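As a minimal illustration of this filtering step (assuming the annotations live in a flat table with a hypothetical label column; the actual dataset format may differ), keeping only the two target classes might look like:

```python
import pandas as pd

# Hypothetical layout: one annotated span per row, with a 'label' column.
annotations = pd.read_csv("annotations.csv")  # hypothetical file

# Keep only Purpose and Mechanism spans; drop Background and Findings.
kept = annotations[annotations["label"].isin(["Purpose", "Mechanism"])]
print(kept["label"].value_counts())  # check class balance after filtering
```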
Model parameter selection. We experimented with varying the model capacity relative to the signal present in the training dataset by tuning the number of hidden layers and nodes in each model architecture. For Model 1, we found a single hidden layer of 100 nodes was sufficient. We optimized this model with the cross-entropy loss and the Adam optimizer [73] with a 0.0001 learning rate. For Model 2, we found that three hidden layers of 256 nodes each led to improved accuracy on the validation set. We trained this model with an L2 regularizer (\(\alpha = 0.01\)), dropout with a rate of 0.3, and the Adam optimizer with a 0.001 learning rate.
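For reference, here is a minimal PyTorch sketch of the two classifier heads as described, assuming input dimensions of 300 (GloVe, Model 1) and 1,024 (ELMo, Model 2) and a three-way output over {Purpose, Mechanism, Other}; the output label space and optimizer details beyond those stated above are assumptions, not specifications from the paper.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 3  # assumption: {Purpose, Mechanism, Other}

# Model 1: one hidden layer of 100 nodes over 300-d GloVe inputs.
model1 = nn.Sequential(
    nn.Linear(300, 100), nn.ReLU(),
    nn.Linear(100, NUM_CLASSES),
)
opt1 = torch.optim.Adam(model1.parameters(), lr=1e-4)

# Model 2: three hidden layers of 256 nodes over 1,024-d ELMo inputs,
# dropout 0.3; weight_decay approximates the L2 regularizer (alpha = 0.01).
model2 = nn.Sequential(
    nn.Linear(1024, 256), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(256, 256), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(256, 256), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(256, NUM_CLASSES),
)
opt2 = torch.optim.Adam(model2.parameters(), lr=1e-3, weight_decay=0.01)

loss_fn = nn.CrossEntropyLoss()  # cross-entropy, as stated for Model 1
```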
Span-based model architecture. We adapt SpanRel [67] as the architecture for the span-based Model 2. SpanRel combines a boundary representation (from a BiLSTM) with a content representation computed by a self-attention mechanism that finds the core words. More specifically, given a sentence \(\mathbf{x} = [e_1, e_2, \ldots, e_n]\) of \(n\) token embeddings, a span \(s_i = [\omega_{s_i}, \omega_{s_i + 1}, \ldots, \omega_{f_i}]\) is represented as the concatenation of the content representation \(\mathbf{z}_i^c\) (a weighted average across all token embeddings in the span; SelfAttn) and the boundary representation \(\mathbf{z}_i^b\) formed from the start (\(s_i\)) and end (\(f_i\)) positions of the span:
\[
\mathbf{z}_i = [\mathbf{z}_i^c; \mathbf{z}_i^b].
\]
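The sketch below renders this span representation in PyTorch. It is a simplified illustration assuming the BiLSTM hidden states and raw token embeddings are already computed; it is not the SpanRel reference implementation.

```python
import torch
import torch.nn as nn

class SpanRepresentation(nn.Module):
    """Concatenate a self-attended content vector with BiLSTM boundary states."""

    def __init__(self, emb_dim: int):
        super().__init__()
        self.attn_scorer = nn.Linear(emb_dim, 1)  # scores each token in the span

    def forward(self, token_embs, bilstm_states, start: int, end: int):
        # token_embs:    (n, emb_dim)     raw token embeddings e_1..e_n
        # bilstm_states: (n, hidden_dim)  BiLSTM outputs for the sentence
        span_embs = token_embs[start : end + 1]        # tokens in the span
        weights = torch.softmax(self.attn_scorer(span_embs), dim=0)
        z_content = (weights * span_embs).sum(dim=0)   # SelfAttn weighted average
        z_boundary = torch.cat([bilstm_states[start], bilstm_states[end]])
        return torch.cat([z_content, z_boundary])      # z_i = [z_i^c; z_i^b]
```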
We use the contextualized ELMo 5.5B embeddings for token representation, following the near state-of-the-art performance reported on the named entity recognition task on the Wet Lab Protocol dataset in [67]. We refer to [67, 79] for further details.
Other parameters. We use 300-dimensional GloVe vectors for input feature representation for Model 1, consistent with prior work [11, 78, 88]. For Model 2, we use the contextualized ELMo 5.5B embeddings described above, which have a fixed dimensionality of 1,024. We use USE [20] to encode the extracted purposes; a USE embedding vector has a fixed dimensionality of 512.
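For reference, a minimal sketch of encoding purpose texts with USE via TensorFlow Hub follows, assuming the publicly hosted v4 module (the exact module version used here is not specified above, and the example purpose strings are illustrative):

```python
import tensorflow_hub as hub

# Load the Universal Sentence Encoder (assumed: the public v4 module).
use = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

purposes = [
    "facilitate heat transfer in semiconductors",
    "build compact and self-deployable solar arrays",
]
embeddings = use(purposes)  # shape: (2, 512), one 512-d vector per purpose
print(embeddings.shape)
```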