Measuring sentence-level and aspect-level (un)certainty in science communications

J Pei, D Jurgens - arXiv preprint arXiv:2109.14776, 2021 - arxiv.org
Certainty and uncertainty are fundamental to science communication. Hedges have widely been used as proxies for uncertainty. However, certainty is a complex construct, with authors expressing not only the degree but the type and aspects of uncertainty in order to give the reader a certain impression of what is known. Here, we introduce a new study of certainty that models both the level and the aspects of certainty in scientific findings. Using a new dataset of 2167 annotated scientific findings, we demonstrate that hedges alone account for only a partial explanation of certainty. We show that both the overall certainty and individual aspects can be predicted with pre-trained language models, providing a more complete picture of the author's intended communication. Downstream analyses on 431K scientific findings from news and scientific abstracts demonstrate that modeling sentence-level and aspect-level certainty is meaningful for areas like science communication. Both the model and datasets used in this paper are released at https://blablablab.si.umich.edu/projects/certainty/.