Paul Groth

2024

pdf bib abs
SHROOM-INDElab at SemEval-2024 Task 6: Zero- and Few-Shot LLM-Based Classification for Hallucination Detection
Bradley Allen | Fina Polat | Paul Groth
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

We describe the University of Amsterdam Intelligent Data Engineering Lab team’s entry for the SemEval-2024 Task 6 competition. The SHROOM-INDElab system builds on previous work on using prompt programming and in-context learning with large language models (LLMs) to build classifiers for hallucination detection, and extends that work through the incorporation of context-specific definition of task, role, and target concept, and automated generation of examples for use in a few-shot prompting approach. The resulting system achieved fourth-best and sixth-best performance in the model-agnostic track and model-aware tracks for Task 6, respectively, and evaluation using the validation sets showed that the system’s classification decisions were consistent with those of the crowdsourced human labelers. We further found that a zero-shot approach provided better accuracy than a few-shot approach using automatically generated examples. Code for the system described in this paper is available on Github.

2023

pdf bib
Improving Graph-to-Text Generation Using Cycle Training
Fina Polat | Ilaria Tiddi | Paul Groth | Piek Vossen
Proceedings of the 4th Conference on Language, Data and Knowledge

2022

pdf bib abs
SlotGAN: Detecting Mentions in Text via Adversarial Distant Learning
Daniel Daza | Michael Cochez | Paul Groth
Proceedings of the Sixth Workshop on Structured Prediction for NLP

We present SlotGAN, a framework for training a mention detection model that only requires unlabeled text and a gazetteer. It consists of a generator trained to extract spans from an input sentence, and a discriminator trained to determine whether a span comes from the generator, or from the gazetteer. We evaluate the method on English newswire data and compare it against supervised, weakly-supervised, and unsupervised methods. We find that the performance of the method is lower than these baselines, because it tends to generate more and longer spans, and in some cases it relies only on capitalization. In other cases, it generates spans that are valid but differ from the benchmark. When evaluated with metrics based on overlap, we find that SlotGAN performs within 95% of the precision of a supervised method, and 84% of its recall. Our results suggest that the model can generate spans that overlap well, but an additional filtering mechanism is required.

2021

pdf bib abs
SemEval-2021 Task 8: MeasEval – Extracting Counts and Measurements and their Related Contexts
Corey Harper | Jessica Cox | Curt Kohler | Antony Scerri | Ron Daniel Jr. | Paul Groth
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

We describe MeasEval, a SemEval task of extracting counts, measurements, and related context from scientific documents, which is of significant importance to the creation of Knowledge Graphs that distill information from the scientific literature. This is a new task in 2021, for which over 75 submissions from 25 participants were received. We expect the data developed for this task and the findings reported to be valuable to the scientific knowledge extraction, metrology, and automated knowledge base construction communities.

2020

pdf bib abs
Effective distributed representations for academic expert search
Mark Berger | Jakub Zavrel | Paul Groth
Proceedings of the First Workshop on Scholarly Document Processing

Expert search aims to find and rank experts based on a user’s query. In academia, retrieving experts is an efficient way to navigate through a large amount of academic knowledge. Here, we study how different distributed representations of academic papers (i.e. embeddings) impact academic expert retrieval. We use the Microsoft Academic Graph dataset and experiment with different configurations of a document-centric voting model for retrieval. In particular, we explore the impact of the use of contextualized embeddings on search performance. We also present results for paper embeddings that incorporate citation information through retrofitting. Additionally, experiments are conducted using different techniques for assigning author weights based on author order. We observe that using contextual embeddings produced by a transformer model trained for sentence similarity tasks produces the most effective paper representations for document-centric expert retrieval. However, retrofitting the paper embeddings and using elaborate author contribution weighting strategies did not improve retrieval performance.

pdf bib abs
Towards Entity Spaces
Marieke van Erp | Paul Groth
Proceedings of the Twelfth Language Resources and Evaluation Conference

Entities are a central element of knowledge bases and are important input to many knowledge-centric tasks including text analysis. For example, they allow us to find documents relevant to a specific entity irrespective of the underlying syntactic expression within a document. However, the entities that are commonly represented in knowledge bases are often a simplification of what is truly being referred to in text. For example, in a knowledge base, we may have an entity for Germany as a country but not for the more fuzzy concept of Germany that covers notions of German Population, German Drivers, and the German Government. Inspired by recent advances in contextual word embeddings, we introduce the concept of entity spaces - specific representations of a set of associated entities with near-identity. Thus, these entity spaces provide a handle to an amorphous grouping of entities. We developed a proof-of-concept for English showing how, through the introduction of entity spaces in the form of disambiguation pages, the recall of entity linking can be improved.

pdf bib abs
Towards Olfactory Information Extraction from Text: A Case Study on Detecting Smell Experiences in Novels
Ryan Brate | Paul Groth | Marieke van Erp
Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

Environmental factors determine the smells we perceive, but societal factors factors shape the importance, sentiment and biases we give to them. Descriptions of smells in text, or as we call them ‘smell experiences’, offer a window into these factors, but they must first be identified. To the best of our knowledge, no tool exists to extract references to smell experiences from text. In this paper, we present two variations on a semi-supervised approach to identify smell experiences in English literature. The combined set of patterns from both implementations offer significantly better performance than a keyword-based baseline.

2018

pdf bib abs
Open Information Extraction on Scientific Text: An Evaluation
Paul Groth | Mike Lauruhn | Antony Scerri | Ron Daniel Jr.
Proceedings of the 27th International Conference on Computational Linguistics

Open Information Extraction (OIE) is the task of the unsupervised creation of structured information from text. OIE is often used as a starting point for a number of downstream tasks including knowledge base construction, relation extraction, and question answering. While OIE methods are targeted at being domain independent, they have been evaluated primarily on newspaper, encyclopedic or general web text. In this article, we evaluate the performance of OIE on scientific texts originating from 10 different disciplines. To do so, we use two state-of-the-art OIE systems using a crowd-sourcing approach. We find that OIE systems perform significantly worse on scientific text than encyclopedic text. We also provide an error analysis and suggest areas of work to reduce errors. Our corpus of sentences and judgments are made available.