Robert Jäschke


2025

A Benchmark and Robustness Study of In-Context-Learning with Large Language Models in Music Entity Detection
Simon Hachmeier | Robert Jäschke
Proceedings of the 31st International Conference on Computational Linguistics

Detecting music entities such as song titles or artist names is useful for applications like processing music search queries or analyzing music consumption on the web. Recent approaches incorporate smaller language models (SLMs) like BERT and achieve strong results. However, further research indicates that entity exposure during pre-training strongly influences the performance of these models. Since their advent, large language models (LLMs) have outperformed SLMs in a variety of downstream tasks. However, researchers are still divided on whether this holds for tasks like entity detection in texts, due to issues like hallucination. In this paper, we provide a novel dataset of user-generated metadata and conduct a benchmark and a robustness study using recent LLMs with in-context learning (ICL). Our results indicate that LLMs in the ICL setting yield higher performance than SLMs. We further uncover the large impact of entity exposure on the best-performing LLM in our study.

2024

Information Extraction of Music Entities in Conversational Music Queries
Simon Hachmeier | Robert Jäschke
Proceedings of the 3rd Workshop on NLP for Music and Audio (NLP4MusA)

The detection of music entities such as songs or performing artists in natural language queries is an important task when designing conversational music recommendation agents. Previous research has observed the applicability of named entity recognition approaches for this task based on pre-trained encoders like BERT. In recent years, large language models (LLMs) have surpassed these encoders in a variety of downstream tasks. In this paper, we validate the use of LLMs for information extraction of music entities in conversational queries by few-shot prompting. We test different numbers of examples and compare two sampling methods to obtain few-shot examples. Our results indicate that LLMs can achieve state-of-the-art performance in the task.

Leveraging User-Generated Metadata of Online Videos for Cover Song Identification
Simon Hachmeier | Robert Jäschke
Proceedings of the 3rd Workshop on NLP for Music and Audio (NLP4MusA)

YouTube is a rich source of cover songs. Since the platform itself is organized in terms of videos rather than songs, the retrieval of covers is not trivial. The field of cover song identification addresses this problem and provides approaches that usually rely on audio content. However, including the user-generated video metadata available on YouTube promises improved identification results. In this paper, we propose a multi-modal approach for cover song identification on online video platforms. We combine entity resolution models with audio-based approaches using a ranking model. Our findings indicate that leveraging user-generated metadata can stabilize cover song identification performance on YouTube.

2023

“Japan’s Answer to Mozart”: Automatic Detection of Generalized Patterns of Vossian Antonomasia
Michel Schwab | Robert Jäschke | Frank Fischer
Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023)

“Who is the Madonna of Italian-American Literature?”: Target Entity Extraction and Analysis of Vossian Antonomasia
Michel Schwab | Robert Jäschke | Frank Fischer
Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

In this paper, we present approaches for the automated extraction and disambiguation of a part of the stylistic device Vossian Antonomasia (VA), namely the target entity that is described by the expression. We model the problem as a coreference resolution task and a question answering task and also combine both tasks. To tackle these tasks, we utilize state-of-the-art models in these areas. In addition, we visualize the connection between the source and target entities of VA in a web demo to get a deeper understanding of the interaction of entities used in VA expressions.

2022

“Der Frank Sinatra der Wettervorhersage”: Cross-Lingual Vossian Antonomasia Extraction
Michel Schwab | Robert Jäschke | Frank Fischer
Proceedings of the 5th International Conference on Natural Language and Speech Processing (ICNLSP 2022)

2021

Lotte and Annette: A Framework for Finding and Exploring Key Passages in Literary Works
Frederik Arnold | Robert Jäschke
Proceedings of the Workshop on Natural Language Processing for Digital Humanities

We present an approach that leverages expert knowledge contained in scholarly works to automatically identify key passages in literary works. Specifically, we extend a text reuse detection method for finding quotations, such that our system Lotte can deal with common properties of quotations, for example, ellipses or inaccurate quotations. An evaluation shows that Lotte outperforms four existing approaches. To generate key passages, we combine overlapping quotations from multiple scholarly texts. An interactive website, called Annette, for visualizing and exploring key passages makes the results accessible and explorable.

2019

“A Buster Keaton of Linguistics”: First Automated Approaches for the Extraction of Vossian Antonomasia
Michel Schwab | Robert Jäschke | Frank Fischer | Jannik Strötgen
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Attributing a particular property to a person by naming another person, who is typically well-known for the respective property, is called a Vossian Antonomasia (VA). This subtype of metonymy, which overlaps with metaphor, has a specific syntax and is especially frequent in journalistic texts. While identifying Vossian Antonomasia is of particular interest in the study of stylistics, it is also a source of errors in relation and fact extraction, as an explicitly mentioned entity occurs only metaphorically and should not be associated with the respective contexts. Despite rather simple syntactic variations, the automatic extraction of VA has not yet been addressed, since it requires a deeper semantic understanding of mentioned entities and underlying relations. In this paper, we propose a first method for the extraction of VAs that works completely automatically. Our approaches use named entity recognition, distant supervision based on Wikidata, and a bi-directional LSTM for postprocessing. The evaluation on 1.8 million articles of the New York Times corpus shows that our approach significantly outperforms the only existing semi-automatic approach for VA identification by more than 30 percentage points in precision.

2015

Semantic Annotation for Microblog Topics Using Wikipedia Temporal Information
Tuan Tran | Nam Khanh Tran | Asmelash Teka Hadgu | Robert Jäschke
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing