Search | arXiv e-print repository

Gate-defined flat-band charge carrier confinement in twisted bilayer graphene

Authors: Alexander Rothstein, Ammon Fischer, Anthony Achtermann, Eike Icking, Katrin Hecker, Luca Banszerus, Martin Otto, Stefan Trellenkamp, Florian Lentz, Kenji Watanabe, Takashi Taniguchi, Bernd Beschoten, Robin J. Dolleman, Dante M. Kennes, Christoph Stampfer

Abstract: Twisted bilayer graphene (tBLG) near the magic angle is an interesting platform to study correlated electronic phases. These phases are gate-tunable and are closely related to the presence of flat electronic bands, isolated by single-particle band gaps. This allows electrostatically controlled confinement of charge carriers in the flat bands to explore the interplay between confinement, band renormalisation, electron-electron interactions and the moiré superlattice, potentially revealing key mechanisms underlying these electronic phases. Here, we show gate-controlled flat-band charge carrier confinement in near-magic-angle tBLG, resulting in well-tunable Coulomb blockade resonances arising from the charging of electrostatically defined islands in tBLG. Coulomb resonance measurements allow the study of magnetic field-induced quantum oscillations in the density of states of the source-drain reservoirs, providing insight into the gate-tunable Fermi surfaces of tBLG. Comparison with tight-binding calculations emphasises the importance of displacement-field-induced band renormalisation, which is crucial for future advanced gate-tunable quantum devices and circuits in tBLG.

Submitted 12 September, 2024; originally announced September 2024.

Comments: 25 pages, 14 figures

arXiv:2409.01813 [pdf, other]

Reassessing Noise Augmentation Methods in the Context of Adversarial Speech

Authors: Karla Pizzi, Matías P. Pizarro B, Asja Fischer

Abstract: In this study, we investigate if noise-augmented training can concurrently improve adversarial robustness in automatic speech recognition (ASR) systems. We conduct a comparative analysis of the adversarial robustness of four different state-of-the-art ASR architectures, where each of the ASR architectures is trained under three different augmentation conditions: one subject to background noise, speed variations, and reverberations, another subject to speed variations only, and a third without any form of data augmentation. The results demonstrate that noise augmentation not only improves model performance on noisy speech but also the model's robustness to adversarial attacks.

Submitted 3 September, 2024; originally announced September 2024.

arXiv:2409.01259 [pdf, other]

Non-local redundancy: Erasure coding and dispersed replicas for robust retrieval in the Swarm peer-to-peer network

Authors: Viktor Trón, Viktor Tóth, Callum Toner, Dan Nickless, Dániel A. Nagy, Áron Fischer, György Barabás

Abstract: This paper describes in detail how erasure codes are implemented in the Swarm system. First, in Section 1, we introduce erasure codes, and show how to apply them to files in Swarm (Section 2). In Section 3, we introduce security levels of data availability and derive their respective parameterisations. In Section 4, we describe a construct that enables cross-neighbourhood redundancy for singleton chunks and which completes erasure coding. Finally, in Section 5, we propose a number of retrieval strategies applicable to erasure-coded files.

Submitted 2 September, 2024; originally announced September 2024.

Comments: 14 pages, 7 figures, 4 tables
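
As a rough illustration of the erasure-coding idea named in the abstract above, the following sketch uses a single XOR parity chunk, which is the simplest possible erasure code. It is not Swarm's scheme, since this listing does not say which code Swarm uses; the function names `add_parity` and `recover` are hypothetical, and all chunks are assumed to have equal length.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def add_parity(chunks: List[bytes]) -> List[bytes]:
    """Return the data chunks followed by one parity chunk (XOR of all data)."""
    return chunks + [reduce(xor_bytes, chunks)]


def recover(coded: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing chunk (marked None); return the data chunks."""
    missing = [i for i, chunk in enumerate(coded) if chunk is None]
    if len(missing) > 1:
        raise ValueError("one parity chunk can repair only one erasure")
    repaired = list(coded)
    if missing:
        present = [chunk for chunk in coded if chunk is not None]
        # XOR of all surviving chunks equals the lost chunk.
        repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired[:-1]  # drop the trailing parity chunk
```

With one parity chunk, any single lost chunk can be rebuilt from the survivors; tolerating more erasures requires a stronger code than this sketch.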

arXiv:2408.10021 [pdf, other]

Detecting Adversarial Attacks in Semantic Segmentation via Uncertainty Estimation: A Deep Analysis

Authors: Kira Maag, Roman Resner, Asja Fischer

Abstract: Deep neural networks have demonstrated remarkable effectiveness across a wide range of tasks such as semantic segmentation. Nevertheless, these networks are vulnerable to adversarial attacks that add imperceptible perturbations to the input image, leading to false predictions. This vulnerability is particularly dangerous in safety-critical applications like automated driving. While adversarial examples and defense strategies are well-researched in the context of image classification, there is comparatively less research focused on semantic segmentation. Recently, we have proposed an uncertainty-based method for detecting adversarial attacks on neural networks for semantic segmentation. We observed that uncertainty, as measured by the entropy of the output distribution, behaves differently on clean versus adversely perturbed images, and we utilize this property to differentiate between the two. In this extended version of our work, we conduct a detailed analysis of uncertainty-based detection of adversarial attacks including a diverse set of adversarial attacks and various state-of-the-art neural networks. Our numerical experiments show the effectiveness of the proposed uncertainty-based detection method, which is lightweight and operates as a post-processing step, i.e., no model modifications or knowledge of the adversarial example generation process are required.

Submitted 19 August, 2024; originally announced August 2024.
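
The entropy-based uncertainty measure mentioned in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the averaging over pixels, and the threshold rule in `looks_adversarial` (including the direction of the comparison) are all assumptions made for the example.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one pixel's class logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_entropy(probs: List[float]) -> float:
    """Shannon entropy (in nats) of one pixel's class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_entropy(logit_map: List[List[float]]) -> float:
    """Average entropy over all pixels; each row holds one pixel's logits."""
    return sum(pixel_entropy(softmax(px)) for px in logit_map) / len(logit_map)


def looks_adversarial(logit_map: List[List[float]], threshold: float) -> bool:
    # Hypothetical decision rule: flag the image when the mean entropy
    # exceeds a threshold that would have to be calibrated on clean data.
    return mean_entropy(logit_map) > threshold
```

Because the score is computed from the network's outputs alone, a detector of this kind runs as a post-processing step, which matches the "no model modifications" property described in the abstract.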

arXiv:2408.07412 [pdf]

Lateral Mn5Ge3 spin-valve in contact with a high-mobility Ge two-dimensional hole gas

Authors: David Weißhaupt, Christoph Sürgers, Dominik Bloos, Hannes Simon Funk, Michael Oehme, Gerda Fischer, Markus Andreas Schubert, Christian Wenger, Joris van Slageren, Inga Anita Fischer, Jörg Schulze

Abstract: Ge two-dimensional hole gases in strained modulation-doped quantum-wells represent a promising material platform for future spintronic applications due to their excellent spin transport properties and the theoretical possibility of efficient spin manipulation. Due to the continuous development of epitaxial growth recipes, extremely high hole mobilities and low effective masses can be achieved, promising an efficient spin transport. Furthermore, the Ge two-dimensional hole gas (2DHG) can be integrated in the well-established industrial complementary metal-oxide-semiconductor (CMOS) device technology. However, efficient electrical spin injection into a Ge 2DHG - a prerequisite for the realization of spintronic devices - has not yet been demonstrated. In this work, we report the fabrication and low-temperature magnetoresistance measurements of a laterally structured Mn5Ge3/Ge 2DHG/Mn5Ge3 device. The ferromagnetic Mn5Ge3 contacts are grown directly into the Ge quantum well by means of an interdiffusion process with a spacing of approximately 130 nm. We observe a magnetoresistance signal for temperatures below 13 K possibly arising from successful spin injection. The results represent a step forward toward the realization of CMOS compatible spintronic devices based on a 2DHG.

Submitted 14 August, 2024; originally announced August 2024.

Comments: 6 figures

arXiv:2407.16422 [pdf, ps, other]

A new problem qualification based on approximate KKT conditions for Lipschitzian optimization with application to bilevel programming

Authors: Isabella Käming, Andreas Fischer, Alain B. Zemkoho

Abstract: When dealing with general Lipschitzian optimization problems, there are many problem classes where standard constraint qualifications fail at local minimizers. In contrast to a constraint qualification, a problem qualification does not only rely on the constraints but also on the objective function to guarantee that a local minimizer is a Karush-Kuhn-Tucker (KKT) point. For example, calmness in the sense of Clarke is a problem qualification. In this article, we introduce the Subset Mangasarian-Fromovitz Condition (subMFC). This new problem qualification is based on a nonsmooth version of the approximate KKT conditions, which hold at every local minimizer without further assumptions. A comparison with existing constraint qualifications and problem qualifications for the given problem class reveals that subMFC is strictly weaker than quasinormality and can hold even if the local error bound condition, the cone-continuity property, Guignard's constraint qualification and calmness in the sense of Clarke are violated. Further, we emphasize the power of the new problem qualification within the context of bilevel optimization. More precisely, under mild assumptions on the problem data, we suggest a version of subMFC that is tailored to the lower-level value function reformulation. It turns out that this new condition can be satisfied even if the widely used partial calmness condition does not hold.

Submitted 23 July, 2024; originally announced July 2024.

MSC Class: 90C30; 49J52; 90C46; 65K05

arXiv:2407.15558 [pdf, other]

Retinomorphic Machine Vision in a Network Laser

Authors: Wai Kit Ng, Jakub Dranczewski, Anna Fischer, T V Raziman, Dhruv Saxena, Tobias Farchy, Kilian Stenning, Jonathan Peters, Heinz Schmid, Will R Branford, Mauricio Barahona, Kirsten Moselund, Riccardo Sapienza, Jack C. Gartside

Abstract: With the growing prevalence of AI, demand increases for efficient machine learning hardware. Physical systems are sought which combine image feature detection with the essential nonlinearity for tasks such as image classification. Existing physical hardware typically detects features linearly, then employs digital processing for nonlinear activation. Biological vision systems excel at nonlinear image processing. The retina detects features in ganglion cells via lateral inhibition, where cells nonlinearly compete for neuronal firing while suppressing neighbouring cells. We present a bio-inspired 'retinomorphic' machine vision platform using an on-chip semiconductor network laser. The system detects multiple features in parallel via spatially-overlapping lasing modes, with integrated nonlinearity provided by antagonistic gain competition between modes - a photonic analogue of retinal inhibition. Parallel feature-detection enhances efficiency relative to feature-detection schemes which operate sequentially or via multiple device copies, with Si-compatible processing and a compact micron-scale footprint relative to existing mm-scale systems. We report 98.05% accuracy on MNIST-digits and 87.85% on Fashion-MNIST, with strong performance on short training datasets.

Submitted 2 August, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

arXiv:2407.07967 [pdf, other]

The Lowell Observatory Solar Telescope: A fiber feed into the EXtreme PREcision Spectrometer

Authors: Joe Llama, Lily L. Zhao, John M. Brewer, Andrew Szymkowiak, Debra A. Fischer, Michael Collins, Jake Tiegs, Frank Cornelius

Abstract: The signal induced by a temperate, terrestrial planet orbiting a Sun-like star is an order of magnitude smaller than the host star's intrinsic variability. Understanding stellar activity is, therefore, a fundamental obstacle in confirming the smallest exoplanets. We present the Lowell Observatory Solar Telescope (LOST), a solar feed for the EXtreme PREcision Spectrometer (EXPRES) at the 4.3-m Lowell Discovery Telescope (LDT). EXPRES is one of the newest high-resolution spectrographs that accurately measure extreme radial velocity. With LOST/EXPRES, we observe disk-integrated sunlight autonomously throughout the day. In clear conditions, we achieve an R ~ 137,500 optical spectrum of the Sun with a signal-to-noise of 500 in ~150 s. Data are reduced using the standard EXPRES pipeline with minimal modification to ensure the data are comparable to the observations of other stars with the LDT. During the first three years of operation, we find a daily RMS of 71 cm/s. Additionally, having two EPRV spectrometers located in Arizona gives us an unprecedented opportunity to benchmark the performance of these planet-finders. We find an RMS of just 55 cm/s when comparing data taken simultaneously with EXPRES and NEID.

Submitted 10 July, 2024; originally announced July 2024.

Comments: SPIE Astronomical Telescopes & Instrumentation proceedings paper

arXiv:2407.06178 [pdf, other]

Transfer Learning with Self-Supervised Vision Transformers for Snake Identification

Authors: Anthony Miyaguchi, Murilo Gustineli, Austin Fischer, Ryan Lundqvist

Abstract: We present our approach for the SnakeCLEF 2024 competition to predict snake species from images. We explore and use Meta's DINOv2 vision transformer model for feature extraction to tackle species' high variability and visual similarity in a dataset of 182,261 images. We perform exploratory analysis on embeddings to understand their structure, and train a linear classifier on the embeddings to predict species. Despite achieving a score of 39.69, our results show promise for DINOv2 embeddings in snake identification. All code for this project is available at https://github.com/dsgt-kaggle-clef/snakeclef-2024.

Submitted 8 July, 2024; originally announced July 2024.

Comments: Paper submitted to CLEF 2024 CEUR-WS

arXiv:2407.04062 [pdf, other]

Single-mode emission by phase-delayed coupling between nano-lasers

Authors: T. V. Raziman, Anna Fischer, Riccardo Nori, Anthony Chan, Wai Kit Ng, Dhruv Saxena, Ortwin Hess, Korneel Molkens, Ivo Tanghe, Pieter Geiregat, Dries Van Thourhout, Mauricio Barahona, Riccardo Sapienza

Abstract: Near-field coupling between nanolasers enables collective high-power lasing but leads to complex spectral reshaping and multimode operation, limiting the emission brightness, spatial coherence and temporal stability. Many lasing architectures have been proposed to circumvent this limitation, based on symmetries, topology, or interference. We show that a much simpler and robust method exploiting phase-delayed coupling, where light exchanged by the lasers carries a phase, can enable stable single-mode operation. Phase-delayed coupling changes the modal amplification: for pump powers close to the anyonic parity-time (PT) symmetric exceptional point, a high phase delay completely separates the mode thresholds, leading to single mode operation. This is shown by stability analysis with nonlinear coupled mode theory and stochastic differential equations for two coupled nanolasers and confirmed by realistic semi-analytical treatment of a dimer of lasing nanospheres. Finally, we extend the mode control to large arrays of nanolasers, featuring lowered thresholds and higher power. Our work promises a novel solution to engineer bright and stable single-mode lasing from nanolaser arrays with important applications in photonic chips for communication and lidars.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.02576 [pdf, other]

Supercell Wannier functions and faithful low-energy model for Bernal bilayer graphene

Authors: Ammon Fischer, Lennart Klebl, Dante M. Kennes, Tim O. Wehling

Abstract: We derive a minimal low-energy model for Bernal bilayer graphene and related rhombohedral graphene multilayers at low electronic densities by constructing Wannier orbitals defined in real-space supercells of the original primitive cell. Starting from an ab-initio electronic structure theory comprising the atomic carbon $p_z$-orbitals, momentum locality of the Fermi surface pockets around $K,K'$ is circumvented by backfolding the $π$-bands to the concomitant mini-Brillouin zone of the supercell, reminiscent of their (twisted) moiré counterparts. The supercell Wannier functions reproduce the spectral weight and Berry curvature of the microscopic model and offer an intuitive real-space picture of the emergent physics at low electronic densities being shaped by flavor-polarized wave packets with mesoscopic extent. By projecting an orbital-resolved, dual-gated Coulomb interaction to the effective Wannier basis, we find that the low-energy physics of Bernal bilayer graphene is governed by weak electron-electron interactions. Our study bridges between existing continuum theories and ab-initio studies of small Fermi pocket systems like rhombohedral graphene stacks by providing a symmetric lattice description of their low-energy physics.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 7 pages, 3 figures, Supplementary Material

arXiv:2407.02299 [pdf, other]

Stein's Method of Moments on the Sphere

Authors: Adrian Fischer, Robert E. Gaunt, Yvik Swan

Abstract: We use Stein characterizations to obtain new moment-type estimators for the parameters of three classical spherical distributions (namely the Fisher-Bingham, the von Mises-Fisher, and the Watson distributions) in the i.i.d. case. This leads to explicit estimators which have good asymptotic properties (close to efficiency) and therefore lead to interesting alternatives to classical maximum likelihood methods or more recent score matching estimators. We perform competitive simulation studies to assess the quality of the new estimators. Finally, the practical relevance of our estimators is illustrated on a real data application in spherical latent representations of handwritten numbers.

Submitted 15 June, 2024; originally announced July 2024.

arXiv:2406.16300 [pdf, other]

Landscaping Linear Mode Connectivity

Authors: Sidak Pal Singh, Linara Adilova, Michael Kamp, Asja Fischer, Bernhard Schölkopf, Thomas Hofmann

Abstract: The presence of linear paths in parameter space between two different network solutions in certain cases, i.e., linear mode connectivity (LMC), has garnered interest from both theoretical and practical fronts. There has been significant research that either practically designs algorithms catered for connecting networks by adjusting for the permutation symmetries as well as some others that more theoretically construct paths through which networks can be connected. Yet, the core reasons for the occurrence of LMC, when in fact it does occur, in the highly non-convex loss landscapes of neural networks are far from clear. In this work, we take a step towards understanding it by providing a model of how the loss landscape needs to behave topographically for LMC (or the lack thereof) to manifest. Concretely, we present a 'mountainside and ridge' perspective that helps to neatly tie together different geometric features that can be spotted in the loss landscape along the training runs. We also complement this perspective by providing a theoretical analysis of the barrier height, for which we provide empirical support, and which additionally extends as a faithful predictor of layer-wise LMC. We close with a toy example that provides further intuition on how barriers arise in the first place, all in all, showcasing the larger aim of the work -- to provide a working model of the landscape and its topography for the occurrence of LMC.

Submitted 23 June, 2024; originally announced June 2024.

Comments: ICML 2024 HiLD workshop paper
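
The barrier height mentioned in the abstract above is not defined in this listing. The sketch below assumes one common definition from the linear mode connectivity literature: the largest amount by which the loss along the straight line between two solutions exceeds the linear interpolation of the endpoint losses. The one-dimensional `toy_loss` is a hypothetical stand-in for a network's loss and is not the paper's toy example.

```python
from typing import Callable, List


def interpolate(a: List[float], b: List[float], t: float) -> List[float]:
    """Point (1 - t) * a + t * b on the straight line between two solutions."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def barrier_height(
    loss: Callable[[List[float]], float],
    theta_a: List[float],
    theta_b: List[float],
    steps: int = 101,
) -> float:
    """Largest excess of the loss on the path over the interpolated endpoint losses."""
    loss_a, loss_b = loss(theta_a), loss(theta_b)
    best = 0.0
    for i in range(steps):
        t = i / (steps - 1)
        on_path = loss(interpolate(theta_a, theta_b, t))
        baseline = (1.0 - t) * loss_a + t * loss_b
        best = max(best, on_path - baseline)
    return best


def toy_loss(theta: List[float]) -> float:
    # Hypothetical double-well loss with minima at w = -1 and w = +1.
    w = theta[0]
    return (w * w - 1.0) ** 2
```

Under this definition, two solutions count as linearly mode connected when the barrier is zero or negligibly small; for the double-well toy loss, the two minima are separated by a barrier at w = 0.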

arXiv:2406.02868 [pdf, other]

Bayesian Adaptive Trials for Social Policy

Authors: Sally Cripps, Anna Lopatnikova, Hadi Mohasel Afshar, Ben Gales, Roman Marchant, Gilad Francis, Catarina Moreira, Alex Fischer

Abstract: This paper proposes Bayesian Adaptive Trials (BAT) as both an efficient method to conduct trials and a unifying framework for evaluating social policy interventions, addressing limitations inherent in traditional methods such as Randomized Controlled Trials (RCT). Recognizing the crucial need for evidence-based approaches in public policy, the proposal aims to lower barriers to the adoption of evidence-based methods and align evaluation processes more closely with the dynamic nature of policy cycles. BATs, grounded in decision theory, offer a dynamic, "learning as we go" approach, enabling the integration of diverse information types and facilitating a continuous, iterative process of policy evaluation. BATs' adaptive nature is particularly advantageous in policy settings, allowing for more timely and context-sensitive decisions. Moreover, BATs' ability to value potential future information sources positions it as an optimal strategy for sequential data acquisition during policy implementation. While acknowledging the assumptions and models intrinsic to BATs, such as prior distributions and likelihood functions, the paper argues that these are advantageous for decision-makers in social policy, effectively merging the best features of various methodologies.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.14529 [pdf, other]

AnomalyDINO: Boosting Patch-based Few-shot Anomaly Detection with DINOv2

Authors: Simon Damm, Mike Laszkiewicz, Johannes Lederer, Asja Fischer

Abstract: Recent advances in multimodal foundation models have set new standards in few-shot anomaly detection. This paper explores whether high-quality visual features alone are sufficient to rival existing state-of-the-art vision-language models. We affirm this by adapting DINOv2 for one-shot and few-shot anomaly detection, with a focus on industrial applications. We show that this approach does not only rival existing techniques but can even outmatch them in many settings. Our proposed vision-only approach, AnomalyDINO, is based on patch similarities and enables both image-level anomaly prediction and pixel-level anomaly segmentation. The approach is methodologically simple and training-free and, thus, does not require any additional data for fine-tuning or meta-learning. Despite its simplicity, AnomalyDINO achieves state-of-the-art results in one- and few-shot anomaly detection (e.g., pushing the one-shot performance on MVTec-AD from an AUROC of 93.1% to 96.6%). The reduced overhead, coupled with its outstanding few-shot performance, makes AnomalyDINO a strong candidate for fast deployment, e.g., in industrial contexts.

Submitted 12 September, 2024; v1 submitted 23 May, 2024; originally announced May 2024.
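
A training-free detector "based on patch similarities", as the abstract above puts it, can be sketched with a nearest-neighbour comparison between patch features. This is an illustrative assumption and not the AnomalyDINO implementation: the cosine distance, the use of the maximum patch score as the image-level score, and all function names are hypothetical, and the feature vectors are assumed to be non-zero and already extracted by some backbone.

```python
import math
from typing import List

Vector = List[float]


def cosine_distance(u: Vector, v: Vector) -> float:
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def patch_scores(test_patches: List[Vector], reference_patches: List[Vector]) -> List[float]:
    """Score each test patch by its distance to the nearest reference patch."""
    return [
        min(cosine_distance(patch, ref) for ref in reference_patches)
        for patch in test_patches
    ]


def image_score(test_patches: List[Vector], reference_patches: List[Vector]) -> float:
    # Assumed aggregation: the most anomalous patch decides the image-level score.
    return max(patch_scores(test_patches, reference_patches))
```

The per-patch scores give a coarse anomaly map for segmentation, while the aggregated score gives an image-level prediction; no parameters are trained, only reference features from a few normal images are stored.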

arXiv:2404.16442 [pdf, other]

Contextual Categorization Enhancement through LLMs Latent-Space

Authors: Zineddine Bettouche, Anas Safi, Andreas Fischer

Abstract: Managing the semantic quality of the categorization in large textual datasets, such as Wikipedia, presents significant challenges in terms of complexity and cost. In this paper, we propose leveraging transformer models to distill semantic information from texts in the Wikipedia dataset and its associated categories into a latent space. We then explore different approaches based on these encodings to assess and enhance the semantic identity of the categories. Our graphical approach is powered by Convex Hull, while we utilize Hierarchical Navigable Small Worlds (HNSWs) for the hierarchical approach. As a solution to the information loss caused by the dimensionality reduction, we modulate the following mathematical solution: an exponential decay function driven by the Euclidean distances between the high-dimensional encodings of the textual categories. This function represents a filter built around a contextual category and retrieves items with a certain Reconsideration Probability (RP). Retrieving high-RP items serves as a tool for database administrators to improve data groupings by providing recommendations and identifying outliers within a contextual framework.

Submitted 25 April, 2024; originally announced April 2024.

Journal ref: Fifteenth International Conference on Computational Logics, Algebras, Programming, Tools, and Benchmarking (COMPUTATION TOOLS 2024), ISSN: 2308-4170
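
The abstract above describes an exponential decay function driven by Euclidean distances without giving its exact form. A minimal sketch, assuming the form RP = exp(-decay_rate * distance), could look like this; the `decay_rate` parameter, the `min_rp` cut-off, and the function names are hypothetical choices for illustration.

```python
import math
from typing import Dict, List, Sequence


def reconsideration_probability(
    item: Sequence[float], category: Sequence[float], decay_rate: float = 1.0
) -> float:
    """Assumed form: RP decays exponentially with Euclidean distance."""
    return math.exp(-decay_rate * math.dist(item, category))


def retrieve(
    items: Dict[str, Sequence[float]],
    category: Sequence[float],
    min_rp: float,
    decay_rate: float = 1.0,
) -> List[str]:
    """Names of the items whose RP around the category is at least min_rp."""
    return [
        name
        for name, encoding in items.items()
        if reconsideration_probability(encoding, category, decay_rate) >= min_rp
    ]
```

An item whose encoding coincides with the category encoding gets RP = 1, and RP falls towards 0 as the distance grows, so the cut-off acts as a soft radius around the contextual category.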

arXiv:2404.14244 [pdf, other]

AI-Generated Faces in the Real World: A Large-Scale Case Study of Twitter Profile Images

Authors: Jonas Ricker, Dennis Assenmacher, Thorsten Holz, Asja Fischer, Erwin Quiring

Abstract: Recent advances in the field of generative artificial intelligence (AI) have blurred the lines between authentic and machine-generated content, making it almost impossible for humans to distinguish between such media. One notable consequence is the use of AI-generated images for fake profiles on social media. While several types of disinformation campaigns and similar incidents have been reported in the past, a systematic analysis has been lacking. In this work, we conduct the first large-scale investigation of the prevalence of AI-generated profile pictures on Twitter. We tackle the challenges of a real-world measurement study by carefully integrating various data sources and designing a multi-stage detection pipeline. Our analysis of nearly 15 million Twitter profile pictures shows that 0.052% were artificially generated, confirming their notable presence on the platform. We comprehensively examine the characteristics of these accounts and their tweet content, and uncover patterns of coordinated inauthentic behavior. The results also reveal several motives, including spamming and political amplification campaigns. Our research reaffirms the need for effective detection and mitigation strategies to cope with the potential negative effects of generative AI in the future.

Submitted 6 August, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

Comments: Accepted to RAID 2024

arXiv:2403.17840 [pdf, other]

doi 10.1103/PhysRevB.109.155430

Negative electronic compressibility in charge islands in twisted bilayer graphene

Authors: Robin J. Dolleman, Alexander Rothstein, Ammon Fischer, Lennart Klebl, Lutz Waldecker, Kenji Watanabe, Takashi Taniguchi, Dante M. Kennes, Florian Libisch, Bernd Beschoten, Christoph Stampfer

Abstract: We report on the observation of negative electronic compressibility in twisted bilayer graphene for Fermi energies close to insulating states. To observe this negative compressibility, we take advantage of naturally occurring twist angle domains that emerge during the fabrication of the samples, leading to the formation of charge islands. We accurately measure their capacitance using Coulomb oscillations, from which we infer the compressibility of the electron gas. Notably, we not only observe the negative electronic compressibility near correlated insulating states at integer filling, but also prominently near the band insulating state at full filling, located at the edges of both the flat- and remote bands. Furthermore, the individual twist angle domains yield a well-defined carrier density, enabling us to quantify the strength of electronic interactions and verify the theoretical prediction that the inverse negative capacitance contribution is proportional to the average distance between the charge carriers. A detailed analysis of our findings suggests that Wigner crystallization is the most likely explanation for the observed negative electronic compressibility.

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.06939 [pdf, other]

Surface lattice resonance lasers with epitaxial InP gain medium

Authors: Anna Fischer, Toby Severs Millard, Xiaofei Xiao, T. V. Raziman, Jakub Dranczewski, Ross C. Schofield, Heinz Schmid, Kirsten Moselund, Riccardo Sapienza, Rupert Oulton

Abstract: Surface lattice resonance (SLR) lasers, where gain is supplied by a thin film active material and the feedback comes from multiple scattering by plasmonic nanoparticles, have shown both low threshold lasing and tunability of the angular and spectral emission. However, typically used materials such as organic dyes and QD films suffer from photo-degradation which hampers practical applications. Here, we demonstrate photo-stable single-mode lasing of SLR modes sustained in an epitaxial solid-state InP slab waveguide. The nanoparticle array is weakly coupled to the optical modes, which decreases the scattering losses and hence the experimental lasing threshold is as low as 90 $μ$J/cm$^{2}$. The nanoparticle periodicity defines the lasing wavelength and enables tuneable emission wavelengths over a 70 nm spectral range. Combining plasmonic nanoparticles with an epitaxial solid-state gain medium paves the way for large-area on-chip integrated SLR lasers for applications including optical communication, optical computing, sensing, and LiDAR.

Submitted 11 March, 2024; originally announced March 2024.

arXiv:2403.00025 [pdf, ps, other]

On the Challenges and Opportunities in Generative AI

Authors: Laura Manduchi, Kushagra Pandey, Robert Bamler, Ryan Cotterell, Sina Däubener, Sophie Fellenz, Asja Fischer, Thomas Gärtner, Matthias Kirchler, Marius Kloft, Yingzhen Li, Christoph Lippert, Gerard de Melo, Eric Nalisnick, Björn Ommer, Rajesh Ranganath, Maja Rudolph, Karen Ullrich, Guy Van den Broeck, Julia E Vogt, Yixin Wang, Florian Wenzel, Frank Wood, Stephan Mandt, Vincent Fortuin

Abstract: The field of deep generative modeling has grown rapidly and consistently over the years. With the availability of massive amounts of training data coupled with advances in scalable unsupervised learning paradigms, recent large-scale generative models show tremendous promise in synthesizing high-resolution images and text, as well as structured data such as videos and molecules. However, we argue that current large-scale generative AI models do not sufficiently address several fundamental issues that hinder their widespread adoption across domains. In this work, we aim to identify key unresolved challenges in modern generative AI paradigms that should be tackled to further enhance their capabilities, versatility, and reliability. By identifying these challenges, we aim to provide researchers with valuable insights for exploring fruitful research directions, thereby fostering the development of more robust and accessible generative AI solutions.

Submitted 28 February, 2024; originally announced March 2024.

arXiv:2402.13404 [pdf, other]

Layout-to-Image Generation with Localized Descriptions using ControlNet with Cross-Attention Control

Authors: Denis Lukovnikov, Asja Fischer

Abstract: While text-to-image diffusion models can generate high-quality images from textual descriptions, they generally lack fine-grained control over the visual composition of the generated images. Some recent works tackle this problem by training the model to condition the generation process on additional input describing the desired image layout. Arguably the most popular among such methods, ControlNet, enables a high degree of control over the generated image using various types of conditioning inputs (e.g. segmentation maps). However, it still lacks the ability to take into account localized textual descriptions that indicate which image region is described by which phrase in the prompt. In this work, we show the limitations of ControlNet for the layout-to-image task and enable it to use localized descriptions using a training-free approach that modifies the cross-attention scores during generation. We adapt and investigate several existing cross-attention control methods in the context of ControlNet and identify shortcomings that cause failure (concept bleeding) or image degradation under specific conditions. To address these shortcomings, we develop a novel cross-attention manipulation method in order to maintain image quality while improving control. Qualitative and quantitative experimental studies focusing on challenging cases are presented, demonstrating the effectiveness of the investigated general approach, and showing the improvements obtained by the proposed cross-attention control method.

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2401.17879 [pdf, other]

AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error

Authors: Jonas Ricker, Denis Lukovnikov, Asja Fischer

Abstract: With recent text-to-image models, anyone can generate deceptively realistic images with arbitrary contents, fueling the growing threat of visual disinformation. A key enabler for generating high-resolution images with low computational cost has been the development of latent diffusion models (LDMs). In contrast to conventional diffusion models, LDMs perform the denoising process in the low-dimensional latent space of a pre-trained autoencoder (AE) instead of the high-dimensional image space. Despite their relevance, the forensic analysis of LDMs is still in its infancy. In this work we propose AEROBLADE, a novel detection method which exploits an inherent component of LDMs: the AE used to transform images between image and latent space. We find that generated images can be more accurately reconstructed by the AE than real images, allowing for a simple detection approach based on the reconstruction error. Most importantly, our method is easy to implement and does not require any training, yet nearly matches the performance of detectors that rely on extensive training. We empirically demonstrate that AEROBLADE is effective against state-of-the-art LDMs, including Stable Diffusion and Midjourney. Beyond detection, our approach allows for the qualitative analysis of images, which can be leveraged for identifying inpainted regions. We release our code and data at https://github.com/jonasricker/aeroblade .

Submitted 27 March, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

Comments: Accepted to CVPR 2024

arXiv:2401.13555 [pdf, other]

doi 10.1145/3630106.3658921

Benchmarking the Fairness of Image Upsampling Methods

Authors: Mike Laszkiewicz, Imant Daunhawer, Julia E. Vogt, Asja Fischer, Johannes Lederer

Abstract: Recent years have witnessed a rapid development of deep generative models for creating synthetic media, such as images and videos. While the practical applications of these models in everyday tasks are enticing, it is crucial to assess the inherent risks regarding their fairness. In this work, we introduce a comprehensive framework for benchmarking the performance and fairness of conditional generative models. We develop a set of metrics, inspired by their supervised fairness counterparts, to evaluate the models on their fairness and diversity. Focusing on the specific application of image upsampling, we create a benchmark covering a wide variety of modern upsampling methods. As part of the benchmark, we introduce UnfairFace, a subset of FairFace that replicates the racial distribution of common large-scale face datasets. Our empirical study highlights the importance of using an unbiased training set and reveals variations in how the algorithms respond to dataset imbalances. Alarmingly, we find that none of the considered methods produces statistically fair and diverse results. All experiments can be reproduced using our provided repository.

Submitted 29 April, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

Comments: This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published at the 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT '24)

arXiv:2312.09344 [pdf, ps, other]

Stein's method of moments for truncated multivariate distributions

Authors: Adrian Fischer, Robert E. Gaunt, Yvik Swan

Abstract: We use Stein characterisations to derive new moment-type estimators for the parameters of several truncated multivariate distributions in the i.i.d. case; we also derive the asymptotic properties of these estimators. Our examples include the truncated multivariate normal distribution and truncated products of independent univariate distributions. The estimators are explicit and therefore provide an interesting alternative to the maximum-likelihood estimator (MLE). The quality of these estimators is assessed through competitive simulation studies, in which we compare their behaviour to the performance of the MLE and the score matching approach.

Submitted 15 June, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: The original preprint ("Stein estimation in a multivariate setting") is now split into two preprints: "Stein's method of moments for truncated multivariate distributions" and "Stein's Method of Moments on the Sphere"

arXiv:2312.09037 [pdf, other]

doi 10.1145/3628797.3628976

Impact of Ground Truth Quality on Handwriting Recognition

Authors: Michael Jungo, Lars Vögtlin, Atefeh Fakhari, Nathan Wegmann, Rolf Ingold, Andreas Fischer, Anna Scius-Bertrand

Abstract: Handwriting recognition is a key technology for accessing the content of old manuscripts, helping to preserve cultural heritage. Deep learning shows an impressive performance in solving this task. However, to achieve its full potential, it requires a large amount of labeled data, which is difficult to obtain for ancient languages and scripts. Often, a trade-off has to be made between ground truth quantity and quality, as is the case for the recently introduced Bullinger database. It contains an impressive amount of over a hundred thousand labeled text line images of mostly premodern German and Latin texts that were obtained by automatically aligning existing page-level transcriptions with text line images. However, the alignment process introduces systematic errors, such as wrongly hyphenated words. In this paper, we investigate the impact of such errors on training and evaluation and suggest means to detect and correct typical alignment errors.

Submitted 14 December, 2023; originally announced December 2023.

Comments: SOICT 2023

Journal ref: SOICT 2023: The 12th International Symposium on Information and Communication Technology

arXiv:2312.05976 [pdf, other]

A Representative Study on Human Detection of Artificially Generated Media Across Countries

Authors: Joel Frank, Franziska Herbert, Jonas Ricker, Lea Schönherr, Thorsten Eisenhofer, Asja Fischer, Markus Dürmuth, Thorsten Holz

Abstract: AI-generated media has become a threat to our digital society as we know it. These forgeries can be created automatically and on a large scale based on publicly available technology. Recognizing this challenge, academics and practitioners have proposed a multitude of automatic detection strategies to detect such artificial media. However, in contrast to these technical advances, the human perception of generated media has not been thoroughly studied yet. In this paper, we aim at closing this research gap. We perform the first comprehensive survey into people's ability to detect generated media, spanning three countries (USA, Germany, and China) with 3,002 participants across audio, image, and text media. Our results indicate that state-of-the-art forgeries are almost indistinguishable from "real" media, with the majority of participants simply guessing when asked to rate them as human- or machine-generated. In addition, AI-generated media is voted more human-like across all media types and all countries. To further understand which factors influence people's ability to detect generated media, we include personal variables, chosen based on a literature review in the domains of deepfake and fake news research. In a regression analysis, we found that generalized trust, cognitive reflection, and self-reported familiarity with deepfakes significantly influence participants' decisions across all media categories.

Submitted 10 December, 2023; originally announced December 2023.

Comments: Security and Privacy 2024 (S&P 24)

arXiv:2311.02494 [pdf, other]

Phonon-mediated unconventional $s$- and $f$-wave pairing superconductivity in rhombohedral stacked multilayer graphene

Authors: Emil Viñas Boström, Ammon Fischer, Jonas B. Profe, Jin Zhang, Dante M. Kennes, Angel Rubio

Abstract: Understanding the origin of superconductivity in correlated two-dimensional materials is a key step in leveraging material engineering techniques for next-generation nanoscale devices. The recent demonstration of superconductivity in Bernal bilayer and rhombohedral trilayer graphene, as well as in a large family of graphene-based moiré systems, indicates a common superconducting mechanism across these platforms. Here we combine first principles simulations with effective low-energy theories to investigate the superconducting mechanism and pairing symmetry in rhombohedral stacked graphene multilayers. We find that a phonon-mediated attraction can quantitatively explain the main experimental findings, namely the displacement field and doping dependence of the critical temperature and the presence of two superconducting regions whose pairing symmetries depend on the parent normal state. In particular, we find that intra-valley phonon scattering favors a triplet $f$-wave pairing out of a spin and valley polarized normal state. We also propose a new and so far unexplored superconducting region at higher hole doping densities $n_h \approx 4 \times 10^{12}$ cm$^{-2}$, and demonstrate how this large hole-doped regime can be reached in heterostructures consisting of monolayer $α$-RuCl$_3$ and rhombohedral trilayer graphene.

Submitted 4 November, 2023; originally announced November 2023.

Comments: 11 pages, 4 figures

arXiv:2311.01888 [pdf, other]

Learning Sparse Codes with Entropy-Based ELBOs

Authors: Dmytro Velychko, Simon Damm, Asja Fischer, Jörg Lücke

Abstract: Standard probabilistic sparse coding assumes a Laplace prior, a linear mapping from latents to observables, and Gaussian observable distributions. We here derive a solely entropy-based learning objective for the parameters of standard sparse coding. The novel variational objective has the following features: (A) unlike MAP approximations, it uses non-trivial posterior approximations for probabilistic inference; (B) unlike for previous non-trivial approximations, the novel objective is fully analytical; and (C) the objective allows for a novel principled form of annealing. The objective is derived by first showing that the standard ELBO objective converges to a sum of entropies, which matches similar recent results for generative models with Gaussian priors. The conditions under which the ELBO becomes equal to entropies are then shown to have analytical solutions, which leads to the fully analytical objective. Numerical experiments are used to demonstrate the feasibility of learning with such entropy-based ELBOs. We investigate different posterior approximations including Gaussians with correlated latents and deep amortized approximations. Furthermore, we numerically investigate entropy-based annealing which results in improved learning.
Our main contributions are theoretical, however, and they are twofold: (1) for non-trivial posterior approximations, we provide the (to the knowledge of the authors) first analytical ELBO objective for standard probabilistic sparse coding; and (2) we provide the first demonstration on how a recently shown convergence of the ELBO to entropy sums can be used for learning.

Submitted 9 April, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

arXiv:2310.17436 [pdf, other]

Uncertainty-weighted Loss Functions for Improved Adversarial Attacks on Semantic Segmentation

Authors: Kira Maag, Asja Fischer

Abstract: State-of-the-art deep neural networks have been shown to be extremely powerful in a variety of perceptual tasks like semantic segmentation. However, these networks are vulnerable to adversarial perturbations of the input which are imperceptible for humans but lead to incorrect predictions. Treating image segmentation as a sum of pixel-wise classifications, adversarial attacks developed for classification models were shown to be applicable to segmentation models as well. In this work, we present simple uncertainty-based weighting schemes for the loss functions of such attacks that (i) put higher weights on pixel classifications which can be more easily perturbed and (ii) zero-out the pixel-wise losses corresponding to those pixels that are already confidently misclassified. The weighting schemes can be easily integrated into the loss function of a range of well-known adversarial attackers with minimal additional computational overhead, but lead to significantly improved perturbation performance, as we demonstrate in our empirical analysis on several datasets and models.

Submitted 26 October, 2023; originally announced October 2023.

arXiv:2309.10331 [pdf, other]

Hardness results for decoding the surface code with Pauli noise

Authors: Alex Fischer, Akimasa Miyake

Abstract: Real quantum computers will be subject to complicated, qubit-dependent noise, instead of simple noise such as depolarizing noise with the same strength for all qubits. We can do quantum error correction more effectively if our decoding algorithms take into account this prior information about the specific noise present. This motivates us to consider the complexity of surface code decoding where the input to the decoding problem is not only the syndrome-measurement results, but also a noise model in the form of probabilities of single-qubit Pauli errors for every qubit. In this setting, we show that quantum maximum likelihood decoding (QMLD) and degenerate quantum maximum likelihood decoding (DQMLD) for the surface code are NP-hard and #P-hard, respectively. We reduce directly from SAT for QMLD, and from #SAT for DQMLD, by showing how to transform a boolean formula into a qubit-dependent Pauli noise model and set of syndromes that encode the satisfiability properties of the formula. We also give hardness of approximation results for QMLD and DQMLD. These are worst-case hardness results that do not contradict the empirical fact that many efficient surface code decoders are correct in the average case (i.e., for most sets of syndromes and for most reasonable noise models). These hardness results are nicely analogous with the known hardness results for QMLD and DQMLD for arbitrary stabilizer codes with independent $X$ and $Z$ noise.

Submitted 5 March, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

Comments: 44 pages, 21 figures. 29 pages, 13 figures in main text. This version includes minor improvements to explanations, more standardized terminology, and minor extensions of the results in Appendices C and D

arXiv:2309.03762 [pdf, other]

The Extreme Stellar-Signals Project III. Combining Solar Data from HARPS, HARPS-N, EXPRES, and NEID

Authors: Lily L. Zhao, Xavier Dumusque, Eric B. Ford, Joe Llama, Annelies Mortier, Megan Bedell, Khaled Al Moulla, Chad F. Bender, Cullen H. Blake, John M. Brewer, Andrew Collier Cameron, Rosario Cosentino, Pedro Figueira, Debra A. Fischer, Adriano Ghedina, Manuel Gonzalez, Samuel Halverson, Shubham Kanodia, David W. Latham, Andrea S. J. Lin, Gaspare Lo Curto, Marcello Lodi, Sarah E. Logsdon, Christophe Lovis, Suvrath Mahadevan, et al. (15 additional authors not shown)

Abstract: We present an analysis of Sun-as-a-star observations from four different high-resolution, stabilized spectrographs -- HARPS, HARPS-N, EXPRES, and NEID. With simultaneous observations of the Sun from four different instruments, we are able to gain insight into the radial velocity precision and accuracy delivered by each of these instruments and isolate instrumental systematics that differ from true astrophysical signals. With solar observations, we can completely characterize the expected Doppler shift contributed by orbiting Solar System bodies and remove them. This results in a data set with measured velocity variations that purely trace flows on the solar surface. Direct comparisons of the radial velocities measured by each instrument show remarkable agreement with residual intra-day scatter of only 15-30 cm/s. This shows that current ultra-stabilized instruments have broken through to a new level of measurement precision that reveals stellar variability with high fidelity and detail. We end by discussing how radial velocities from different instruments can be combined to provide powerful leverage for testing techniques to mitigate stellar signals.

Submitted 7 September, 2023; originally announced September 2023.

Comments: 17 pages, 9 figures, accepted for publication

arXiv:2309.03072 [pdf, other]

doi 10.1007/978-3-031-41676-7_6

Character Queries: A Transformer-based Approach to On-Line Handwritten Character Segmentation

Authors: Michael Jungo, Beat Wolf, Andrii Maksai, Claudiu Musat, Andreas Fischer

Abstract: On-line handwritten character segmentation is often associated with handwriting recognition and even though recognition models include mechanisms to locate relevant positions during the recognition process, it is typically insufficient to produce a precise segmentation. Decoupling the segmentation from the recognition unlocks the potential to further utilize the result of the recognition. We specifically focus on the scenario where the transcription is known beforehand, in which case the character segmentation becomes an assignment problem between sampling points of the stylus trajectory and characters in the text. Inspired by the $k$-means clustering algorithm, we view it from the perspective of cluster assignment and present a Transformer-based architecture where each cluster is formed based on a learned character query in the Transformer decoder block. In order to assess the quality of our approach, we create character segmentation ground truths for two popular on-line handwriting datasets, IAM-OnDB and HANDS-VNOnDB, and evaluate multiple methods on them, demonstrating that our approach achieves the overall best results.

Submitted 6 September, 2023; originally announced September 2023.

Comments: ICDAR 2023 Best Student Paper Award. Code available at https://github.com/jungomi/character-queries

Journal ref: International Conference on Document Analysis and Recognition - ICDAR 2023, pp. 98-114. Cham: Springer Nature Switzerland

arXiv:2307.15067 [pdf, ps, other]

Set-Membership Inference Attacks using Data Watermarking

Authors: Mike Laszkiewicz, Denis Lukovnikov, Johannes Lederer, Asja Fischer

Abstract: In this work, we propose a set-membership inference attack for generative models using deep image watermarking techniques. In particular, we demonstrate how conditional sampling from a generative model can reveal the watermark that was injected into parts of the training data. Our empirical results demonstrate that the proposed watermarking technique is a principled approach for detecting the non-consensual use of image data in training generative models.

Submitted 22 June, 2023; originally announced July 2023.

Comments: Preliminary work

arXiv:2307.13417 [pdf, other]

Towards Resolving Word Ambiguity with Word Embeddings

Authors: Matthias Thurnbauer, Johannes Reisinger, Christoph Goller, Andreas Fischer

Abstract: Ambiguity is ubiquitous in natural language. Resolving ambiguous meanings is especially important in information retrieval tasks. While word embeddings carry semantic information, they fail to handle ambiguity well. Transformer models have been shown to handle word ambiguity for complex queries, but they cannot be used to identify ambiguous words, e.g. for a 1-word query. Furthermore, training these models is costly in terms of time, hardware resources, and training data, prohibiting their use in specialized environments with sensitive data. Word embeddings can be trained using moderate hardware resources. This paper shows that applying DBSCAN clustering to the latent space can identify ambiguous words and evaluate their level of ambiguity. An automatic DBSCAN parameter selection leads to high-quality clusters, which are semantically coherent and correspond well to the perceived meanings of a given word.

Submitted 25 July, 2023; originally announced July 2023.

arXiv:2307.10394 [pdf, other]

Refining the Stellar Parameters of $τ$ Ceti: a Pole-on Solar Analog

Authors: Maria Korolik, Rachael M. Roettenbacher, Debra A. Fischer, Stephen R. Kane, Jean M. Perkins, John D. Monnier, Claire L. Davies, Stefan Kraus, Jean-Baptiste Le Bouquin, Narsireddy Anugu, Tyler Gardner, Cyprien Lanthermann, Gail H. Schaefer, Benjamin Setterholm, John M. Brewer, Joe Llama, Lily L. Zhao, Andrew E. Szymkowiak, Gregory W. Henry

Abstract: To accurately characterize the planets a star may be hosting, stellar parameters must first be well-determined. $τ$ Ceti is a nearby solar analog and often a target for exoplanet searches. Uncertainties in the observed rotational velocities have made constraining $τ$ Ceti's inclination difficult. For planet candidates from radial velocity (RV) observations, this leads to substantial uncertainties in the planetary masses, as only the minimum mass ($m \sin i$) can be constrained with RV. In this paper, we used new long-baseline optical interferometric data from the CHARA Array with the MIRC-X beam combiner and extreme precision spectroscopic data from the Lowell Discovery Telescope with EXPRES to improve constraints on the stellar parameters of $τ$ Ceti. Additional archival data were obtained from a Tennessee State University Automatic Photometric Telescope and the Mount Wilson Observatory HK project. These new and archival data sets led to improved stellar parameter determinations, including a limb-darkened angular diameter of $2.019 \pm 0.012$ mas and rotation period of $46 \pm 4$ days. By combining parameters from our data sets, we obtained an estimate for the stellar inclination of $7\pm7^\circ$. This nearly-pole-on orientation has implications for the previously-reported exoplanets. An analysis of the system dynamics suggests that the planetary architecture described by Feng et al. (2017) may not retain long-term stability for low orbital inclinations. Additionally, the inclination of $τ$ Ceti reveals a misalignment between the inclinations of the stellar rotation axis and the previously-measured debris disk rotation axis ($i_\mathrm{disk} = 35 \pm 10^\circ$).

Submitted 19 July, 2023; originally announced July 2023.

Comments: 14 pages, 3 figures, 4 tables, 1 appendix, accepted for publication to AJ

arXiv:2307.06966 [pdf, other]

Layer-wise Linear Mode Connectivity

Authors: Linara Adilova, Maksym Andriushchenko, Michael Kamp, Asja Fischer, Martin Jaggi

Abstract: Averaging neural network parameters is an intuitive method for fusing the knowledge of two independent models. It is most prominently used in federated learning. If models are averaged at the end of training, this can only lead to a good performing model if the loss surface of interest is very particular, i.e., the loss in the midpoint between the two models needs to be sufficiently low. This is impossible to guarantee for the non-convex losses of state-of-the-art networks. For averaging models trained on vastly different datasets, it was proposed to average only the parameters of particular layers or combinations of layers, resulting in better performing models. To get a better understanding of the effect of layer-wise averaging, we analyse the performance of the models that result from averaging single layers, or groups of layers. Based on our empirical and theoretical investigation, we introduce a novel notion of the layer-wise linear connectivity, and show that deep networks do not have layer-wise barriers between them.

Submitted 19 March, 2024; v1 submitted 13 July, 2023; originally announced July 2023.

Comments: published at ICLR24

arXiv:2306.16112 [pdf, other]

doi 10.1038/s44310-024-00006-9

Controlling lasing around Exceptional Points in Coupled Nanolasers

Authors: Anna Fischer, T. V. Raziman, Wai Kit Ng, Jente Clarysse, Jakub Dranczewski, Dhruv Saxena, Stefano Vezzoli, Heinz Schmid, Kirsten Moselund, Riccardo Sapienza

Abstract: Coupled nanolasers are of growing interest for on-chip optical computation and data transmission, which requires an understanding of how lasers interact to form complex systems. The non-Hermitian interaction between two coupled resonators, when excited selectively, can lead to parity-time symmetry, the formation of exceptional points, and subsequently spectral control and increased sensitivity. These investigations have been limited to pump energies close to the lasing threshold, and large or narrow-line lasers. Here, by programmable optical excitation we study two coupled nanolasers significantly above threshold, where mode instability plays an important role. We map the mode evolution around two exceptional points, and observe lasing gaps due to reversed pump dependence which compare well with nonlinear theory. Finally, the coupling can be exploited to control the lasing threshold and wavelength, and for frequency switching around the lasing gap. Controlled and integrated nanolasers constitute a promising platform for future highly sensitive and programmable on-chip laser sources.

Submitted 28 June, 2023; originally announced June 2023.

Comments: 8 pages, 4 figures

arXiv:2306.09049 [pdf, other]

Mapping Researcher Activity based on Publication Data by means of Transformers

Authors: Zineddine Bettouche, Andreas Fischer

Abstract: Modern performance on several natural language processing (NLP) tasks has been enhanced thanks to the Transformer-based pre-trained language model BERT. We employ this concept to investigate a local publication database. Research papers are encoded and clustered to form a landscape view of the scientific topics, in which research is active. Authors working on similar topics can be identified by calculating the similarity between their papers. Based on this, we define a similarity metric between authors. Additionally we introduce the concept of self-similarity to indicate the topical variety of authors.

Submitted 15 June, 2023; originally announced June 2023.

Comments: Proc. of the Interdisciplinary Conference on Mechanics, Computers and Electrics (ICMECE 2022)

arXiv:2306.09044 [pdf, other]

Hands-on detection for steering wheels with neural networks

Authors: Michael Hollmer, Andreas Fischer

Abstract: In this paper the concept of a machine learning based hands-on detection algorithm is proposed. The hand detection is implemented on the hardware side using a capacitive method. A sensor mat in the steering wheel detects a change in capacity as soon as the driver's hands come closer. The evaluation and final decision about hands-on or hands-off situations is done using machine learning. In order to find a suitable machine learning model, different models are implemented and evaluated. Based on accuracy, memory consumption and computational effort the most promising one is selected and ported on a micro controller. The entire system is then evaluated in terms of reliability and response time.

Submitted 15 June, 2023; originally announced June 2023.

Comments: Proc. of the Interdisciplinary Conference on Mechanics, Computers and Electrics (ICMECE 2022)

arXiv:2306.09039 [pdf, other]

Improving Image Tracing with Convolutional Autoencoders by High-Pass Filter Preprocessing

Authors: Zineddine Bettouche, Andreas Fischer

Abstract: The process of transforming a raster image into a vector representation is known as image tracing. This study looks into several processing methods that include high-pass filtering, autoencoding, and vectorization to extract an abstract representation of an image. According to the findings, rebuilding an image with autoencoders, high-pass filtering it, and then vectorizing it can represent the image more abstractly while increasing the effectiveness of the vectorization process.

Submitted 15 June, 2023; originally announced June 2023.

Journal ref: IARIA Journal on Advances in Software, ISSN: 1942-2628, vol. 15, pp. 141-151, 2022

arXiv:2306.06888 [pdf, other]

doi 10.3847/1538-3881/acdd6f

EXPRES IV: Two Additional Planets Orbiting $ρ$ Coronae Borealis Reveal Uncommon System Architecture

Authors: John M. Brewer, Lily L. Zhao, Debra A. Fischer, Rachael M. Roettenbacher, Gregory W. Henry, Joe Llama, Andrew E. Szymkowiak, Samuel H. C. Cabot, Sam A. Weiss, Chris McCarthy

Abstract: Thousands of exoplanet detections have been made over the last twenty-five years using Doppler observations, transit photometry, direct imaging, and astrometry. Each of these methods is sensitive to different ranges of orbital separations and planetary radii (or masses). This makes it difficult to fully characterize exoplanet architectures and to place our solar system in context with the wealth of discoveries that have been made. Here, we use the EXtreme PREcision Spectrograph (EXPRES) to reveal planets in previously undetectable regions of the mass-period parameter space for the star $ρ$ Coronae Borealis. We add two new planets to the previously known system with one hot Jupiter in a 39-day orbit and a warm super-Neptune in a 102-day orbit. The new detections include a temperate Neptune planet ($M{\sin{i}} \sim 20$ M$_\oplus$) in a 281.4-day orbit and a hot super-Earth ($M{\sin{i}} = 3.7$ M$_\oplus$) in a 12.95-day orbit. This result shows that details of planetary system architectures have been hiding just below our previous detection limits; this signals an exciting era for the next generation of extreme precision spectrographs.

Submitted 12 June, 2023; originally announced June 2023.

Comments: Accepted to AJ; 20 pages, 13 figures, 5 Tables

arXiv:2306.06210 [pdf, other]

Single-Model Attribution of Generative Models Through Final-Layer Inversion

Authors: Mike Laszkiewicz, Jonas Ricker, Johannes Lederer, Asja Fischer

Abstract: Recent breakthroughs in generative modeling have sparked interest in practical single-model attribution. Such methods predict whether a sample was generated by a specific generator or not, for instance, to prove intellectual property theft. However, previous works are either limited to the closed-world setting or require undesirable changes to the generative model. We address these shortcomings by, first, viewing single-model attribution through the lens of anomaly detection. Arising from this change of perspective, we propose FLIPAD, a new approach for single-model attribution in the open-world setting based on final-layer inversion and anomaly detection. We show that the utilized final-layer inversion can be reduced to a convex lasso optimization problem, making our approach theoretically sound and computationally efficient. The theoretical findings are accompanied by an experimental study demonstrating the effectiveness of our approach and its flexibility to various domains.

Submitted 26 June, 2024; v1 submitted 26 May, 2023; originally announced June 2023.

Comments: Accepted at the Forty-first International Conference on Machine Learning [ICML2024]

arXiv:2305.19031 [pdf, other]

Stein's Method of Moments

Authors: Bruno Ebner, Adrian Fischer, Robert E. Gaunt, Babette Picker, Yvik Swan

Abstract: Stein operators allow to characterise probability distributions via differential operators. Based on these characterisations, we develop a new method of point estimation for marginal parameters of strictly stationary and ergodic processes, which we call Stein's Method of Moments (SMOM). These SMOM estimators satisfy the desirable classical properties such as consistency and asymptotic normality. As a consequence of the usually simple form of the operator, we obtain explicit estimators in cases where standard methods such as (pseudo-) maximum likelihood estimation require a numerical procedure to calculate the estimate. In addition, with our approach, one can choose from a large class of test functions which allows to improve significantly on the moment estimator. Moreover, for i.i.d. observations, we retrieve data-dependent functions that result in asymptotically efficient estimators and give a sequence of explicit SMOM estimators that converge to the maximum likelihood estimator. Our simulation study demonstrates that for a number of important univariate continuous probability distributions our SMOM estimators possess excellent small sample behaviour, often outperforming the maximum likelihood estimator and other widely-used methods in terms of lower bias and mean squared error.

Submitted 15 July, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

Comments: Comments and suggestions most welcome

arXiv:2305.17000 [pdf, other]

DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution

Authors: Matías P. Pizarro B., Dorothea Kolossa, Asja Fischer

Abstract: Adversarial attacks can mislead automatic speech recognition (ASR) systems into predicting an arbitrary target text, thus posing a clear security threat. To prevent such attacks, we propose DistriBlock, an efficient detection strategy applicable to any ASR system that predicts a probability distribution over output tokens in each time step. We measure a set of characteristics of this distribution: the median, maximum, and minimum over the output probabilities, the entropy of the distribution, as well as the Kullback-Leibler and the Jensen-Shannon divergence with respect to the distributions of the subsequent time step. Then, by leveraging the characteristics observed for both benign and adversarial data, we apply binary classifiers, including simple threshold-based classification, ensembles of such classifiers, and neural networks. Through extensive analysis across different state-of-the-art ASR systems and language data sets, we demonstrate the supreme performance of this approach, with a mean area under the receiver operating characteristic curve for distinguishing target adversarial examples against clean and noisy data of 99% and 97%, respectively. To assess the robustness of our method, we show that adaptive adversarial examples that can circumvent DistriBlock are much noisier, which makes them easier to detect through filtering and creates another avenue for preserving the system's robustness.

Submitted 26 July, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

arXiv:2305.14438 [pdf, other]

Spin and Charge Fluctuation Induced Pairing in ABCB Tetralayer Graphene

Authors: Ammon Fischer, Lennart Klebl, Jonas B. Profe, Alexander Rothstein, Lutz Waldecker, Bernd Beschoten, Tim O. Wehling, Dante M. Kennes

Abstract: Motivated by the recent experimental realization of ABCB stacked tetralayer graphene [Wirth et al., ACS Nano 16, 16617 (2022)], we study correlated phenomena in moiré-less graphene tetralayers for realistic interaction profiles using an orbital resolved random phase approximation approach. We demonstrate that magnetic fluctuations originating from local interactions are crucial close to the van Hove singularities on the electron- and hole-doped side promoting layer selective ferrimagnetic states. Spin fluctuations around these magnetic states enhance unconventional spin-triplet, valley-singlet superconductivity with $f$-wave symmetry due to intervalley scattering. Charge fluctuations arising from long range Coulomb interactions promote doubly degenerate p-wave superconductivity close to the van Hove singularities. At the conduction band edge of ABCB graphene, we find that both spin and charge fluctuations drive $f$-wave superconductivity. Our analysis suggests a strong competition between superconducting states emerging from long- and short-ranged Coulomb interactions and thus stresses the importance of microscopically derived interaction profiles to make reliable predictions for the origin of superconductivity in graphene based heterostructures.

Submitted 25 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

Comments: 5 pages, 4 figures, supplementary information

arXiv:2305.12825 [pdf, other]

Uncertainty-based Detection of Adversarial Attacks in Semantic Segmentation

Authors: Kira Maag, Asja Fischer

Abstract: State-of-the-art deep neural networks have proven to be highly powerful in a broad range of tasks, including semantic image segmentation. However, these networks are vulnerable against adversarial attacks, i.e., non-perceptible perturbations added to the input image causing incorrect predictions, which is hazardous in safety-critical applications like automated driving. Adversarial examples and defense strategies are well studied for the image classification task, while there has been limited research in the context of semantic segmentation. First works however show that the segmentation outcome can be severely distorted by adversarial attacks. In this work, we introduce an uncertainty-based approach for the detection of adversarial attacks in semantic segmentation. We observe that uncertainty as for example captured by the entropy of the output distribution behaves differently on clean and perturbed images and leverage this property to distinguish between the two cases. Our method works in a light-weight and post-processing manner, i.e., we do not modify the model or need knowledge of the process used for generating adversarial examples. In a thorough empirical analysis, we demonstrate the ability of our approach to detect perturbed images across multiple types of adversarial attacks.

Submitted 15 January, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

arXiv:2305.12647 [pdf, other]

Reflective Linguistic Programming (RLP): A Stepping Stone in Socially-Aware AGI (SocialAGI)

Authors: Kevin A. Fischer

Abstract: This paper presents Reflective Linguistic Programming (RLP), a unique approach to conversational AI that emphasizes self-awareness and strategic planning. RLP encourages models to introspect on their own predefined personality traits, emotional responses to incoming messages, and planned strategies, enabling contextually rich, coherent, and engaging interactions. A striking illustration of RLP's potential involves a toy example, an AI persona with an adversarial orientation, a demon named `Bogus' inspired by the children's fairy tale Hansel & Gretel. Bogus exhibits sophisticated behaviors, such as strategic deception and sensitivity to user discomfort, that spontaneously arise from the model's introspection and strategic planning. These behaviors are not pre-programmed or prompted, but emerge as a result of the model's advanced cognitive modeling. The potential applications of RLP in socially-aware AGI (Social AGI) are vast, from nuanced negotiations and mental health support systems to the creation of diverse and dynamic AI personas. Our exploration of deception serves as a stepping stone towards a new frontier in AGI, one filled with opportunities for advanced cognitive modeling and the creation of truly human `digital souls'.

Submitted 21 May, 2023; originally announced May 2023.

Comments: 12 pages

arXiv:2303.05615 [pdf, other]

The Variance-Gamma Distribution: A Review

Authors: Adrian Fischer, Robert E. Gaunt, Andrey Sarantsev

Abstract: The variance-gamma (VG) distributions form a four-parameter family which includes as special and limiting cases the normal, gamma and Laplace distributions. Some of the numerous applications include financial modelling and distributional approximation on Wiener space. In this review, we provide an up-to-date account of the basic distributional theory of the VG distribution. Properties covered include probability and cumulative distribution functions, generating functions, moments and cumulants, mode and median, Stein characterisations, representations in terms of other random variables, and a list of related distributions. We also review methods for parameter estimation and some applications of the VG distribution, including the aforementioned applications to financial modelling and distributional approximation on Wiener space.

Submitted 9 March, 2023; originally announced March 2023.

Comments: 31 pages, 3 figures

MSC Class: Primary 60E05; 62-02; 62E15; 62F10

arXiv:2303.00596 [pdf, other]

Information Plane Analysis for Dropout Neural Networks

Authors: Linara Adilova, Bernhard C. Geiger, Asja Fischer

Abstract: The information-theoretic framework promises to explain the predictive power of neural networks. In particular, the information plane analysis, which measures mutual information (MI) between input and representation as well as representation and output, should give rich insights into the training process. This approach, however, was shown to strongly depend on the choice of estimator of the MI. The problem is amplified for deterministic networks if the MI between input and representation is infinite. Thus, the estimated values are defined by the different approaches for estimation, but do not adequately represent the training process from an information-theoretic perspective. In this work, we show that dropout with continuously distributed noise ensures that MI is finite. We demonstrate in a range of experiments that this enables a meaningful information plane analysis for a class of dropout neural networks that is widely used in practice.

Submitted 1 March, 2023; originally announced March 2023.

Comments: Published as a conference paper at ICLR2023

arXiv:2302.11020 [pdf]

doi 10.1016/j.icarus.2023.115425

Comparisons of the core and mantle compositions of earth analogs from different terrestrial planet formation scenarios

Authors: Jesse T. Gu, Rebecca A. Fischer, Matthew C. Brennan, Matthew S. Clement, Seth A. Jacobson, Nathan A. Kaib, David P. O'Brien, Sean N. Raymond

Abstract: The chemical compositions of Earth's core and mantle provide insight into the processes that led to their formation. N-body simulations, on the other hand, generally do not contain chemical information, and seek to only reproduce the masses and orbits of the terrestrial planets. These simulations can be grouped into four potentially viable scenarios of Solar System formation (Classical, Annulus, Grand Tack, and Early Instability) for which we compile a total of 433 N-body simulations. We relate the outputs of these simulations to the chemistry of Earth's core and mantle using a melt-scaling law combined with a multi-stage model of core formation. We find the compositions of Earth analogs to be largely governed by the fraction of equilibrating embryo cores and the initial embryo masses in N-body simulations. Simulation type may be important when considering magma ocean lifetimes, where Grand Tack simulations have the largest amounts of material accreted after the last giant impact. However, we cannot rule out any accretion scenarios or initial embryo masses due to the sensitivity of Earth's mantle composition to different parameters and the stochastic nature of N-body simulations. Comparing the last embryo impacts experienced by Earth analogs to specific Moon-forming scenarios, we find the characteristics of the Moon-forming impact are dependent on the initial conditions in N-body simulations where larger initial embryo masses promote larger and slower Moon-forming impactors. Mars-sized initial embryos are most consistent with the canonical hit-and-run scenario onto a solid mantle. Our results suggest that constraining the fraction of equilibrating impactor core and the initial embryo masses in N-body simulations could be significant for understanding both Earth's accretion history and characteristics of the Moon-forming impact.

Submitted 21 February, 2023; originally announced February 2023.

Showing 1–50 of 477 results for author: Fischer, A