My Skills

Over the course of my professional career, I have worked as a Software Developer, Data Scientist and Researcher. My main focus is Natural Language Processing (NLP), a technology that aims to help computers understand and interact with humans through the use of language. This has allowed me to develop a range of technical, analytical and soft skills. My main programming language is Python, and my research expertise lies broadly in multilingual neural models, semantic analysis, NLP for digital humanities and the evaluation of NLP software.

Work Experience

I have worked in industry, in academia and as a freelancer. I really like building fast prototypes to test new ideas and applying the latest technologies to solve "old school" problems. I have been building NLP models since the days of Recurrent Neural Networks and LSTMs, through the transition to Transformer models, and I have of course been working with the latest Large Language Model tools (although this work is mostly about evaluating how much of the hype out there is actually useful!). I also enjoy working on Digital Humanities, using my tech expertise to help researchers in the humanities analyze large amounts of textual data to answer their questions.

Education

My studies allowed me to obtain interesting degrees and also to get to know people from different places and cultures. I did my Bachelor's and Master's in Mexico City, with internships in Madrid and Tokyo and summer schools in Tübingen and Bilbao, followed by a PhD in Heidelberg and a Postdoc in Amsterdam.

Selected Publications

I have published as first author and co-author at top-tier Computational Linguistics, Computer Science and Digital Humanities venues, such as ACL, EMNLP and NAACL.

Work Experience

Vrije Universiteit Amsterdam

NLP Researcher (Postdoc)
April 2021 - December 2023

As part of the CLTL group, I focused on developing NLP models and text mining strategies for the automatic processing of biographical texts. This included dynamic language analysis of biographies and evaluating the performance of state-of-the-art models. This work was part of the InTaVia project, which aimed to address major research challenges and bridge the semantic gap between large object databases, biography databases, and users across Europe.

Universität Heidelberg

PhD Researcher
April 2017 - July 2020

My research was concerned with finding effective methods for creating more training data for the task of Semantic Role Labeling (SRL) in several languages (particularly German). I approached SRL both as a sequence classification task and as a sequence generation task. Through my research I worked with Recurrent Neural Networks, LSTMs, Sequence-to-Sequence models and multilingual Neural Language Models such as ELMo and BERT. I implemented my research code in PyTorch and also used state-of-the-art frameworks such as spaCy, Transformers and AllenNLP when building more complex models.

Education

Universität Heidelberg

Doctor of Philosophy

Computational Linguistics / Natural Language Processing

Thesis: Cross-lingual Semantic Role Labeling through Translation and Multilingual Learning

Supervisor: Prof. Dr. Anette Frank

Instituto Politécnico Nacional - Centro de Investigación en Computación (CIC)

Master of Science

Computer Science

Thesis: Automatic Text Generation by Learning from Literary Structures

Supervisor: Prof. Dr. Hiram Calvo

Tecnológico de Monterrey

Bachelor of Science

Computer Systems Engineer

Specialization in Artificial Intelligence

Selected Publications

In the Context of Narrative, we Never Properly Defined the Concept of Valence

Boot, P., Daza, A., Schnober, C., van Hage, W. (2024).

CHR 2024: Computational Humanities Research Conference

Aarhus, Denmark

Choosing the Right Tool for You: Informed Evaluation of Text Analysis Tools

Daza, A., Fokkens, A. (2024).

CLARIN Annual Conference Proceedings (CLARIN 2024)

Barcelona, Spain

Confidently Wrong: Exploring the Calibration and Expression of (Un)Certainty of Large Language Models in a Multilingual Setting

Krause, L., Tufa, W., Baez-Santamaria, S., Daza, A., Khurana, U., Vossen, P. (2023).

Proceedings of the Workshop on Multimodal, Multilingual Natural Language Generation and Multilingual WebNLG Challenge (MM-NLG 2023)

Prague, Czech Republic

Dealing with Abbreviations in the Slovenian Biographical Lexicon

Daza, A., Fokkens, A., Erjavec, T. (2022).

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)

Abu Dhabi, UAE

Weisfeiler-Leman in the bamboo: Novel AMR graph metrics and a benchmark for AMR graph similarity

Opitz, J., Daza, A., Frank, A. (2021).

Transactions of the Association for Computational Linguistics

TACL Journal

X-SRL: A Parallel Cross-Lingual Dataset for Semantic Role Labeling

Daza, A. and Frank, A. (2020).

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)

Online-Only

Translate and Label! An Encoder-Decoder Approach for Cross-lingual Semantic Role Labeling

Daza, A. and Frank, A. (2019).

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019)

Hong Kong, China

A Sequence-to-Sequence Model for Semantic Role Labeling.

Daza, A. and Frank, A. (2018).

Proceedings of the 3rd Workshop on Representation Learning for NLP (RepL4NLP) - ACL 2018

Melbourne, Australia

Automatic Text Generation by Learning from Literary Structures

Daza, A., Calvo, H., Figueroa-Nazuno, J. (2016).

Proceedings of the Workshop on Computational Linguistics for Literature, co-located with the North American Chapter of the Association for Computational Linguistics - NAACL 2016

San Diego, California
