José Angel Daza
NLP Engineer
Randstad
550 followers
More than 500 connections
About
I am a computer scientist mainly interested in Natural Language Processing, Artificial Intelligence, and software development. I have worked as a technology developer in industry and have also developed my own ideas. As a researcher, I have been working on novel methods for processing multilingual data, and I would be happy to collaborate on more projects that explore and apply algorithms in the NLP and AI fields.
Activity
-
✨ New year, new role! ✨ 😁 After my PhD research on detecting diverse perspectives in news recommendation, I was looking for a role where I could…
Liked by José Angel Daza
-
Over a decade ago, I went to DreamWorks with Carl Callewaert to showcase the Unity 5 rendering tech. Time passes so fast.
Liked by José Angel Daza
-
Another year ends, but this one in particular was a great year. This year I finally obtained a permanent academic position (Tenure) at the…
Liked by José Angel Daza
Experience
Education
-
Heidelberg University
Doctor of Philosophy - PhD Computational Linguistics
-
Thesis: Cross-lingual Semantic Role Labeling through Translation and Multilingual Learning
-
Instituto Politécnico Nacional
Master of Science (MSc) Computer Science
-
Thesis: Automatic Text Generation by Learning from Literary Structures
-
Instituto Tecnológico y de Estudios Superiores de Monterrey / ITESM
Bachelor of Science (BSc) Computer Engineering
-
Licenses and certifications
Publications
-
X-SRL: A Parallel Cross-Lingual Semantic Role Labeling Corpus
EMNLP 2020
Even though SRL is researched for many languages, major improvements have mostly been obtained for English, for which more resources are available. In fact, existing multilingual SRL datasets contain disparate annotation styles or come from different domains, hampering generalization in multilingual learning. In this work we propose a method to automatically construct an SRL corpus that is parallel in four languages: English, French, German, Spanish, with unified predicate and role annotations that are fully comparable across languages. We apply high-quality machine translation to the English CoNLL-09 dataset and use multilingual BERT to project its high-quality annotations to the target languages. We include human-validated test sets that we use to measure the projection quality, and show that projection is denser and more precise than a strong baseline. Finally, we train different SOTA models on our novel corpus for mono- and multilingual SRL, showing that the multi-lingual annotations improve performance especially for the weaker languages.
-
Translate and Label! An Encoder-Decoder Approach for Cross-lingual Semantic Role Labeling
Conference on Empirical Methods in Natural Language Processing - EMNLP-IJCNLP 2019
-
A Sequence-to-Sequence Model for Semantic Role Labeling
Proceedings of The Third Workshop on Representation Learning for NLP. ACL 2018
We explore a novel approach for Semantic Role Labeling (SRL) by casting it as a sequence-to-sequence process. We employ an attention-based model enriched with a copying mechanism to ensure faithful regeneration of the input sequence, while enabling interleaved generation of argument role labels. Here, we apply this model in a monolingual setting, performing PropBank SRL on English language data. The constrained sequence generation set-up enforced with the copying mechanism allows us to analyze the performance and special properties of the model on manually labeled data and benchmarking against state-of-the-art sequence labeling models. We show that our model is able to solve the SRL argument labeling task on English data, yet further structural decoding constraints will need to be added to make the model truly competitive. Our work represents a first step towards more advanced, generative SRL labeling setups.
-
Automatic Story Generation by Learning from Literary Structures
Scholars' Press
Are mind and machine capable of solving the same tasks? Creativity is one of the arguments that some philosophers and psychologists use as a proof of what computers cannot achieve; however, these arguments might be based on a misconception of what both intelligence and creativity mean. This book provides arguments supporting that creativity, as storytelling, can be emulated through computer programs. The assumption of creativity presents a major problem: Complexity. Even if we consider creativity just as a product of novel ways of achieving a goal, the number of combinations found when dealing with the ‘real world’ is astronomically huge. We can recall The Library of Babel (Borges, 1944), a library that contains any possible book that could be written in the history of humanity. This metaphor reveals the combinatory problem that emerges if a brute force algorithm is designed to generate texts. According to our hypothesis, our proposal is a heuristic that uses simple syntactic and semantic properties found in a text corpus in order to generate novel and coherent fiction texts based on what has been already written.
-
Automatic Text Generation by Learning from Literary Structures
Proceedings of the Fifth Workshop on Computational Linguistics for Literature, NAACL-HLT 2016
Most of the work dealing with automatic story production is based on a generic architecture for text generation; however, the resulting stories still lack a style that can be called literary. We believe that in order to generate automatically stories that could be compared with those by human authors, a specific methodology for fiction text generation should be defined. We also believe that it is essential for a story to convey the effect of originality to the person who is reading it. Our methodology proposes corpus-based generation of stories that could be called creative and also have a style similar to human fiction texts. We also show how these stories have plausible syntax and coherence, and are perceived as interesting by human evaluators.
Courses
-
Algorithm Analysis and Design
-
Artificial Intelligence
-
Computer Systems Design
-
Discrete Mathematics
-
Natural Language Generation
-
Natural Language Processing
-
Pattern Recognition
-
Statistical Processing of Textual Information
-
Theory of Computation
Projects
-
CuantoCobrar
- present
-
InTaVia: Visual Analysis, Curation & Communication for In/Tangible European Heritage
-
The InTaVia knowledge graph contains data on Europe’s cultural history, including data on individual artists, cultural objects, and groups or organizations. Search for these entities in our knowledge base (with a focus on Slovenia, Austria, the Netherlands and Finland) or with a global reach via data from Wikipedia. You can also upload your own data, and curate (i.e., edit, assemble, or enrich) all kinds of data for further operations of visual analysis and narration.
Test scores
-
TOEFL-iBT
Score: 101
Languages
-
Spanish
Native or bilingual proficiency
-
English
Full professional proficiency
-
German
Elementary proficiency
-
Portuguese
Limited working proficiency
More activity by José Angel
-
Hey everyone, I’d like to share an interview I gave to SwissGlobal Language Services AG about my PhD research. When I started exploring the impact…
Liked by José Angel Daza
-
We're going to launch Grassroots Science, a year-long ambitious, massive-scale, fully open-source initiative aimed at developing multilingual LLMs…
Liked by José Angel Daza
-
👁️👁️👁️ Calls for graduate program applications 🧑🎓👩🎓👨🎓🎓 related to AI and Data Science, starting in August and closing in February 🗓️: *…
Liked by José Angel Daza
-
Can high school students grasp fundamental ideas at the intersection of ethical and technical challenges of AI? In our new paper, “A workshop on…
Liked by José Angel Daza
-
My PhD thesis is now available online at https://lnkd.in/e8ArNfJz I'm thankful to all my collaborators who joined me on this journey!
Liked by José Angel Daza
-
✨ New article published in Mathematics, Q1 ✨ 📢 How can we overcome the limits of traditional methods for text classification on datasets…
Liked by José Angel Daza
-
🚀 Proud to present the results of the QuWater Project and its major outcome, the WNTR Quantum package, at the Dutch Computational Science (DUCOMS)…
Liked by José Angel Daza
-
Problem: There are thousands of language models (LMs) on 🤗 HuggingFace - but which one is the best for your NLP task? Solution: Use ⚖️…
Liked by José Angel Daza
-
Today I gave an invited talk at Microsoft Research (MSR) on “Faithful Reasoning with LLMs”! A big thank you to Niket Tandon for the invitation and…
Liked by José Angel Daza
-
The new DIANNA release (v 1.7) is here! Deep Insight And Neural Network Analysis (DIANNA) is *the* Open Source Explainable AI (XAI) Python library…
Liked by José Angel Daza