Deep learning for NLP (without magic)

Authors:

Richard Socher,

Yoshua Bengio,

Christopher D. Manning

ACL '12: Tutorial Abstracts of ACL 2012

Page 5

Published: 08 July 2012

Abstract

Machine learning is everywhere in today's NLP, but by and large machine learning amounts to numerical optimization of weights for human-designed representations and features. The goal of deep learning is to explore how computers can take advantage of data to develop features and representations appropriate for complex interpretation tasks. This tutorial aims to cover the basic motivation, ideas, models and learning algorithms in deep learning for natural language processing. Recently, these methods have been shown to perform very well on various NLP tasks, including language modeling, POS tagging, named entity recognition, sentiment analysis and paraphrase detection. The most attractive quality of these techniques is that they can perform well without any external hand-designed resources or time-intensive feature engineering. Despite these advantages, many researchers in NLP are not familiar with these methods. Our focus is on insight and understanding, using graphical illustrations and simple, intuitive derivations. The goal of the tutorial is to make the inner workings of these techniques transparent and intuitive, and their results interpretable, rather than leaving them as black boxes labeled "magic here".

The first part of the tutorial presents the basics of neural networks, neural word vectors, several simple models based on local windows, and the math and algorithms of training via backpropagation. Applications in this section include language modeling and POS tagging.
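To make the idea of a window-based model trained by backpropagation concrete, the following is a minimal sketch and not the tutorial's own code. It assumes one common formulation: the word vectors in a local window are concatenated into `x`, passed through a single tanh hidden layer `h = tanh(Wx + b)`, and scored by `u · h`. The function names, the parameter names `W`, `b`, `u`, and the toy numbers are all hypothetical, chosen only for illustration.

```python
import math

def window_forward(window_vectors, W, b, u):
    """Score one window: concatenate word vectors, apply a tanh layer, then a linear scorer."""
    x = [value for vec in window_vectors for value in vec]
    h = [math.tanh(sum(w * xj for w, xj in zip(row, x)) + bi)
         for row, bi in zip(W, b)]
    score = sum(ui * hi for ui, hi in zip(u, h))
    return x, h, score

def window_backward(x, h, W, u):
    """Backpropagate the score to the parameters and to the input word vectors."""
    grad_u = list(h)                                   # d score / d u
    # Error signal at the hidden layer; tanh'(z) = 1 - tanh(z)^2.
    delta = [ui * (1.0 - hi * hi) for ui, hi in zip(u, h)]
    grad_W = [[d * xj for xj in x] for d in delta]     # outer product delta x^T
    grad_b = list(delta)
    # Gradient reaching the word vectors themselves: W^T delta.
    grad_x = [sum(W[i][j] * delta[i] for i in range(len(delta)))
              for j in range(len(x))]
    return grad_u, grad_W, grad_b, grad_x

# Toy usage (assumed sizes): a window of three 2-dimensional word vectors, two hidden units.
window = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]
W = [[0.1, -0.1, 0.2, 0.0, 0.3, -0.2],
     [0.05, 0.2, -0.3, 0.1, 0.0, 0.4]]
b = [0.01, -0.02]
u = [0.7, -0.4]

x, h, score = window_forward(window, W, b, u)
grad_u, grad_W, grad_b, grad_x = window_backward(x, h, W, u)
print(score, grad_x)
```

Because `grad_x` is the gradient with respect to the concatenated word vectors, the same backward pass that updates the weights can also update the word vectors, which is one way such vectors can be learned rather than hand-designed.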

In the second section we present recursive neural networks, which can learn structured tree outputs as well as vector representations for phrases and sentences. We cover both the equations and the applications. We show how training can be achieved with a modified version of the backpropagation algorithm introduced earlier; the modifications allow the algorithm to work on tree structures. Applications include sentiment analysis and paraphrase detection. We also draw connections to recent work on semantic compositionality in vector spaces. The principal goal, again, is to make these methods appear intuitive and interpretable rather than mathematically confusing. By this point in the tutorial, audience members should have a clear understanding of how to build a deep learning system for word-, sentence- and document-level tasks.
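As a rough illustration of how a recursive neural network can produce vector representations for phrases and sentences, the sketch below applies one shared composition function bottom-up over a binary tree. It assumes a common formulation, `p = tanh(W [c1; c2] + b)`, which the abstract itself does not spell out. The names `compose` and `encode`, the nested-tuple tree encoding, and the toy embeddings are hypothetical.

```python
import math

def compose(left, right, W, b):
    """Combine two child vectors into a parent vector of the same dimension."""
    children = left + right                      # concatenation [c1; c2]
    return [math.tanh(sum(w * c for w, c in zip(row, children)) + bi)
            for row, bi in zip(W, b)]

def encode(tree, embeddings, W, b):
    """Recursively encode a binary tree; leaves are words, internal nodes are (left, right) pairs."""
    if isinstance(tree, str):
        return embeddings[tree]                  # leaf: look up the word vector
    left, right = tree
    return compose(encode(left, embeddings, W, b),
                   encode(right, embeddings, W, b), W, b)

# Toy usage (assumed values): 2-dimensional word vectors, so W is 2 x 4.
embeddings = {"not": [0.5, -0.5], "very": [0.1, 0.2], "good": [0.9, 0.3]}
W = [[0.5, 0.0, 0.0, 0.5],
     [0.0, 0.5, 0.5, 0.0]]
b = [0.0, 0.0]

phrase = ("not", ("very", "good"))               # bracketing: not (very good)
print(encode(phrase, embeddings, W, b))
```

Since every node, whether a word or a phrase, ends up with a vector of the same size, a classifier for a task such as sentiment analysis can in principle be attached at any node. Training then propagates errors down the same tree that the forward pass climbed.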

The last part of the tutorial gives a general overview of the different applications of deep learning in NLP, including bag-of-words models. We also discuss NLP-oriented issues in modeling, interpretation, representational power, and optimization.

Published In

ACL '12: Tutorial Abstracts of ACL 2012

July 2012

16 pages

Conference Chair:
Michael Strube
Heidelberg Institute for Theoretical Studies, gGmbH, Heidelberg, Germany

Publisher

Association for Computational Linguistics

United States

Qualifiers

Research-article

Acceptance Rates

Overall Acceptance Rate 85 of 443 submissions, 19%
