DOI: 10.1145/3341105.3373892

Attention history-based attention for abstractive text summarization

Published: 30 March 2020

Abstract

Recently, encoder-decoder models with attention have shown meaningful results on abstractive summarization tasks. In the standard attention mechanism, the attention distribution is generated based only on the current decoder state. However, since there are patterns in how summaries are written, patterns should also exist in how the model attends to the source text. In this work, we propose an attention history-based attention model that exploits such patterns in the attention history. We build an additional recurrent network, the attention reader network, to model the attention patterns. We also employ an accumulation vector that keeps track of the total amount of effective attention paid to each part of the input text, guided by an additional network named the accumulation network. Both the attention reader network and the accumulation vector serve as additional inputs to the attention mechanism. Evaluation results on the CNN/Daily Mail dataset show that our method better captures attention patterns and achieves higher ROUGE scores than strong baselines.
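
To make the described mechanism concrete, below is a minimal PyTorch sketch of a single decoding step of attention conditioned on attention history. It follows the abstract only at a high level: `reader_gru` stands in for the attention reader network and `accum_gate` for the accumulation network, and the additive (Bahdanau-style) scorer, layer names, and dimensions are illustrative assumptions rather than the authors' exact architecture.

```python
import torch
import torch.nn as nn


class HistoryAttention(nn.Module):
    """Attention over encoder states, conditioned on attention history (sketch)."""

    def __init__(self, enc_dim, dec_dim, attn_dim, reader_dim):
        super().__init__()
        self.enc_proj = nn.Linear(enc_dim, attn_dim, bias=False)
        self.dec_proj = nn.Linear(dec_dim, attn_dim, bias=False)
        self.reader_proj = nn.Linear(reader_dim, attn_dim, bias=False)
        self.accum_proj = nn.Linear(1, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)
        # Attention reader network (assumed form): a recurrent cell that reads
        # the previous attention weight at each source position.
        self.reader_gru = nn.GRUCell(1, reader_dim)
        # Accumulation network (assumed form): gates how much of each step's
        # attention is added to the running accumulation vector.
        self.accum_gate = nn.Linear(dec_dim, 1)

    def forward(self, enc_states, dec_state, prev_attn, reader_state, accum):
        """One decoding step.

        enc_states:   (B, T, enc_dim)    encoder hidden states
        dec_state:    (B, dec_dim)       current decoder state
        prev_attn:    (B, T)             attention distribution from the previous step
        reader_state: (B, T, reader_dim) per-position state of the attention reader
        accum:        (B, T)             running accumulation of effective attention
        """
        B, T, _ = enc_states.shape
        # Attention reader: one GRU step per source position, fed with that
        # position's previous attention weight (parameters shared across positions).
        reader_state = self.reader_gru(
            prev_attn.reshape(B * T, 1),
            reader_state.reshape(B * T, -1),
        ).view(B, T, -1)

        # Additive scoring over decoder state, attention-reader state, and
        # accumulation vector.
        scores = self.v(torch.tanh(
            self.enc_proj(enc_states)
            + self.dec_proj(dec_state).unsqueeze(1)
            + self.reader_proj(reader_state)
            + self.accum_proj(accum.unsqueeze(-1))
        )).squeeze(-1)                                   # (B, T)
        attn = torch.softmax(scores, dim=-1)

        # Accumulation network stand-in: a gate on the decoder state decides how
        # much of this step's attention counts as "effective" and is accumulated.
        gate = torch.sigmoid(self.accum_gate(dec_state))  # (B, 1)
        accum = accum + gate * attn

        context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)
        return context, attn, reader_state, accum
```

At each decoder step the module would be called with the previous step's attention distribution and the running accumulation vector, so the next attention distribution depends on the history of where the model has already attended.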


    Published In

    SAC '20: Proceedings of the 35th Annual ACM Symposium on Applied Computing
    March 2020
    2348 pages
    ISBN:9781450368667
    DOI:10.1145/3341105

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. abstractive text summarization
    2. accumulation-based attention
    3. attention mechanism
    4. pointer mechanism

    Qualifiers

    • Research-article

    Funding Sources

    • Korea government (MSIT)
    • Ministry of Science and ICT

    Conference

    SAC '20: The 35th ACM/SIGAPP Symposium on Applied Computing
    March 30 - April 3, 2020
    Brno, Czech Republic

    Acceptance Rates

    Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
