DOI: 10.1145/3341105.3373892

Attention history-based attention for abstractive text summarization

Published: 30 March 2020

Abstract

Recently, encoder-decoder models with attention have shown meaningful results on abstractive summarization tasks. In the standard attention mechanism, the attention distribution is generated based only on the current decoder state. However, since there are patterns in how summaries are written, patterns should also exist in how the model attends to the source text. In this work, we propose an attention history-based attention model that exploits such patterns in the attention history. We build an additional recurrent network, the attention reader network, to model the attention patterns. We also employ an accumulation vector that keeps track of the total amount of effective attention paid to each part of the input text, guided by an additional network named the accumulation network. Both the attention reader network and the accumulation vector serve as additional inputs to the attention mechanism. Evaluation results on the CNN/Daily Mail dataset show that our method better captures attention patterns and achieves higher ROUGE scores than strong baselines.
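
To make the described mechanism concrete, below is a minimal PyTorch sketch of a single decoding step of attention conditioned on attention history. It follows the abstract only at a high level: `reader_gru` stands in for the attention reader network and `accum_gate` for the accumulation network, and the additive (Bahdanau-style) scorer, layer names, and dimensions are illustrative assumptions rather than the authors' exact architecture.

```python
import torch
import torch.nn as nn


class HistoryAttention(nn.Module):
    """Attention over encoder states, conditioned on attention history (sketch)."""

    def __init__(self, enc_dim, dec_dim, attn_dim, reader_dim):
        super().__init__()
        self.enc_proj = nn.Linear(enc_dim, attn_dim, bias=False)
        self.dec_proj = nn.Linear(dec_dim, attn_dim, bias=False)
        self.reader_proj = nn.Linear(reader_dim, attn_dim, bias=False)
        self.accum_proj = nn.Linear(1, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)
        # Attention reader network (assumed form): a recurrent cell that reads
        # the previous attention weight at each source position.
        self.reader_gru = nn.GRUCell(1, reader_dim)
        # Accumulation network (assumed form): gates how much of each step's
        # attention is added to the running accumulation vector.
        self.accum_gate = nn.Linear(dec_dim, 1)

    def forward(self, enc_states, dec_state, prev_attn, reader_state, accum):
        """One decoding step.

        enc_states:   (B, T, enc_dim)    encoder hidden states
        dec_state:    (B, dec_dim)       current decoder state
        prev_attn:    (B, T)             attention distribution from the previous step
        reader_state: (B, T, reader_dim) per-position state of the attention reader
        accum:        (B, T)             running accumulation of effective attention
        """
        B, T, _ = enc_states.shape
        # Attention reader: one GRU step per source position, fed with that
        # position's previous attention weight (parameters shared across positions).
        reader_state = self.reader_gru(
            prev_attn.reshape(B * T, 1),
            reader_state.reshape(B * T, -1),
        ).view(B, T, -1)

        # Additive scoring over decoder state, attention-reader state, and
        # accumulation vector.
        scores = self.v(torch.tanh(
            self.enc_proj(enc_states)
            + self.dec_proj(dec_state).unsqueeze(1)
            + self.reader_proj(reader_state)
            + self.accum_proj(accum.unsqueeze(-1))
        )).squeeze(-1)                                   # (B, T)
        attn = torch.softmax(scores, dim=-1)

        # Accumulation network stand-in: a gate on the decoder state decides how
        # much of this step's attention counts as "effective" and is accumulated.
        gate = torch.sigmoid(self.accum_gate(dec_state))  # (B, 1)
        accum = accum + gate * attn

        context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)
        return context, attn, reader_state, accum
```

At each decoder step the module would be called with the previous step's attention distribution and the running accumulation vector, so the next attention distribution depends on the history of where the model has already attended.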


    Published In

    SAC '20: Proceedings of the 35th Annual ACM Symposium on Applied Computing
    March 2020
    2348 pages
    ISBN:9781450368667
    DOI:10.1145/3341105

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. abstractive text summarization
    2. accumulation-based attention
    3. attention mechanism
    4. pointer mechanism

    Qualifiers

    • Research-article

    Funding Sources

    • Korea government (MSIT)
    • Ministry of Science and ICT

    Conference

    SAC '20: The 35th ACM/SIGAPP Symposium on Applied Computing
    March 30 - April 3, 2020
    Brno, Czech Republic

    Acceptance Rates

    Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
