Leveraging Discourse Structure for Extractive Meeting Summarization (2405.11055v3)
Abstract: We introduce an extractive summarization system for meetings that leverages discourse structure to better identify salient information from complex multi-party discussions. Using discourse graphs to represent semantic relations between the contents of utterances in a meeting, we train a GNN-based node classification model to select the most important utterances, which are then combined to create an extractive summary. Experimental results on AMI and ICSI demonstrate that our approach surpasses existing text-based and graph-based extractive summarization systems, as measured by both classification and summarization metrics. Additionally, we conduct ablation studies on discourse structure and relation type to provide insights for future NLP applications leveraging discourse analysis theory.
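The pipeline outlined in the abstract (utterances as graph nodes, discourse relations as edges, node scoring, then selection of the top utterances) can be illustrated with a minimal sketch. This is not the authors' implementation. The names `Utterance`, `propagate`, `score_nodes`, and `select_summary`, the toy feature vectors, the hand-set weights, and the example relation labels are all assumptions made for illustration. A single untrained mean-aggregation step stands in for the trained GNN, and relation types are carried on the edges but ignored by the propagation step.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Utterance:
    speaker: str
    text: str
    features: List[float]  # hypothetical utterance embedding


# (source index, target index, relation label); labels are illustrative only
Edge = Tuple[int, int, str]


def propagate(features: List[List[float]], edges: List[Edge]) -> List[List[float]]:
    """One simplified GCN-style step: average each node with its neighbours.

    Edges are treated as undirected and unlabeled; a self-loop is included.
    """
    n = len(features)
    neighbours: Dict[int, Set[int]] = {i: {i} for i in range(n)}
    for src, dst, _label in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in neighbours[i]) / len(neighbours[i])
         for d in range(dim)]
        for i in range(n)
    ]


def score_nodes(hidden: List[List[float]], weights: List[float], bias: float) -> List[float]:
    """Linear scoring head; in a trained model these weights would be learned."""
    return [sum(w * h for w, h in zip(weights, row)) + bias for row in hidden]


def select_summary(utterances: List[Utterance], edges: List[Edge],
                   weights: List[float], bias: float, k: int) -> List[int]:
    """Return indices of the k highest-scoring utterances, in meeting order."""
    hidden = propagate([u.features for u in utterances], edges)
    scores = score_nodes(hidden, weights, bias)
    top = sorted(range(len(utterances)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


# Toy meeting with made-up features and relations.
meeting = [
    Utterance("A", "Should the remote have a screen?", [2.0, 0.0]),
    Utterance("B", "No, it would cost too much.", [0.0, 1.0]),
    Utterance("A", "Okay.", [0.0, 0.0]),
    Utterance("B", "We should stay within the budget.", [1.0, 1.0]),
]
relations = [
    (0, 1, "Question-answer pair"),
    (1, 2, "Acknowledgement"),
    (1, 3, "Elaboration"),
]

selected = select_summary(meeting, relations, weights=[1.0, 2.0], bias=0.0, k=2)
summary = [meeting[i].text for i in selected]
print(summary)
```

Sorting the selected indices before joining keeps the extracted utterances in their original meeting order. A relation-aware variant would apply a separate transformation per relation label rather than ignoring the labels as this sketch does.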