Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
97 tokens/sec
GPT-4o
53 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Unsupervised Full Constituency Parsing with Neighboring Distribution Divergence (2110.15931v1)

Published 29 Oct 2021 in cs.CL

Abstract: Unsupervised constituency parsing has been explored much but is still far from being solved. Conventional unsupervised constituency parser is only able to capture the unlabeled structure of sentences. Towards unsupervised full constituency parsing, we propose an unsupervised and training-free labeling procedure by exploiting the property of a recently introduced metric, Neighboring Distribution Divergence (NDD), which evaluates semantic similarity between sentences before and after editions. For implementation, we develop NDD into Dual POS-NDD (DP-NDD) and build "molds" to detect constituents and their labels in sentences. We show that DP-NDD not only labels constituents precisely but also inducts more accurate unlabeled constituency trees than all previous unsupervised methods with simpler rules. With two frameworks for labeled constituency trees inference, we set both the new state-of-the-art for unlabeled F1 and strong baselines for labeled F1. In contrast with the conventional predicting-and-evaluating scenario, our method acts as an plausible example to inversely apply evaluating metrics for prediction.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (3)
  1. Letian Peng (23 papers)
  2. Zuchao Li (76 papers)
  3. Hai Zhao (227 papers)

Summary

We haven't generated a summary for this paper yet.