Unsupervised Recurrent Neural Network Grammars (1904.03746v6)

Published 7 Apr 2019 in cs.CL and stat.ML

Abstract: Recurrent neural network grammars (RNNG) are generative models of language which jointly model syntax and surface structure by incrementally generating a syntax tree and sentence in a top-down, left-to-right order. Supervised RNNGs achieve strong language modeling and parsing performance, but require an annotated corpus of parse trees. In this work, we experiment with unsupervised learning of RNNGs. Since directly marginalizing over the space of latent trees is intractable, we instead apply amortized variational inference. To maximize the evidence lower bound, we develop an inference network parameterized as a neural CRF constituency parser. On language modeling, unsupervised RNNGs perform as well as their supervised counterparts on benchmarks in English and Chinese. On constituency grammar induction, they are competitive with recent neural language models that induce tree structures from words through attention mechanisms.

Citations (114)

Summary

  • The paper introduces an unsupervised approach to learning RNNGs, using amortized variational inference to sidestep the intractable marginalization over the space of latent trees.
  • It employs a neural CRF constituency parser as the inference network, combining a structural inductive bias with strong language modeling performance on English and Chinese benchmarks.
  • The results show that competitive language modeling is achievable without annotated parse trees, while also advancing unsupervised constituency grammar induction.

An Analysis of Unsupervised Recurrent Neural Network Grammars

This paper explores an unsupervised approach to Recurrent Neural Network Grammars (RNNGs), a model family that until now has been trained almost exclusively with supervision. RNNGs are generative models that jointly produce a sentence and its syntax tree through an incremental, top-down sequence of actions. Because training has historically required an annotated corpus of parse trees, the applicability of RNNGs has been limited by the availability of labeled data. This paper instead learns RNNGs without parse annotations, using amortized variational inference to cope with the intractable marginalization over the space of latent trees.

Key Contributions and Methodology

The key contribution of this work is an unsupervised procedure for learning RNNGs, in which the intractable marginalization over the space of latent trees is addressed through amortized variational inference. The inference network is structured as a neural Conditional Random Field (CRF) constituency parser, which introduces a structural inductive bias while preserving strong language modeling performance. Training maximizes the evidence lower bound (ELBO), and the resulting unsupervised RNNGs perform competitively with their supervised counterparts on language modeling.
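
For a sentence x with latent tree z, the objective takes the standard variational form (the notation here follows common variational-inference conventions rather than the paper's exact symbols):

\mathrm{ELBO}(\theta, \phi; x) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x, z)\right] \;+\; \mathbb{H}\!\left[q_\phi(z \mid x)\right] \;\le\; \log p_\theta(x)

where p_\theta is the generative RNNG, q_\phi is the CRF constituency parser serving as the inference network, and \mathbb{H} denotes entropy. Because q_\phi is a tree-structured CRF, quantities such as its entropy can be computed with inside-style dynamic programming and trees can be sampled efficiently, which is what makes maximizing this bound practical.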

Methodologically, both the inference network and the generative model are parameterized with neural networks built on LSTMs. The CRF inference network constrains the approximate posterior to well-formed constituency trees, supplying an inductive bias toward syntactic structure, while the generative RNNG factorizes the joint distribution over a sentence and its parse tree into a sequence of actions, each conditioned on the entire history of previous actions.
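
As a rough, self-contained illustration of that last point (not the authors' implementation), the Python sketch below conditions each RNNG-style action on the full history of previous actions via an LSTM; the class, its methods, and the reduced action inventory are hypothetical simplifications.

import torch
import torch.nn as nn

class ToyRNNG(nn.Module):
    """Simplified RNNG-style generative model (illustrative only).

    The model emits a sequence of actions -- NT (open a constituent),
    GEN (generate a terminal word), REDUCE (close a constituent) --
    and scores each action with an LSTM run over the full action history.
    """

    NT, GEN, REDUCE = 0, 1, 2   # action ids
    NUM_ACTIONS = 3

    def __init__(self, vocab_size: int, hidden: int = 64):
        super().__init__()
        # one shared embedding table for action symbols and words
        self.embed = nn.Embedding(self.NUM_ACTIONS + vocab_size, hidden)
        self.history_lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.action_scores = nn.Linear(hidden, self.NUM_ACTIONS)
        self.word_scores = nn.Linear(hidden, vocab_size)

    def log_prob(self, actions, words):
        """Return log p(x, z) for one (sentence, tree) pair.

        actions: list of action ids describing the top-down tree traversal
        words:   list of word ids, one per GEN action, in order
        """
        logp = torch.tensor(0.0)
        history = [self.NT]                 # assume the root constituent is opened first
        word_iter = iter(words)
        for a in actions:
            ids = torch.tensor(history).unsqueeze(0)        # (1, T) full history
            h, _ = self.history_lstm(self.embed(ids))
            state = h[:, -1]                                # summary of everything so far
            logp = logp + torch.log_softmax(self.action_scores(state), -1)[0, a]
            if a == self.GEN:                               # also score the emitted word
                w = next(word_iter)
                logp = logp + torch.log_softmax(self.word_scores(state), -1)[0, w]
                history.append(self.NUM_ACTIONS + w)        # words join the history too
            else:
                history.append(a)
        return logp

For instance, a tiny one-constituent parse over a two-word vocabulary could be scored with model.log_prob([ToyRNNG.GEN, ToyRNNG.GEN, ToyRNNG.REDUCE], [0, 1]). A real RNNG additionally maintains a stack of partially built constituents and composes each completed constituent into a single vector, but the history-conditioning shown here is the essential idea described above.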

Results

Empirical evaluations show that the proposed unsupervised RNNGs achieve language modeling performance comparable to supervised RNNGs on both English and Chinese benchmarks. This result holds even though no labeled parse trees are used during training, underscoring the potential of unsupervised approaches for tasks that have traditionally relied on supervision.

Furthermore, on constituency grammar induction, unsupervised RNNGs are competitive with recent neural language models that induce tree structures through attention mechanisms. The results show that the model can assign high likelihoods to held-out data while simultaneously recovering meaningful linguistic structure.
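
Constituency grammar induction in this line of work is typically evaluated with unlabeled bracketing F1 against gold treebank parses. A minimal sketch of that metric (a hypothetical helper, not the paper's evaluation script) is:

def unlabeled_f1(pred_spans, gold_spans):
    """Unlabeled bracketing F1 between two sets of (start, end) constituent spans."""
    pred, gold = set(pred_spans), set(gold_spans)
    if not pred or not gold:
        return 0.0
    tp = len(pred & gold)                    # spans proposed and also present in the gold tree
    precision, recall = tp / len(pred), tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# e.g. predicted spans {(0, 1), (0, 2)} vs. gold {(0, 1), (1, 2)} gives F1 = 0.5

An induced parser is then judged both by this structural overlap and by the likelihood (perplexity) its generative model assigns to held-out text.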

Implications and Future Directions

This work marks a significant step toward reducing dependence on annotated data for training sophisticated language models. The success of amortized variational inference in unsupervised RNNGs opens avenues beyond language modeling, with potential relevance to natural language understanding and machine translation.

A natural next step is to improve performance on longer sequences and to strengthen the unsupervised generative modeling of syntactic structure. The hybrid strategy of fine-tuning a supervised RNNG with the unsupervised objective also points toward integrated systems that combine labeled and unlabeled data.

Overall, this paper is a substantial step toward reshaping language model training paradigms, demonstrating that unsupervised learning is viable for capturing complex syntactic structure. It sets a precedent for future research on unsupervised learning of hierarchical language models and provides a scaffold for extending RNNGs to a wider range of linguistic applications.
