Guiding Symbolic Natural Language Grammar Induction via Transformer-Based Sequence Probabilities
Ben Goertzel, Andres Suarez Madrigal, Gino Yu

TL;DR
This paper introduces a method that leverages transformer-based sequence probabilities to enhance symbolic grammar induction in natural language processing, enabling more adaptable and knowledge-driven rule learning.
Contribution
It presents a novel technique that uses transformer language models to guide symbolic grammar induction without relying on their internal representations.
Findings
Demonstrated the method with a proof-of-concept example
Successfully guided unsupervised link-grammar induction
Designed to stay adaptable as more powerful language models appear, since it does not depend on model internals
Abstract
A novel approach to automated learning of syntactic rules governing natural languages is proposed, based on using probabilities assigned to sentences (and potentially longer word sequences) by transformer neural network language models to guide symbolic learning processes like clustering and rule induction. This method exploits the learned linguistic knowledge in transformers, without any reference to their inner representations; hence, the technique is readily adaptable to the continuous appearance of more powerful language models. We show a proof-of-concept example of our proposed technique, using it to guide unsupervised symbolic link-grammar induction methods drawn from our prior research.
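The core idea in the abstract, using only the probabilities a language model assigns to whole sentences as a signal for symbolic clustering, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `make_toy_scorer` and `substitution_score` are hypothetical, and a small add-one-smoothed bigram model stands in for the transformer so the example runs without any external model. Any function mapping a sentence to a log-probability could be plugged in instead, which mirrors the point that the method never looks inside the model.

```python
import math
from collections import Counter
from typing import Callable, List


def make_toy_scorer(corpus: List[str]) -> Callable[[str], float]:
    """Build a sentence log-probability function from a bigram model.

    Assumption: this add-one-smoothed bigram model is only a stand-in for
    a transformer language model's sentence score.
    """
    left_counts: Counter = Counter()
    bigram_counts: Counter = Counter()
    vocab = set()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        for left, right in zip(tokens, tokens[1:]):
            left_counts[left] += 1
            bigram_counts[(left, right)] += 1
    vocab_size = len(vocab) + 1  # one extra slot for unseen words

    def log_prob(sentence: str) -> float:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        return sum(
            math.log((bigram_counts[(left, right)] + 1)
                     / (left_counts[left] + vocab_size))
            for left, right in zip(tokens, tokens[1:])
        )

    return log_prob


def substitution_score(word_a: str, word_b: str, sentences: List[str],
                       log_prob: Callable[[str], float]) -> float:
    """Mean change in sentence log-probability when word_a becomes word_b.

    Values near zero suggest the two words are interchangeable, which is
    evidence for placing them in the same grammatical category.
    """
    deltas = []
    for sentence in sentences:
        tokens = sentence.split()
        if word_a not in tokens:
            continue
        swapped = " ".join(word_b if t == word_a else t for t in tokens)
        deltas.append(log_prob(swapped) - log_prob(sentence))
    return sum(deltas) / len(deltas) if deltas else float("-inf")


corpus = [
    "the cat sleeps", "the dog sleeps",
    "the cat eats", "the dog eats",
    "a cat runs", "a dog runs",
]
log_prob = make_toy_scorer(corpus)
same_category = substitution_score("cat", "dog", corpus, log_prob)
different_category = substitution_score("cat", "sleeps", corpus, log_prob)
print(same_category > different_category)
```

In this toy setting, swapping "cat" for "dog" costs less probability than swapping "cat" for "sleeps", so a clustering step driven by these scores would group the two nouns together. Replacing `make_toy_scorer` with a scorer backed by a transformer model would leave `substitution_score` unchanged.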
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Multi-Head Attention · Adam · Dropout · Byte Pair Encoding
