
TL;DR
This paper introduces Deductron, a novel 3-layer recurrent neural network for decoding sequences with long-term dependencies. The network was first built by inspection rather than by training, and can also be trained with simulated annealing and SGD variants.
Contribution
The paper presents a new RNN architecture called Deductron, built by inspection and interpretive methods, capable of decoding long sequences requiring logical inference.
Findings
Deductron can be constructed without traditional training, using interpretive techniques.
It can be trained effectively with simulated annealing and stochastic gradient descent.
The architecture demonstrates the ability to decode sequences with long-term dependencies.
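As a rough illustration of the simulated-annealing finding above, here is a minimal, generic sketch of the technique on a toy objective. It is not the paper's training procedure: the function names, the geometric cooling schedule, the Gaussian proposals, and the `TARGET` weights are all assumptions made for illustration.

```python
import math
import random


def simulated_annealing(loss, init, steps=5000, t_start=1.0, t_end=1e-3,
                        step_size=0.1, seed=0):
    """Minimize `loss` over a list of floats by simulated annealing.

    Geometric cooling and Gaussian proposals are assumptions chosen for
    brevity; they are not taken from the paper.
    """
    rng = random.Random(seed)
    current = list(init)
    current_loss = loss(current)
    best, best_loss = list(current), current_loss
    for k in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        temp = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        # Propose a random perturbation of a single coordinate.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] += rng.gauss(0.0, step_size)
        cand_loss = loss(candidate)
        delta = cand_loss - current_loss
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_loss = candidate, cand_loss
            if current_loss < best_loss:
                best, best_loss = list(current), current_loss
    return best, best_loss


# Toy stand-in for a network loss: squared distance to hypothetical weights.
TARGET = [1.0, -2.0, 0.5]


def toy_loss(w):
    return sum((wi - ti) ** 2 for wi, ti in zip(w, TARGET))
```

Calling `simulated_annealing(toy_loss, [0.0, 0.0, 0.0])` returns the best weight vector found and its loss. Accepting occasional uphill moves at high temperature is what lets the method work on non-differentiable or discrete-valued networks, where gradient methods may not apply directly.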
Abstract
The current paper is a study of Recurrent Neural Networks (RNNs), motivated by the lack of examples simple enough to be thoroughly understood theoretically, yet complex enough to be realistic. We constructed an example of structured data, motivated by problems from image-to-text conversion (OCR), which requires long-term memory to decode. Our data is a simple writing system, encoding characters 'X' and 'O' as their upper halves, which is possible due to the symmetry of the two characters. The characters can be connected, as in some languages using cursive, such as Arabic (abjad). The string 'XOOXXO' may be encoded as ''. It follows that we may need to know an arbitrarily long past to decode a current character, thus requiring long-term memory. Subsequently we constructed an RNN capable of decoding sequences encoded…
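To make the long-term dependency concrete, the sketch below uses hypothetical glyphs, which are an assumption and not necessarily the paper's encoding: the upper half of 'X' is drawn as `\/` and the upper half of 'O' as `/\`. Under this assumption the local stroke pair `\/` can be either an 'X' or the junction between two 'O's, so a decoder has to carry alignment state from the start of the sequence.

```python
# Hypothetical glyphs (an assumption for illustration, not taken from the paper):
# the upper half of 'X' looks like a V, the upper half of 'O' like an arch.
GLYPHS = {"X": "\\/", "O": "/\\"}
INVERSE = {strokes: char for char, strokes in GLYPHS.items()}


def encode(text):
    """Concatenate the upper-half glyphs, as in connected (cursive) writing."""
    return "".join(GLYPHS[c] for c in text)


def decode(strokes):
    """Decode by reading two strokes at a time from the very beginning.

    The alignment (which stroke starts a glyph) is state carried from the
    start of the sequence; it cannot be recovered from a local window.
    """
    if len(strokes) % 2 != 0:
        raise ValueError("stroke sequence has odd length")
    return "".join(INVERSE[strokes[i:i + 2]] for i in range(0, len(strokes), 2))


def window_positions(strokes, window):
    """Return all offsets at which a local window of strokes occurs."""
    return [i for i in range(len(strokes) - len(window) + 1)
            if strokes.startswith(window, i)]
```

For example, `window_positions(encode("OO"), "\\/")` finds the pair `\/` at an odd offset, straddling two 'O's, whereas in `encode("X")` the same pair sits at offset 0 and is a genuine 'X'. In this simplified fixed-width version the carried state is only the alignment; the setting described in the abstract may be richer.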
