Recurrent Deep Differentiable Logic Gate Networks
Simon Bührer, Andreas Plesner, Till Aczel, Roger Wattenhofer

TL;DR
This paper introduces Recurrent Deep Differentiable Logic Gate Networks, which integrate Boolean logic with recurrent neural architectures for sequence modeling. On WMT'14 English-German translation the model reaches 5.00 BLEU, close to a GRU baseline (5.41 BLEU).
Contribution
First implementation of recurrent differentiable logic gate networks, combining logic operations with recurrent structures for sequence-to-sequence tasks.
Findings
Achieves 5.00 BLEU on WMT'14 English-German translation
Approaches GRU performance (5.41 BLEU), with 30.9% accuracy during training
Shows potential for FPGA acceleration and recursive network architectures
Abstract
While differentiable logic gates have shown promise in feedforward networks, their application to sequential modeling remains unexplored. This paper presents the first implementation of Recurrent Deep Differentiable Logic Gate Networks (RDDLGN), combining Boolean operations with recurrent architectures for sequence-to-sequence learning. Evaluated on WMT'14 English-German translation, RDDLGN achieves 5.00 BLEU and 30.9% accuracy during training, approaching GRU performance (5.41 BLEU), and degrades gracefully to 4.39 BLEU during inference. This work establishes recurrent logic-based neural computation as viable, opening research directions for FPGA acceleration in sequential modeling and other recursive network architectures.
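To make "differentiable logic gate" concrete, here is a minimal NumPy sketch of the general idea as it is commonly described for feedforward logic gate networks: each neuron reads two inputs in [0, 1] and outputs a softmax-weighted mixture of real-valued relaxations of the 16 two-input Boolean functions. The names `GATES`, `logic_layer`, and `recurrent_step` are hypothetical, and the recurrent wiring (concatenating the current input with the previous hidden state and feeding it to one logic layer) is an assumption for illustration only, not the encoder-decoder architecture of the paper.

```python
import numpy as np

# Real-valued relaxations of the 16 two-input Boolean functions.
# For a, b in [0, 1] each expression stays in [0, 1] and matches the
# Boolean truth table when a and b are exactly 0 or 1.
GATES = [
    lambda a, b: np.zeros_like(a),         # FALSE
    lambda a, b: a * b,                    # a AND b
    lambda a, b: a - a * b,                # a AND NOT b
    lambda a, b: a,                        # a
    lambda a, b: b - a * b,                # NOT a AND b
    lambda a, b: b,                        # b
    lambda a, b: a + b - 2 * a * b,        # a XOR b
    lambda a, b: a + b - a * b,            # a OR b
    lambda a, b: 1 - (a + b - a * b),      # a NOR b
    lambda a, b: 1 - (a + b - 2 * a * b),  # a XNOR b
    lambda a, b: 1 - b,                    # NOT b
    lambda a, b: 1 - b + a * b,            # a OR NOT b
    lambda a, b: 1 - a,                    # NOT a
    lambda a, b: 1 - a + a * b,            # NOT a OR b
    lambda a, b: 1 - a * b,                # a NAND b
    lambda a, b: np.ones_like(a),          # TRUE
]

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def logic_layer(x, idx_a, idx_b, logits):
    """One layer of relaxed logic gates.

    x:      (n_in,) float array with values in [0, 1]
    idx_a:  (n_out,) index of the first input of each gate
    idx_b:  (n_out,) index of the second input of each gate
    logits: (n_out, 16) learnable scores over the 16 gate types
    """
    a, b = x[idx_a], x[idx_b]
    ops = np.stack([g(a, b) for g in GATES], axis=-1)  # (n_out, 16)
    return (softmax(logits) * ops).sum(axis=-1)        # (n_out,)

def recurrent_step(x_t, h_prev, idx_a, idx_b, logits):
    """Assumed recurrent wiring: gates read from [x_t, h_prev]."""
    z = np.concatenate([x_t, h_prev])
    return logic_layer(z, idx_a, idx_b, logits)

# Usage: unroll a randomly wired cell over a short binary sequence.
rng = np.random.default_rng(0)
n_in, n_h, steps = 4, 8, 5
idx_a = rng.integers(0, n_in + n_h, size=n_h)
idx_b = rng.integers(0, n_in + n_h, size=n_h)
logits = rng.normal(size=(n_h, 16))
h = np.zeros(n_h)
for x_t in rng.integers(0, 2, size=(steps, n_in)).astype(float):
    h = recurrent_step(x_t, h, idx_a, idx_b, logits)
```

Because every relaxed gate maps [0, 1] inputs to [0, 1] outputs and the mixture weights sum to one, the hidden state stays in [0, 1] at every step. In this style of model, each gate is typically discretized after training by keeping only its highest-scoring operation, which is what makes a hardware mapping such as an FPGA plausible.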
Peer Reviews
Decision·Submitted to ICLR 2026
not evaluated
not evaluated
The paper extends DDLGNs to sequence modeling with an encoder–decoder built from logic layers. The memory evaluation is interesting. The shifted-copy task shows RDDLGN maintains high accuracy at larger temporal offsets than RNN/GRU.
Baselines are quite small and non-standard for WMT14; 5-ish BLEU feels like garbage and is not convincing evidence that this is actually a practical setup.
* Interesting novel architecture.
* Good motivation.
* Shifted monolingual prediction result is interesting.
* Some good ablations in the appendix.

* The experiments seem flawed. The paper reports about 5 BLEU on WMT’14 English-German translation. Usual numbers on this task are about 30-35 BLEU. So this is completely broken?
* Sequence lengths are way too short for reasonable realistic experiments.
* Models are way too small.
* Contradiction on vanishing gradients (see comments below).
* Contradiction on long-sequence handling (see comments below).
* Unclear parts.
Taxonomy
Topics: Low-power high-performance VLSI design · VLSI and FPGA Design Techniques · VLSI and Analog Circuit Testing
