TL;DR
This paper introduces Pruning In Time (PIT), an automatic dilation optimizer for Temporal Convolutional Networks that treats dilation selection as weight pruning along the time axis, reducing model size and inference latency with no accuracy drop relative to a network without dilation.
Contribution
It presents a method that learns dilation factors together with the weights in a single training run, yielding a set of Pareto-optimal TCNs in the size-accuracy space from a single starting model.
Findings
Model size reduced by up to 7.4x compared to a network without dilation
Inference latency on a real SoC hardware target decreased by up to 3x
Outperforms hand-designed solutions in size and accuracy
Abstract
Temporal Convolutional Networks (TCNs) are promising Deep Learning models for time-series processing tasks. One key feature of TCNs is time-dilated convolution, whose optimization requires extensive experimentation. We propose an automatic dilation optimizer, which tackles the problem as weight pruning on the time axis, and learns dilation factors together with weights in a single training run. Our method reduces the model size and inference latency on a real SoC hardware target by up to 7.4x and 3x, respectively, with no accuracy drop compared to a network without dilation. It also yields a rich set of Pareto-optimal TCNs starting from a single model, outperforming hand-designed solutions in both size and accuracy.
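To make the "dilation as pruning on the time axis" idea concrete, the sketch below shows that a dense causal 1D convolution whose taps are masked at a regular stride computes the same output as a smaller dilated convolution. This is only an illustration under simplifying assumptions: the mask is fixed by hand rather than learned during training as in the paper, a single channel is used, and the helper names (causal_conv1d, time_axis_mask, prune_to_dilated) are hypothetical and not taken from the authors' code.

```python
def causal_conv1d(x, w, dilation=1):
    """Causal 1D convolution: y[t] = sum_k w[k] * x[t - k*dilation].

    Samples before the start of the sequence are treated as zero.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, wk in enumerate(w):
            idx = t - k * dilation
            if idx >= 0:
                acc += wk * x[idx]
        y.append(acc)
    return y


def time_axis_mask(kernel_size, dilation):
    """Binary mask that keeps only taps whose time lag is a multiple of `dilation`."""
    return [1 if k % dilation == 0 else 0 for k in range(kernel_size)]


def prune_to_dilated(w, dilation):
    """Drop the masked-out taps, leaving the compact kernel of a dilated convolution."""
    return [wk for k, wk in enumerate(w) if k % dilation == 0]


# Arbitrary illustrative values (not from the paper).
x = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 4.0, 1.5, 0.0, -0.5]
w_dense = [0.5, -1.0, 0.25, 2.0, -0.75, 1.5, 0.1, -0.3, 0.9]  # 9 taps, dilation 1
d = 4  # hand-picked dilation factor; the paper learns this instead

mask = time_axis_mask(len(w_dense), d)
w_masked = [wk * m for wk, m in zip(w_dense, mask)]  # pruned, but still stored densely
w_compact = prune_to_dilated(w_dense, d)             # only the surviving taps

y_masked = causal_conv1d(x, w_masked, dilation=1)
y_dilated = causal_conv1d(x, w_compact, dilation=d)

# Same receptive field and same outputs, with fewer stored weights.
assert all(abs(a - b) < 1e-12 for a, b in zip(y_masked, y_dilated))
print(f"dense taps: {len(w_dense)}, dilated taps: {len(w_compact)}")
```

The receptive field is unchanged because the compact kernel spans (len(w_compact) - 1) * d + 1 time steps, the same as the dense kernel, while only the surviving taps need to be stored and multiplied. That is the source of the size and latency savings described above.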
Taxonomy
Methods: Pruning
