Speech-text based multi-modal training with bidirectional attention for improved speech recognition