Layer-Parallel Training for Transformers | Tomesphere