Elixir: Train a Large Language Model on a Small GPU Cluster