Loading paper
ResiHP: Taming LLM Training Failures with Dynamic Hybrid Parallelism | Tomesphere