Designing Large Foundation Models for Efficient Training and Inference: A Survey

Dong Liu; Yanxuan Yu; Yite Wang; Jing Wu; Zhongwei Wan; Sina Alinejad; Benjamin Lengerich; Ying Nian Wu

arXiv:2409.01990 · cs.DC · April 15, 2025 · 5 cites

TL;DR

This survey reviews modern techniques for efficient training and inference of large foundation models, emphasizing model and system design to reduce computational costs and improve accessibility.

Contribution

The survey provides a comprehensive overview of current model and system design methods for efficient foundation models, accompanied by a curated paper-list repository.

Findings

01. Highlights key techniques for efficient LLM training and inference
02. Identifies challenges and future directions in model and system optimization
03. Provides a resource repository for further research

Abstract

This paper surveys modern techniques for efficient training and inference of foundation models from two perspectives: model design and system design. The two perspectives optimize different aspects of LLM training and inference to save computational resources, making LLMs more efficient, affordable, and accessible. The paper list repository is available at https://github.com/NoakLiu/Efficient-Foundation-Models-Survey.

Code & Models

Repositories

noakliu/efficient-foundation-models-survey
none · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Focus