HuggingFace's Transformers: State-of-the-art Natural Language Processing
Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao

TL;DR
HuggingFace's Transformers library provides a comprehensive, easy-to-use collection of state-of-the-art Transformer models for natural language processing, facilitating research, development, and deployment across the community.
Contribution
It introduces an open-source, unified API for Transformer architectures with a curated collection of pretrained models, enhancing accessibility and extensibility.
Findings
Provides a versatile library for NLP tasks
Includes a wide range of pretrained models
Enables efficient deployment in industry
Abstract
Recent progress in natural language processing has been driven by advances in both model architecture and model pretraining. Transformer architectures have facilitated building higher-capacity models, and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks. Transformers is an open-source library with the goal of opening up these advances to the wider machine learning community. The library consists of carefully engineered state-of-the-art Transformer architectures under a unified API. Backing this library is a curated collection of pretrained models made by and available for the community. Transformers is designed to be extensible by researchers, simple for practitioners, and fast and robust in industrial deployments. The library is available at https://github.com/huggingface/transformers.
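To make the phrase "unified API" concrete, here is a minimal, self-contained sketch of the general pattern: one loading function that reads a checkpoint's configuration and dispatches to the matching architecture class. This is not the library's code. Every name in it (`ToyConfig`, `ToyBert`, `ToyGPT`, `load_model`, and the `example/...` checkpoint names) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a saved model configuration."""
    model_type: str
    hidden_size: int


class ToyBert:
    """Hypothetical encoder-style architecture."""

    def __init__(self, config: ToyConfig):
        self.config = config

    def describe(self) -> str:
        return f"encoder-only model, hidden size {self.config.hidden_size}"


class ToyGPT:
    """Hypothetical decoder-style architecture."""

    def __init__(self, config: ToyConfig):
        self.config = config

    def describe(self) -> str:
        return f"decoder-only model, hidden size {self.config.hidden_size}"


# Registry mapping an architecture name to the class that implements it.
_REGISTRY = {"toy-bert": ToyBert, "toy-gpt": ToyGPT}

# Hypothetical in-memory "hub": checkpoint name -> stored configuration.
_CHECKPOINTS = {
    "example/bert-small": ToyConfig("toy-bert", 256),
    "example/gpt-small": ToyConfig("toy-gpt", 512),
}


def load_model(checkpoint: str):
    """Single entry point: look up the config, then dispatch on model_type."""
    config = _CHECKPOINTS[checkpoint]
    return _REGISTRY[config.model_type](config)


# The calling code is identical regardless of the underlying architecture.
for name in _CHECKPOINTS:
    model = load_model(name)
    print(name, "->", type(model).__name__, "|", model.describe())
```

The point of the pattern is that user code depends only on the checkpoint name, so swapping one pretrained model for another does not require changing the surrounding code.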
Code & Models
- 🤗 PJMixers-Images/Florence-2-base-Castollux-v0.5 · model · 505 dl · ♡ 5
- 🤗 qanastek/FrenchMedMCQA-BioBERT-V1.1-Wikipedia-BM25 · model · 3 dl · ♡ 1
- 🤗 qanastek/FrenchMedMCQA-BART-base-Wikipedia-BM25 · model · 3 dl · ♡ 1
- 🤗 Ian332/Helper_Bob · model · ♡ 3
- 🤗 akswelh/NEOX · model
- 🤗 PJMixers-Images/Florence-2-base-Castollux-v0.1 · model · 2 dl
- 🤗 PJMixers-Images/Florence-2-base-Castollux-v0.2 · model · 7 dl · ♡ 2
- 🤗 PJMixers-Images/Florence-2-base-Castollux-v0.4 · model · 6 dl · ♡ 1
- 🤗 PJMixers-Dev/Gemma-3-Earthen-Completion-v0.1-4B-QLoRA · model · 1 dl
- 🤗 PJMixers-Dev/Gemma-3-Earthen-Completion-v0.1-4B · model · 4 dl · ♡ 1
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Materials Science
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
