DeepSeek-OCR: Contexts Optical Compression
Haoran Wei, Yaofeng Sun, Yukun Li

TL;DR
DeepSeek-OCR compresses long text contexts by rendering them as images and decoding them from a small number of vision tokens. It reports about 97% OCR precision at compression ratios below 10x and demonstrates practical value for large-scale document processing.
Contribution
The paper presents DeepSeek-OCR, an optical compression framework that pairs DeepEncoder with a DeepSeek3B-MoE-A570M decoder and keeps OCR accuracy high while using far fewer vision tokens than text tokens, advancing long-context processing in vision-language models.
Findings
Achieves 97% OCR precision when the compression ratio (text tokens divided by vision tokens) is below 10x.
Retains about 60% OCR accuracy at a 20x compression ratio.
Outperforms existing OCR methods on benchmark datasets.
Abstract
We present DeepSeek-OCR as an initial investigation into the feasibility of compressing long contexts via optical 2D mapping. DeepSeek-OCR consists of two components: DeepEncoder and DeepSeek3B-MoE-A570M as the decoder. Specifically, DeepEncoder serves as the core engine, designed to maintain low activations under high-resolution input while achieving high compression ratios to ensure an optimal and manageable number of vision tokens. Experiments show that when the number of text tokens is within 10 times that of vision tokens (i.e., a compression ratio < 10x), the model can achieve decoding (OCR) precision of 97%. Even at a compression ratio of 20x, the OCR accuracy still remains at about 60%. This shows considerable promise for research areas such as historical long-context compression and memory forgetting mechanisms in LLMs. Beyond this, DeepSeek-OCR also demonstrates high practical…
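The compression ratio used in the abstract is the number of text tokens divided by the number of vision tokens. The following is a minimal sketch of that arithmetic, not code from the paper: the function names `compression_ratio` and `reported_regime` are hypothetical, and the regime labels only restate the two data points given in the abstract (about 97% precision below 10x, about 60% accuracy at 20x).

```python
def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """Return text tokens per vision token (the abstract's compression ratio)."""
    if text_tokens < 0:
        raise ValueError("text_tokens must be non-negative")
    if vision_tokens <= 0:
        raise ValueError("vision_tokens must be positive")
    return text_tokens / vision_tokens


def reported_regime(ratio: float) -> str:
    """Map a ratio to the regimes mentioned in the abstract (illustrative labels only)."""
    if ratio < 10:
        return "below 10x: abstract reports about 97% OCR precision"
    if ratio <= 20:
        return "10x to 20x: between the reported points; about 60% accuracy at 20x"
    return "above 20x: not covered by the abstract"


# Hypothetical page: 800 text tokens represented by 100 vision tokens.
ratio = compression_ratio(800, 100)
print(ratio, "->", reported_regime(ratio))
```

The token counts in the usage line are made up for illustration; in practice they would come from the text tokenizer and from the encoder's output length.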
Code & Models
- 🤗 deepseek-ai/DeepSeek-OCR-2 (model) · 1.3M dl · ♡ 889
- 🤗 deepseek-ai/DeepSeek-OCR (model) · 2.5M dl · ♡ 3199
- 🤗 kp-forks/DeepSeek-OCR (model) · 8 dl
- 🤗 Jalea96/DeepSeek-OCR-bnb-4bit-NF4 (model) · 1.7k dl · ♡ 16
- 🤗 unsloth/DeepSeek-OCR (model) · 2.1k dl · ♡ 39
- 🤗 ZoneTwelve/DeepSeek-OCR-AnyDevice (model) · 8 dl
- 🤗 applegrew/deepseek-ocr-macos (model) · 12 dl
- 🤗 ginipick/DeepSeek-OCR (model) · 8 dl
- 🤗 seawolf2357/DeepSeek-OCR (model) · 18 dl
- 🤗 fantaxy/DeepSeek-OCR (model) · 6 dl
Taxonomy
Topics: Algorithms and Data Compression · Handwritten Text Recognition Techniques · Advanced Data Storage Technologies
