# Combinatorial Entropy Encoding

**Authors:** Abu Bakar Siddique

arXiv: 1703.08127 · 2017-03-24

## TL;DR

This paper introduces a novel lossless data compression method that encodes messages by their lexicographic index among permutations, matching Shannon entropy without needing prior source models.

## Contribution

The paper presents an entropy encoding technique based on lexicographic permutation indices. Unlike Huffman or arithmetic coding, it requires no prior entropy model of the source.

## Key findings

- Achieves compression matching the Shannon entropy of the message
- Does not require a prior entropy model of the source
- Uses a simple algorithm based on multinomial coefficients
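To make the entropy claim concrete, the sketch below counts the distinct permutations of a message's symbols with a multinomial coefficient and compares the bits needed to index them with the message's empirical Shannon entropy. The function names (`multinomial`, `index_bits`, `empirical_entropy_bits`) are hypothetical and not taken from the paper. The sketch also assumes the decoder already knows the symbol counts, so it ignores the cost of transmitting them.

```python
from collections import Counter
from math import factorial, log2


def multinomial(counts):
    """Number of distinct arrangements of a multiset with the given symbol counts."""
    counts = list(counts)
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)  # exact at every step
    return total


def index_bits(message):
    """Bits needed to store any index in the range 0 .. (number of permutations - 1)."""
    m = multinomial(Counter(message).values())
    return (m - 1).bit_length()


def empirical_entropy_bits(message):
    """n * H, where H is the Shannon entropy of the message's own symbol frequencies."""
    n = len(message)
    return -sum(c * log2(c / n) for c in Counter(message).values())


if __name__ == "__main__":
    msg = "abracadabra"
    print(multinomial(Counter(msg).values()))  # number of distinct permutations
    print(index_bits(msg), empirical_entropy_bits(msg))
```

Because the multinomial coefficient never exceeds 2^(nH), the index always fits in at most nH + 1 bits, where H is the empirical entropy of the message.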

## Abstract

This paper proposes a novel entropy encoding technique for lossless data compression. Representing a message string by its lexicographic index among the permutations of its symbols yields a compressed version that matches the Shannon entropy of the message. Commercial data compression standards use Huffman or arithmetic coding at some stage of the compression process. Like arithmetic coding, the proposed method maps the entire string to an integer, but it is not based on fractional numbers. Unlike both arithmetic and Huffman coding, it requires no prior entropy model of the source. A simple, intuitive algorithm based on multinomial coefficients is developed for entropy encoding; it adaptively uses fewer bits for more frequent symbols. The correctness of the algorithm is demonstrated by an example.
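The idea in the abstract can be illustrated with a standard multiset-permutation ranking routine. This is a minimal sketch, not necessarily the paper's exact procedure, and the names `rank`, `unrank`, and `multinomial` are hypothetical. It assumes symbols are ordered by Python's default sort and that the decoder receives the symbol counts alongside the index.

```python
from collections import Counter
from math import factorial


def multinomial(counts):
    """Number of distinct arrangements of a multiset with the given symbol counts."""
    counts = list(counts)
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)
    return total


def rank(message):
    """Lexicographic index of `message` among all permutations of its symbols."""
    counts = Counter(message)
    symbols = sorted(counts)
    index = 0
    for ch in message:
        for s in symbols:
            if s == ch:
                break
            if counts[s] > 0:
                # Count every permutation that places a smaller symbol here.
                counts[s] -= 1
                index += multinomial(counts.values())
                counts[s] += 1
        counts[ch] -= 1  # consume the symbol actually present
    return index


def unrank(index, counts):
    """Inverse of `rank`: rebuild the message from its index and symbol counts."""
    counts = dict(counts)
    symbols = sorted(counts)
    out = []
    for _ in range(sum(counts.values())):
        for s in symbols:
            if counts[s] == 0:
                continue
            counts[s] -= 1
            block = multinomial(counts.values())  # permutations starting with s
            if index < block:
                out.append(s)
                break
            index -= block
            counts[s] += 1
    return "".join(out)


if __name__ == "__main__":
    msg = "abracadabra"
    i = rank(msg)
    print(i, unrank(i, Counter(msg)) == msg)
```

In this sketch a frequent symbol contributes small multinomial terms, so messages dominated by a few symbols map to a narrow index range. That is one way to read the abstract's remark about using fewer bits for more frequent symbols.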

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.08127/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1703.08127/full.md

## References

4 references — full list in the complete paper: https://tomesphere.com/paper/1703.08127/full.md

---
Source: https://tomesphere.com/paper/1703.08127