On the Randomness of Compressed Data

Shmuel T. Klein, Dana Shapira

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review



It seems reasonable to expect from a good compression method that its output should not be further compressible, because it should behave essentially like random data. We investigate this premise for a variety of known compression techniques, and find that, surprisingly, there is much variability in the randomness, depending on the chosen method. Arithmetic coding seems to produce perfectly random output, whereas that of Huffman or Ziv-Lempel coding still contains many dependencies. In particular, the output of Huffman coding has already been proven to be random under certain conditions, and we show here that arithmetic coding may produce an output that is identical to that of Huffman.
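One simple way to probe the premise of the abstract is to measure how far the byte distribution of a compressed file lies from the uniform distribution. The sketch below is an illustration only, not the authors' method: the abstract does not say which randomness measure the paper uses, and the choice of Kullback-Leibler divergence here is an assumption suggested by the keyword list. The function name `kl_from_uniform` and the synthetic input are hypothetical. Python's `zlib` (DEFLATE, a combination of Ziv-Lempel and Huffman coding) stands in for the compressors discussed.

```python
import math
import random
import zlib
from collections import Counter


def kl_from_uniform(data: bytes) -> float:
    """KL divergence, in bits, of the empirical byte distribution of `data`
    from the uniform distribution over 256 byte values.
    0.0 means perfectly uniform bytes; 8.0 means a single repeated byte."""
    if not data:
        raise ValueError("data must be non-empty")
    n = len(data)
    counts = Counter(data)
    # D(P || U) = sum_b p(b) * log2(p(b) / (1/256)); absent bytes contribute 0.
    return sum((c / n) * math.log2((c / n) * 256) for c in counts.values())


# Synthetic input with a skewed symbol distribution (an assumption made for
# the demo; any compressible file could be used instead).
rng = random.Random(0)
sample = bytes(rng.choices(b"abcdefgh", weights=[40, 20, 15, 10, 6, 4, 3, 2], k=100_000))
compressed = zlib.compress(sample, 9)

print(f"input:      {kl_from_uniform(sample):.4f} bits")
print(f"compressed: {kl_from_uniform(compressed):.4f} bits")
```

This is a first-order measure only: it looks at single-byte frequencies and cannot detect dependencies between successive bits or bytes, which is the kind of residual structure the abstract attributes to Huffman and Ziv-Lempel output. A value near zero is therefore necessary, but not sufficient, for the output to look random.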

Original language: English
Title of host publication: Proceedings - DCC 2019
Subtitle of host publication: 2019 Data Compression Conference
Editors: James A. Storer, Joan Serra-Sagrista, Ali Bilgin, Michael W. Marcellin
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 1
ISBN (Electronic): 9781728106571
State: Published - 10 May 2019
Event: 2019 Data Compression Conference, DCC 2019 - Snowbird, United States
Duration: 26 Mar 2019 to 29 Mar 2019

Publication series

Name: Data Compression Conference Proceedings
ISSN (Print): 1068-0314


Conference: 2019 Data Compression Conference, DCC 2019
Country/Territory: United States


  • Kullback Leibler
  • Lossless Compression
  • Randomness

