TY - JOUR
T1 - Dynamic determination of variable sizes of chunks in a deduplication system
AU - Hirsch, Michael
AU - Klein, Shmuel T.
AU - Shapira, Dana
AU - Toaff, Yair
N1 - Publisher Copyright: © 2018 Elsevier B.V.
PY - 2020/3/15
Y1 - 2020/3/15
N2 - Deduplication is a special case of data compression in which repeated chunks of data are stored only once. The input data is cut into chunks and a cryptographically strong hash value of each (different) chunk is stored. To restrict the influence of small inserts and deletes to local perturbations, the chunk boundaries are usually defined in a data dependent way, which implies that the chunks are of variable length. Usually, the chunk sizes may spread over a large range, which might have a negative impact on the storage performance. This can be dealt with by imposing artificial lower and upper bounds. This paper proposes an alternative by which the chunk size distribution is controlled in a natural way. Some analytical and experimental results are given.
KW - Chunk size
KW - Compression
KW - Deduplication
UR - http://www.scopus.com/inward/record.url?scp=85051494812&partnerID=8YFLogxK
U2 - 10.1016/j.dam.2018.07.015
DO - 10.1016/j.dam.2018.07.015
M3 - Article
AN - SCOPUS:85051494812
SN - 0166-218X
VL - 274
SP - 81
EP - 91
JO - Discrete Applied Mathematics
JF - Discrete Applied Mathematics
ER -