TY - GEN
T1 - Prediction by compression
AU - Ratsaby, Joel
PY - 2011
Y1 - 2011
N2 - It is well known that text compression can be achieved by predicting the next symbol in a stream of text data based on the history seen up to the current symbol. The better the prediction, the more skewed the conditional probability distribution of the next symbol and the shorter the codeword that needs to be assigned to represent it. What about the opposite direction? Suppose we have a black box that can compress a text stream. Can it be used to predict the next symbol in the stream? We introduce a novel criterion based on the length of the compressed data and use it to predict the next symbol. We examine empirically the prediction error rate and its dependence on some compression parameters.
AB - It is well known that text compression can be achieved by predicting the next symbol in a stream of text data based on the history seen up to the current symbol. The better the prediction, the more skewed the conditional probability distribution of the next symbol and the shorter the codeword that needs to be assigned to represent it. What about the opposite direction? Suppose we have a black box that can compress a text stream. Can it be used to predict the next symbol in the stream? We introduce a novel criterion based on the length of the compressed data and use it to predict the next symbol. We examine empirically the prediction error rate and its dependence on some compression parameters.
KW - Data compression
KW - Statistical prediction
KW - Universal sequence prediction
UR - http://www.scopus.com/inward/record.url?scp=79958106148&partnerID=8YFLogxK
U2 - 10.2316/P.2011.721-010
DO - 10.2316/P.2011.721-010
M3 - Conference contribution
AN - SCOPUS:79958106148
SN - 9780889868656
T3 - Proceedings of the 8th IASTED International Conference on Signal Processing, Pattern Recognition, and Applications, SPPRA 2011
SP - 282
EP - 288
BT - Proceedings of the 8th IASTED International Conference on Signal Processing, Pattern Recognition, and Applications, SPPRA 2011
T2 - 8th IASTED International Conference on Signal Processing, Pattern Recognition, and Applications, SPPRA 2011
Y2 - 16 February 2011 through 18 February 2011
ER -