TY - JOUR

T1 - An empirical study of the complexity and randomness of prediction error sequences

AU - Ratsaby, Joel

PY - 2011/7

Y1 - 2011/7

N2 - We investigate a population of binary mistake sequences that result from learning with parametric models of different order. We obtain estimates of their error, algorithmic complexity and divergence from a purely random Bernoulli sequence. We study the relationship of these variables to the learner's information density parameter, which is defined as the ratio between the lengths of the compressed and uncompressed files that contain the learner's decision rule. The results indicate that good learners have a low information density ρ while bad learners have a high ρ. Bad learners generate mistake sequences that are atypically complex or diverge stochastically from a purely random Bernoulli sequence. Good learners generate typically complex sequences with low divergence from Bernoulli sequences, and these include mistake sequences generated by the Bayes optimal predictor. Based on the static algorithmic interference model of [18], the learner here acts as a static structure which "scatters" the bits of an input sequence (to be predicted) in proportion to its information density ρ, thereby deforming its randomness characteristics.

AB - We investigate a population of binary mistake sequences that result from learning with parametric models of different order. We obtain estimates of their error, algorithmic complexity and divergence from a purely random Bernoulli sequence. We study the relationship of these variables to the learner's information density parameter, which is defined as the ratio between the lengths of the compressed and uncompressed files that contain the learner's decision rule. The results indicate that good learners have a low information density ρ while bad learners have a high ρ. Bad learners generate mistake sequences that are atypically complex or diverge stochastically from a purely random Bernoulli sequence. Good learners generate typically complex sequences with low divergence from Bernoulli sequences, and these include mistake sequences generated by the Bayes optimal predictor. Based on the static algorithmic interference model of [18], the learner here acts as a static structure which "scatters" the bits of an input sequence (to be predicted) in proportion to its information density ρ, thereby deforming its randomness characteristics.

KW - Algorithmic complexity

KW - Binary sequences

KW - Chaotic scattering

KW - Description complexity

KW - Information theory

KW - Prediction

KW - Statistical learning

UR - http://www.scopus.com/inward/record.url?scp=79951578554&partnerID=8YFLogxK

U2 - 10.1016/j.cnsns.2010.10.015

DO - 10.1016/j.cnsns.2010.10.015

M3 - Article

AN - SCOPUS:79951578554

SN - 1007-5704

VL - 16

SP - 2832

EP - 2844

JO - Communications in Nonlinear Science and Numerical Simulation

JF - Communications in Nonlinear Science and Numerical Simulation

IS - 7

ER -