TY - JOUR

T1 - An empirical study of the complexity and randomness of prediction error sequences

AU - Ratsaby, Joel

PY - 2011/7

Y1 - 2011/7

N2 - We investigate a population of binary mistake sequences that result from learning with parametric models of different order. We obtain estimates of their error, algorithmic complexity and divergence from a purely random Bernoulli sequence. We study the relationship of these variables to the learner's information density parameter, which is defined as the ratio between the lengths of the compressed and uncompressed files that contain the learner's decision rule. The results indicate that good learners have a low information density ρ while bad learners have a high ρ. Bad learners generate mistake sequences that are atypically complex or diverge stochastically from a purely random Bernoulli sequence. Good learners generate typically complex sequences with low divergence from Bernoulli sequences, and these include mistake sequences generated by the Bayes optimal predictor. Based on the static algorithmic interference model of [18], the learner here acts as a static structure which "scatters" the bits of an input sequence (to be predicted) in proportion to its information density ρ, thereby deforming its randomness characteristics.

AB - We investigate a population of binary mistake sequences that result from learning with parametric models of different order. We obtain estimates of their error, algorithmic complexity and divergence from a purely random Bernoulli sequence. We study the relationship of these variables to the learner's information density parameter, which is defined as the ratio between the lengths of the compressed and uncompressed files that contain the learner's decision rule. The results indicate that good learners have a low information density ρ while bad learners have a high ρ. Bad learners generate mistake sequences that are atypically complex or diverge stochastically from a purely random Bernoulli sequence. Good learners generate typically complex sequences with low divergence from Bernoulli sequences, and these include mistake sequences generated by the Bayes optimal predictor. Based on the static algorithmic interference model of [18], the learner here acts as a static structure which "scatters" the bits of an input sequence (to be predicted) in proportion to its information density ρ, thereby deforming its randomness characteristics.

KW - Algorithmic complexity

KW - Binary sequences

KW - Chaotic scattering

KW - Description complexity

KW - Information theory

KW - Prediction

KW - Statistical learning

UR - http://www.scopus.com/inward/record.url?scp=79951578554&partnerID=8YFLogxK

U2 - 10.1016/j.cnsns.2010.10.015

DO - 10.1016/j.cnsns.2010.10.015

M3 - Article

AN - SCOPUS:79951578554

SN - 1007-5704

VL - 16

SP - 2832

EP - 2844

JO - Communications in Nonlinear Science and Numerical Simulation

JF - Communications in Nonlinear Science and Numerical Simulation

IS - 7

ER -