TY - JOUR

T1 - Bounding the bias of tree-like sampling in ip topologies

AU - Cohen, Reuven

AU - Gonen, Mira

AU - Wool, Avishai

PY - 2008

Y1 - 2008

N2 - It is widely believed that the Internet's AS-graph degree distribution obeys a power-law form. However, it was recently argued that since Internet data is collected in a tree-like fashion, it only produces a sample of the degree distribution, and this sample may be biased. This argument was backed by simulation data and mathematical analysis, which demonstrated that under certain conditions a tree sampling procedure can produce an artificial power-law in the degree distribution. Thus, although the observed degree distribution of the AS-graph follows a power-law, this phenomenon may be an artifact of the sampling process. In this work we provide some evidence to the contrary. We show, by analysis and simulation, that when the underlying graph degree distribution obeys a power-law with an exponent γ > 2, a tree-like sampling process produces a negligible bias in the sampled degree distribution. Furthermore, recent data collected from the DIMES project, which is not based on single source sampling, indicates that the Internet indeed obeys a power-law degree distribution with an exponent γ > 2. Combining this empirical data with our simulation of traceroute experiments on DIMES-measured AS-graph as the underlying graph, and with our analysis, we conclude that the bias in the degree distribution calculated from BGP data is negligible.

AB - It is widely believed that the Internet's AS-graph degree distribution obeys a power-law form. However, it was recently argued that since Internet data is collected in a tree-like fashion, it only produces a sample of the degree distribution, and this sample may be biased. This argument was backed by simulation data and mathematical analysis, which demonstrated that under certain conditions a tree sampling procedure can produce an artificial power-law in the degree distribution. Thus, although the observed degree distribution of the AS-graph follows a power-law, this phenomenon may be an artifact of the sampling process. In this work we provide some evidence to the contrary. We show, by analysis and simulation, that when the underlying graph degree distribution obeys a power-law with an exponent γ > 2, a tree-like sampling process produces a negligible bias in the sampled degree distribution. Furthermore, recent data collected from the DIMES project, which is not based on single source sampling, indicates that the Internet indeed obeys a power-law degree distribution with an exponent γ > 2. Combining this empirical data with our simulation of traceroute experiments on DIMES-measured AS-graph as the underlying graph, and with our analysis, we conclude that the bias in the degree distribution calculated from BGP data is negligible.

KW - Internet topology models

KW - Network computing

UR - http://www.scopus.com/inward/record.url?scp=56649084859&partnerID=8YFLogxK

U2 - 10.3934/nhm.2008.3.323

DO - 10.3934/nhm.2008.3.323

M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???

AN - SCOPUS:56649084859

SN - 1556-1801

VL - 3

SP - 323

EP - 332

JO - Networks and Heterogeneous Media

JF - Networks and Heterogeneous Media

IS - 2

ER -