تخطي إلى التنقل الرئيسي تخطي إلى البحث تخطي إلى المحتوى الرئيسي

LOCALLY CONSISTENT PARSING FOR TEXT INDEXING IN SMALL SPACE

نتاج البحث: نشر في مجلةمقالةمراجعة النظراء

ملخص

We consider two closely related problems of text indexing in a sublinear working space. The first problem is the sparse suffix tree construction, where a text S is given in read-only memory, along with a set of suffixes B, and the goal is to construct the compressed trie of all these suffixes ordered lexicographically, using only \scrO(|B|) words of space. The second problem is the longest common extension problem, where again a text S of length n is given in read-only memory with some trade-off parameter 1 \leq \tau \leq n, and the goal is to construct a data structure that uses \scrO(n\tau ) words of space and can compute for any pair of suffixes their longest common prefix length as fast as possible as a function of \tau (O(\tau ) time for a randomized Las Vegas data structure or O(\tau \sqrt{}log\ast n) time for a deterministic data structure). We show how to use ideas based on the locally consistent parsing technique, that were introduced by Sahinalp and Vishkin [Proceedings of the 26th Annual ACM Symposium on Theory of Computing, 1994, pp. 300-309], in some nontrivial ways in order to improve the known results for the above problems under the space constraints. We introduce the first almost-linear, \scrO(n \cdot polylog n), deterministic construction for both problems, where all previous algorithms take at least \Omega(min\{n|B|, |nB2| \}) time. We also introduce the first linear-time Las Vegas algorithms for both problems, achieving \scrO(n) construction time with high probability. This is an improvement over the last result of Gawrychowski and Kociumaka [Proceedings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms, 2017, pp. 425-439], which obtained \scrO(n) time for the Monte Carlo algorithm and \scrO(n\sqrt{}log |B|) time with high probability for the Las Vegas algorithm.

اللغة الأصليةالإنجليزيّة
الصفحات (من إلى)916-963
عدد الصفحات48
دوريةSIAM Journal on Computing
مستوى الصوت54
رقم الإصدار4
المعرِّفات الرقمية للأشياء
حالة النشرنُشِر - 2025
منشور خارجيًانعم

بصمة

أدرس بدقة موضوعات البحث “LOCALLY CONSISTENT PARSING FOR TEXT INDEXING IN SMALL SPACE'. فهما يشكلان معًا بصمة فريدة.

قم بذكر هذا