Accelerating the LZ-complexity algorithm

Joel Ratsaby, Alexander Timashkov

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Scopus citation

Abstract

The Lempel-Ziv complexity of a string has recently been used in pattern recognition and classification as part of a string distance function. Its main advantage is that it can measure dissimilarity between a pair of strings of different lengths. This is very useful for machine learning on unstructured data since such data is not restricted to a fixed input dimensionality. The standard computation of LZ-complexity is inherently serial and is not suitable for processing large unstructured data. Hence, we propose a parallel algorithm that computes the LZ-complexity of strings whose length is limited only by the amount of memory, typically in the tens of gigabytes. The algorithm is implemented in CUDA on a GPU. Its speed-up factor is approximately n^(2/3) for strings of length n, for at least up to n = 2Mb. For instance, on 2Mb strings, the speed-up is 150. We compare the execution times of kernel variants with shared and global memory. The more efficient variant obtains approximately 90% GPU utilization.

Original language: English
Title of host publication: Proceedings - 2023 IEEE 29th International Conference on Parallel and Distributed Systems, ICPADS 2023
Publisher: IEEE Computer Society
Pages: 200-207
Number of pages: 8
ISBN (Electronic): 9798350330717
DOIs
State: Published - 2023
Event: 29th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2023 - Ocean Flower Island, Hainan, China
Duration: 17 Dec 2023 to 21 Dec 2023

Publication series

Name: Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
ISSN (Print): 1521-9097

Conference

Conference: 29th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2023
Country/Territory: China
City: Ocean Flower Island, Hainan
Period: 17/12/23 to 21/12/23

Keywords

  • CUDA
  • GPU
  • LZ-complexity
  • UID distance
  • string distance
