Joshua Sobel presents "AWLCO All-Window Length Co-Occurrence"
Abstract: Analyzing patterns in a sequence of events has applications in text analysis, computer programming, and genomics research. In this paper, we consider the all-window-length analysis model which analyzes a sequence of events with respect to windows of all lengths. We study the exact co-occurrence counting problem for the all-window-length analysis model. Our first algorithm is an offline algorithm that counts all-window-length co-occurrences by performing multiple passes over a sequence and computing single-window-length co-occurrences. This algorithm has the time complexity \(O(n)\) for each window length and thus a total complexity of \(O(n^2)\) and the space complexity \(O(\lvert I \rvert)\) for a sequence of size n and an itemset of size \(\lvert I \rvert\). We propose AWLCO, an online algorithm that computes all-window-length co-occurrences in a single pass with the time complexity of \(O(n)\) and space complexity of \(O(\sqrt{n\lvert I\rvert})\), assuming perfect hashing. Following this, we generalize our use case to patterns in which we propose an algorithm that computes all-window-length co-occurrence with time complexity \(O(n\lvert I \rvert)\), assuming perfect hashing, with an additional pre-processing step and space complexity \(O(\sqrt{n \lvert I \rvert}+ \lvert I \rvert)\), plus the overhead of the Aho-Corasick algorithm [Aho and Corasick, 1975].
Enjoy Reading This Article?
Here are some more articles you might like to read next: