The Distribution and Deposition Algorithm for Multiple Sequences Sets
Kang Ning, Hon Wai Leong

TL;DR
This paper introduces the Distribution and Deposition Algorithms (DDA and DDA*) for efficiently processing multiple sequence sets, improving cost and performance over existing methods in large-scale text processing applications.
Contribution
The paper proposes two novel algorithms, DDA and DDA*, for processing multiple sequence sets, with analysis and experimental validation showing superior performance.
Findings
DDA and DDA* algorithms produce lower costs than other methods.
DDA* outperforms DDA in most instances.
Both algorithms are efficient in time and space.
Abstract
A sequences set is a mathematical model used in many applications. As the number of sequences grows, a single sequences set model is no longer appropriate for the rapidly increasing problem sizes. For example, more and more text processing applications split a single large text file into multiple files before processing. For these applications, the underlying mathematical model is multiple sequences sets (MSS). Although MSS is increasingly used, there is little research on how to process MSS efficiently. To process multiple sequences sets, sequences are first distributed to different sets, and then the sequences in each set are processed. Deriving an effective algorithm for MSS processing is both interesting and challenging. In this paper, we define cost functions and a performance ratio for analyzing the quality of synthesis sequences. Based on these, the problem of Process…
Taxonomy
Topics: Algorithms and Data Compression · Distributed and Parallel Computing Systems · Web Data Mining and Analysis
