Counting Subwords in Circular Words and Their Parikh Matrices
Ghajendran Poovanandran, Jamie Simpson, and Wen Chean Teh

TL;DR
This paper explores methods for counting subwords in circular words, extends Parikh matrices to this context, and investigates ambiguity in word identification, with implications for applications like splicing systems.
Contribution
It introduces two new methods for counting subwords in circular words and extends Parikh matrices to this setting, addressing ambiguity in word inference.
Findings
Developed two counting methods for subwords in circular words
Extended Parikh matrices to circular words
Identified rewriting rules generating ambiguous circular words
Abstract
The word inference problem is to determine languages such that the information on the number of occurrences of those subwords in the language can uniquely identify a word. A considerable amount of work has been done on this problem, but the same cannot be said for circular words despite growing interest in the latter due to their applications, for example in splicing systems. Meanwhile, Parikh matrices are useful and well-established tools in the study of subword occurrences. In this work, we propose two ways of counting subword occurrences in circular words. We then extend the idea of Parikh matrices to the context of circular words and investigate this extension. Motivated by the word inference problem, we study ambiguity in the identification of a circular word by its Parikh matrix. Accordingly, two rewriting rules are developed to generate ternary circular words which share the…
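For readers unfamiliar with the objects named in the abstract, the following is a minimal sketch of the classical Parikh matrix of an ordinary (linear) word, together with the rotations of a word that represent the same circular word. It does not reproduce the paper's two circular counting methods, which are not described on this page. The function names `count_subword`, `parikh_matrix`, and `rotations` are hypothetical and chosen only for illustration.

```python
def count_subword(w, u):
    """Count occurrences of u as a scattered (non-contiguous) subword of w."""
    # dp[j] = number of ways to match u[:j] inside the part of w read so far
    dp = [1] + [0] * len(u)
    for c in w:
        # iterate j downwards so each letter of w is used at most once per match
        for j in range(len(u), 0, -1):
            if u[j - 1] == c:
                dp[j] += dp[j - 1]
    return dp[len(u)]


def parikh_matrix(w, alphabet):
    """Classical Parikh matrix of the linear word w over an ordered alphabet."""
    k = len(alphabet)
    # start from the (k+1) x (k+1) identity matrix
    m = [[int(i == j) for j in range(k + 1)] for i in range(k + 1)]
    for c in w:
        q = alphabet.index(c)
        # right-multiplying by (I + E[q][q+1]) adds column q to column q+1
        for i in range(k + 1):
            m[i][q + 1] += m[i][q]
    return m


def rotations(w):
    """All rotations (conjugates) of w; each spells the same circular word."""
    return [w[i:] + w[:i] for i in range(len(w))]


if __name__ == "__main__":
    for r in rotations("abcab"):
        print(r, parikh_matrix(r, "abc"))
```

Entry (i, j+1) of the matrix, for i ≤ j, equals the number of occurrences of the subword formed by the i-th through j-th letters of the alphabet. Different rotations of the same word generally give different matrices, which is why counting subwords in a circular word needs its own definition.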
Taxonomy
Topics: DNA and Biological Computing · Semigroups and Automata Theory · Cellular Automata and Applications
