A Comparative Study of the Signal-to-Noise Ratios of Different Representations for Symbolic Sequences
Jiasong Wang, Chuangyin Dang, Changchuan Yin

TL;DR
This paper mathematically analyzes the signal-to-noise ratios of various numerical representations of symbolic sequences, such as DNA or proteins, revealing how these ratios depend on the sequence length and the representation method.
Contribution
It introduces a mathematical framework for evaluating the signal-to-noise ratios of different symbolic sequence representations and their spectral properties.
Findings
The signal-to-noise ratio of the special representations is T/(T-1) times that of the basic vector representation, where T is the number of distinct symbols.
The total Fourier spectrum of the basic vector representation equals the square of the sequence length.
Results depend on symbol distribution and mathematical construction, not biological meaning.
Abstract
Based on the numerical representation of a symbolic sequence consisting of T symbols by T basic vectors, we first prove mathematically that the total Fourier spectrum of the sequence is the square of the length of the sequence. We also define the indicator sequences vector. Using orthogonal or row-orthogonal transformations of the indicator sequences vector, we construct some special numerical representations of the symbolic sequence and characterize the signal-to-noise ratios of their power spectra. Once the discrete Fourier transform of these special numerical representations has been calculated, their signal-to-noise ratios can be computed. Mathematical theorems prove that the signal-to-noise ratio of the Fourier spectrum of those special representations of the symbolic sequence is T/(T-1) times the signal-to-noise ratio of the…
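The first claim in the abstract can be checked numerically. The sketch below is an illustration rather than the authors' code. It assumes an unnormalized discrete Fourier transform and assumes that the "total Fourier spectrum" means the squared DFT magnitudes summed over all T indicator sequences and all frequencies k = 0..N-1. The toy sequence and the function names are hypothetical.

```python
import numpy as np

def indicator_sequences(seq, alphabet):
    """Return a (T, N) 0/1 array; row i marks the positions where alphabet[i] occurs."""
    return np.array([[1.0 if s == a else 0.0 for s in seq] for a in alphabet])

def total_power_spectrum(seq, alphabet):
    """Sum |U_a(k)|^2 over every symbol a and every frequency k = 0..N-1."""
    u = indicator_sequences(seq, alphabet)
    spectrum = np.fft.fft(u, axis=1)  # unnormalized DFT of each indicator row
    return float(np.sum(np.abs(spectrum) ** 2))

seq = "ATGGCATTCAGGTACC"  # hypothetical toy DNA sequence, N = 16
total = total_power_spectrum(seq, "ACGT")
print(total, len(seq) ** 2)  # the two values agree up to floating-point error
```

Under these assumptions the equality follows from Parseval's theorem: each position carries exactly one nonzero indicator, so the squared indicator values sum to N, and the unnormalized DFT multiplies that energy by N. A different DFT normalization would change the constant but not the dependence on N.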
Taxonomy
Topics: Fractal and DNA sequence analysis · Machine Learning in Bioinformatics · RNA and protein synthesis mechanisms
