The Overview of Segmental Durations Modification Algorithms on Speech Signal Characteristics
Kyeomeun Jang, Jiaying Li, Yinuo Wang

TL;DR
This paper evaluates algorithms that change the duration of arbitrary portions of a speech signal without altering its essential properties, such as pitch contour and power spectrum, enabling flexible speech editing.
Contribution
It provides a comprehensive analysis of mainstream algorithms for arbitrary duration modification of speech signals, highlighting their capabilities and limitations.
Findings
The evaluated algorithms can modify speech durations without changing essential properties of the signal, such as its pitch contour and power spectrum.
Multiple intervals can be modified simultaneously.
The analysis guides future development of speech editing techniques.
Abstract
This paper evaluates and analyzes in depth several mainstream algorithms that can arbitrarily modify the duration of any portion of a given speech signal without changing the essential properties of the original signal (e.g., pitch contour and power spectrum). Arbitrary modification in this context means that the duration of any region of the signal can be changed by specifying the start and end times of the modification or the target duration of the specified interval, which can be either a fixed duration in the time domain or a scaling factor applied to the original duration. Arbitrary modification also means that any number of intervals can be modified at the same time.
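As an illustration of the interval specification described in the abstract, the following Python sketch shows one way such a request could be represented and turned into a piecewise-linear map from input time to output time. It is not taken from the paper: the names `IntervalEdit` and `build_time_map` are hypothetical, and the sketch assumes each interval is given by its start and end times plus exactly one of a fixed target duration or a scaling factor. It only computes the timing; resynthesizing a waveform that follows this map while preserving pitch and spectrum is the job of the algorithms the paper evaluates and is not shown here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class IntervalEdit:
    """Hypothetical request: change the duration of [start, end), in seconds."""
    start: float
    end: float
    target_duration: Optional[float] = None  # fixed target duration, in seconds
    scale: Optional[float] = None            # or a factor of the original duration

    def new_duration(self) -> float:
        original = self.end - self.start
        if original <= 0:
            raise ValueError("interval must have positive length")
        # Assumption of this sketch: exactly one of the two targets is given.
        if (self.target_duration is None) == (self.scale is None):
            raise ValueError("give exactly one of target_duration or scale")
        if self.target_duration is not None:
            return self.target_duration
        return original * self.scale


def build_time_map(total: float, edits: List[IntervalEdit]) -> List[Tuple[float, float]]:
    """Return (input_time, output_time) breakpoints of a piecewise-linear map.

    Regions outside every edit keep their duration; each edited region is
    stretched or compressed to its requested duration.
    """
    points = [(0.0, 0.0)]
    t_in, t_out = 0.0, 0.0
    for edit in sorted(edits, key=lambda e: e.start):
        if edit.start < t_in or edit.end > total:
            raise ValueError("edits must not overlap and must lie inside the signal")
        if edit.start > t_in:
            # Unmodified gap before this edit: output advances 1:1 with input.
            t_out += edit.start - t_in
            points.append((edit.start, t_out))
        t_out += edit.new_duration()
        t_in = edit.end
        points.append((t_in, t_out))
    if t_in < total:
        # Unmodified tail after the last edit.
        points.append((total, t_out + (total - t_in)))
    return points


# Two intervals of a 3-second signal modified at once:
# double [0.5, 1.0) and shrink [2.0, 2.5) to a fixed 0.25 s.
time_map = build_time_map(
    3.0,
    [IntervalEdit(0.5, 1.0, scale=2.0), IntervalEdit(2.0, 2.5, target_duration=0.25)],
)
```

The last breakpoint of `time_map` gives the duration of the modified signal, and the slope of each segment is the local stretch factor a duration-modification algorithm would need to realize in that region.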
