# INR Smooth: Interframe noise relation-based smooth video synthesis on diffusion models

**Authors:** Cuihong Yu, Cheng Han, Chao Zhang

PMC · DOI: 10.1371/journal.pone.0321193 · PLOS One · 2025-04-29

## TL;DR

This paper introduces INR Smooth, a video smoothing strategy that improves frame consistency and text alignment in text-to-video generation without excessive smoothing.

## Contribution

The paper proposes an interframe noise relation-based smoothing strategy, with one variant for trained T2V models and one for training-free T2V models, that enhances video smoothness while preserving artistic expression and text alignment.

## Key findings

- INR Smooth significantly improves text alignment and temporal consistency in video generation.
- The proposed methods handle smooth transitions in real scenes and the portrayal of artistic styles well, without requiring additional computing resources.
- Training-free and zero-shot fine-tuning approaches are effective for video smoothing.

## Abstract

Text-to-video (T2V) generation can provide people with rich and diverse video content, but it also has some typical issues, such as content inconsistency between video frames or text alignment failure, which degrade the smoothness of the video. Attempts to fix these smoothness problems often smooth excessively, so background texture and artistic expression are lost. To address these problems, this paper proposes INR Smooth, a video smoothing strategy based on the relationship between interframe noise, which can improve the smoothness of most T2V generation tasks. Based on INR Smooth, the paper proposes two video smoothing editing methods. The first is for trained T2V models: using the studied interframe noise relationship, noise constraints are applied from the beginning and the end of the video simultaneously, and video smoothing loss functions are constructed. The second is for training-free T2V models: the paper additionally introduces DDIM Inversion to ensure text alignment and thereby improve smoothness. Experimental comparison shows that the proposed methods significantly improve text alignment and temporal consistency, and perform well in the smooth transition of real scenes and the portrayal of artistic styles. The proposed training-free method and zero-shot fine-tuning method for video smoothing require no additional computing resources. The source code and video demos are available at https://github.com/Cuihong-Yu/INR-Smooth.
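The abstract mentions DDIM Inversion for the training-free method without giving details. As a rough illustration only, the following is a minimal sketch of the standard deterministic DDIM update applied in both directions (inversion toward noise, then sampling back). It is not taken from the paper's code: the function names `ddim_step`, `ddim_invert`, and `ddim_sample`, the schedule values in `abars`, and the constant toy noise predictor `toy_eps_model` (standing in for a text-conditioned denoising network) are all assumptions made for this example.

```python
import math

def ddim_step(x, eps, abar_from, abar_to):
    """Deterministic DDIM move between two cumulative-alpha noise levels."""
    # Estimate the clean sample implied by x and the predicted noise.
    x0_pred = [(xi - math.sqrt(1.0 - abar_from) * ei) / math.sqrt(abar_from)
               for xi, ei in zip(x, eps)]
    # Re-noise that estimate to the target level with the same noise.
    return [math.sqrt(abar_to) * x0i + math.sqrt(1.0 - abar_to) * ei
            for x0i, ei in zip(x0_pred, eps)]

def ddim_invert(x0, eps_model, abars):
    """Walk from the clean end of the schedule toward noise."""
    x = list(x0)
    for t in range(len(abars) - 1):
        x = ddim_step(x, eps_model(x, t), abars[t], abars[t + 1])
    return x

def ddim_sample(x_T, eps_model, abars):
    """Walk from the noisy end of the schedule back toward the clean end."""
    x = list(x_T)
    for t in range(len(abars) - 1, 0, -1):
        x = ddim_step(x, eps_model(x, t), abars[t], abars[t - 1])
    return x

# Hypothetical schedule: cumulative alphas, decreasing as noise increases.
abars = [0.999, 0.9, 0.5, 0.1]

# Toy stand-in for a denoising network: always predicts the same noise.
# A real model would depend on x, t, and the text prompt.
def toy_eps_model(x, t):
    return [0.1, -0.2, 0.3]

frame = [0.5, -1.0, 2.0]            # a tiny "latent" for one frame
latent = ddim_invert(frame, toy_eps_model, abars)
recovered = ddim_sample(latent, toy_eps_model, abars)
```

With a constant noise predictor the round trip is exact up to floating-point error, which is why the toy model is used here. With a real network, inversion is only approximate because the noise predicted at step `t` is reused as an estimate for step `t + 1`.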

## Full-text entities

- **Chemicals:** FateZero (-)
- **Species:** Homo sapiens (human, species) [taxon 9606], Oryctolagus cuniculus (domestic rabbit, species) [taxon 9986]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12040257/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12040257/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/PMC12040257/full.md

---
Source: https://tomesphere.com/paper/PMC12040257