Speech-Omni-Lite: Portable Speech Interfaces for Vision-Language Models