openXBOW - Introducing the Passau Open-Source Crossmodal Bag-of-Words Toolkit
Maximilian Schmitt, Björn W. Schuller

TL;DR
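openXBOW is an open-source toolkit that generates crossmodal bag-of-words representations from multimodal input, accepting arbitrary numeric features as well as text. In two sample scenarios, time-continuous speech-based emotion recognition and sentiment analysis in tweets, it yielded improved results over other feature representations.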
Contribution
It introduces what is, to the authors' knowledge, the first publicly available toolkit for generating crossmodal bags-of-words. The toolkit supports arbitrary numeric input features and text input, and provides a variety of extensions and options.
Findings
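Improved results over other feature representations in time-continuous speech-based emotion recognition.
Improved results over other feature representations in sentiment analysis of tweets.
Supports arbitrary numeric input features and text input, concatenating the per-modality subbags into one final bag.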
Abstract
We introduce openXBOW, an open-source toolkit for the generation of bag-of-words (BoW) representations from multimodal input. In the BoW principle, word histograms were first used as features in document classification, but the idea has been, and can easily be, adapted to, e.g., acoustic or visual low-level descriptors, introducing a prior step of vector quantisation. The openXBOW toolkit supports arbitrary numeric input features and text input and concatenates computed subbags to a final bag. It provides a variety of extensions and options. To our knowledge, openXBOW is the first publicly available toolkit for the generation of crossmodal bags-of-words. The capabilities of the tool are exemplified in two sample scenarios: time-continuous speech-based emotion recognition and sentiment analysis in tweets, where improved results over other feature representation forms were observed.
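The crossmodal BoW idea described in the abstract can be illustrated with a minimal Python sketch. This is not openXBOW's own code or interface: the function names, the fixed two-word codebook, the toy vocabulary, and the use of NumPy are all assumptions made for illustration. Numeric frames are vector-quantised to their nearest codeword and counted, text tokens are counted against a vocabulary, and the two subbags are concatenated into one final bag.

```python
import numpy as np
from collections import Counter


def numeric_subbag(frames, codebook):
    """Count how many frames fall on each codeword (vector quantisation)."""
    # Squared Euclidean distance from every frame to every codeword.
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    assignments = dists.argmin(axis=1)
    return np.bincount(assignments, minlength=len(codebook)).astype(float)


def text_subbag(tokens, vocabulary):
    """Count occurrences of each vocabulary word; other tokens are ignored."""
    counts = Counter(tokens)
    return np.array([counts[word] for word in vocabulary], dtype=float)


def crossmodal_bag(frames, codebook, tokens, vocabulary):
    """Concatenate the numeric and text subbags into one feature vector."""
    return np.concatenate(
        [numeric_subbag(frames, codebook), text_subbag(tokens, vocabulary)]
    )


# Hypothetical toy data: a codebook of two 2-D codewords and three frames.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
frames = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8]])
vocabulary = ["good", "bad"]
tokens = "good good movie".split()

bag = crossmodal_bag(frames, codebook, tokens, vocabulary)
print(bag)
```

In this sketch the codebook is given by hand; in practice it would be learned from training data, for example by clustering or random sampling of frames, and the resulting histograms are often normalised or log-scaled before classification.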
Taxonomy
Topics: Text and Document Classification Technologies · Music and Audio Processing · Advanced Text Analysis Techniques
