Sensei: Self-Supervised Sensor Name Segmentation
Jiaman Wu, Dezhi Hong, Rajesh Gupta, Jingbo Shang

TL;DR
Sensei is a self-supervised neural framework that automatically segments sensor names in smart buildings, reducing manual effort and handling diverse naming conventions effectively.
Contribution
Sensei introduces a novel self-supervised approach using neural language models to segment sensor names without human annotations, adaptable to various building-specific naming schemes.
Findings
Outperforms baseline segmentation methods on data from five real-world buildings comprising thousands of sensors
Effective across multiple buildings with diverse naming conventions
Learns to segment without any human annotation, removing the per-building manual labeling effort
Abstract
A sensor name, typically an alphanumeric string, encodes the key context (e.g., function and location) of a sensor needed for deploying smart building applications. Sensor names, however, are curated in a building vendor-specific manner using different structures and vocabularies that are often esoteric. They thus require tremendous manual effort to annotate on a per-building basis, even just to segment these sensor names into meaningful chunks. In this paper, we propose a fully automated self-supervised framework, Sensei, which can learn to segment sensor names without any human annotation. Specifically, we employ a neural language model to capture the underlying sensor naming structure and then induce self-supervision based on information from the language model to build the segmentation model. Extensive experiments on five real-world buildings comprising thousands of sensors…
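The idea in the abstract, using a language model over sensor names to induce segmentation labels, can be illustrated with a minimal sketch. This is not Sensei's actual model: a smoothed character bigram model stands in for the neural language model, and a fixed probability threshold stands in for however the paper derives its self-supervision. The sensor names, function names, and threshold value below are all hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(names):
    """Count character bigrams; a simple stand-in for a neural language model."""
    counts = defaultdict(Counter)
    for name in names:
        for prev, nxt in zip(name, name[1:]):
            counts[prev][nxt] += 1
    return counts


def transition_prob(counts, prev, nxt, alphabet_size):
    """Add-one smoothed estimate of P(nxt | prev)."""
    total = sum(counts[prev].values())
    return (counts[prev][nxt] + 1) / (total + alphabet_size)


def pseudo_label_boundaries(name, counts, alphabet_size, threshold):
    """Propose a boundary before index i when the model finds name[i] unlikely.

    The threshold rule is an assumption made for illustration only.
    """
    return [
        i
        for i in range(1, len(name))
        if transition_prob(counts, name[i - 1], name[i], alphabet_size) < threshold
    ]


def segment(name, boundaries):
    """Split a name at the given boundary indices."""
    cuts = [0] + list(boundaries) + [len(name)]
    return [name[a:b] for a, b in zip(cuts, cuts[1:])]


# Hypothetical sensor names, not taken from the paper's datasets.
names = ["AHU1_SAT", "AHU1_RAT", "AHU2_SAT", "AHU2_RAT", "VAV1_ZNT", "VAV2_ZNT"]
alphabet_size = len(set("".join(names)))
model = train_bigram_model(names)

for name in names:
    cuts = pseudo_label_boundaries(name, model, alphabet_size, threshold=0.3)
    print(name, segment(name, cuts))
```

In the framework the abstract describes, labels induced this way would then be used to build a separate segmentation model; that step is omitted here.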
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
