Secu-Table: a Comprehensive security table dataset for evaluating semantic table interpretation systems
Azanzi Jiomekong, Jean Bikim, Patricia Negoue, Joyce Chin

TL;DR
Secu-Table is a new, publicly available dataset of over 1500 security-related tables with 15,000+ entities, designed to evaluate semantic table interpretation systems, especially those based on large language models, in cybersecurity contexts.
Contribution
The paper introduces Secu-Table, a comprehensive security table dataset with annotations, and provides baseline evaluations using open source LLMs for semantic table interpretation in cybersecurity.
Findings
Baseline results established for LLM-based STI systems.
Secu-Table enables benchmarking in the security domain.
Open source code and dataset released for community use.
Abstract
Evaluating semantic table interpretation (STI) systems, particularly those based on Large Language Models (LLMs), depends heavily on the dataset, especially in domain-specific contexts such as the security domain. However, tabular datasets for evaluating state-of-the-art STI systems in the security domain are not publicly available. In this paper, we introduce the Secu-Table dataset, composed of more than 1500 tables with more than 15k entities, constructed using security data extracted from the Common Vulnerabilities and Exposures (CVE) and Common Weakness Enumeration (CWE) data sources and annotated using Wikidata and the SEmantic Processing of Security Event Streams CyberSecurity Knowledge Graph (SEPSES CSKG). Along with the dataset, all the code is publicly released. This dataset is made available to the research community in the context of the SemTab challenge on Tabular to Knowledge Graph Matching. This…
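
To make the evaluation setting concrete, here is a minimal sketch of how an STI system's cell-to-entity annotations could be scored against gold annotations. The abstract does not state which metrics are used, so precision, recall, and F1 are an assumption here, and the function name `score_annotations`, the `(table id, row, column)` cell key, and the entity identifiers are all hypothetical placeholders rather than the dataset's actual format.

```python
from typing import Dict, Tuple

# Hypothetical cell key: (table id, row index, column index).
Cell = Tuple[str, int, int]


def score_annotations(
    gold: Dict[Cell, str], predicted: Dict[Cell, str]
) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) for cell-to-entity annotations.

    A prediction counts as correct only if the cell is in the gold set
    and the predicted entity identifier matches the gold one exactly.
    """
    correct = sum(
        1 for cell, entity in predicted.items() if gold.get(cell) == entity
    )
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Placeholder identifiers, not real Wikidata or SEPSES CSKG entities.
    gold = {
        ("t1", 1, 0): "ENTITY_A",
        ("t1", 2, 0): "ENTITY_B",
        ("t1", 3, 0): "ENTITY_C",
        ("t1", 4, 0): "ENTITY_D",
    }
    predicted = {
        ("t1", 1, 0): "ENTITY_A",  # correct
        ("t1", 2, 0): "ENTITY_B",  # correct
        ("t1", 3, 0): "ENTITY_X",  # wrong entity
    }
    p, r, f = score_annotations(gold, predicted)
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Cells the system leaves unannotated lower recall but not precision, which is why the two are reported separately in this sketch.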
Taxonomy
Topics: Data Quality and Management · Advanced Graph Neural Networks · Machine Learning in Healthcare
