KSAT: Knowledge-infused Self Attention Transformer -- Integrating Multiple Domain-Specific Contexts

Kaushik Roy; Yuxin Zi; Vignesh Narayanan; Manas Gaur; Amit Sheth

arXiv:2210.04307 · cs.CL · June 27, 2023 · 1 citation

Open Access

TL;DR

KSAT is a novel transformer architecture that integrates multiple domain-specific contexts using external knowledge sources, enhancing interpretability and outperforming traditional fine-tuning methods in domain-specific language understanding.

Contribution

Introduces KSAT, a self-attention transformer with knowledge-guided biases for integrating multiple domain contexts and controlling knowledge-data trade-offs.

Findings

01. KSAT provides human-understandable visualization of domain contributions.

02. KSAT outperforms fine-tuning baselines on domain-specific tasks.

03. KSAT effectively combines multiple knowledge sources in language models.

Abstract

Domain-specific language understanding requires integrating multiple pieces of relevant contextual information. For example, we see both suicide- and depression-related behavior (multiple contexts) in the text "I have a gun and feel pretty bad about my life, and it wouldn't be the worst thing if I didn't wake up tomorrow". Domain specificity in self-attention architectures is handled by fine-tuning on excerpts from relevant domain-specific resources (datasets and external knowledge, such as medical textbook chapters on mental health diagnosis related to suicide and depression). We propose a modified self-attention architecture, the Knowledge-infused Self Attention Transformer (KSAT), that integrates multiple domain-specific contexts through the use of external knowledge sources. KSAT introduces knowledge-guided biases in dedicated self-attention layers for each knowledge source to…
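
The abstract is truncated here, so the exact form of the knowledge-guided bias is not visible on this page. As a rough illustration only, the sketch below shows one common way an additive bias can steer self-attention, with one dedicated layer per knowledge source. Everything in it is an assumption rather than the authors' method: the function `biased_self_attention`, the weight `lam`, the toy bias matrices built by `pair_bias`, and the final averaging step are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def biased_self_attention(x, w_q, w_k, w_v, bias, lam):
    """Single-head self-attention with an additive bias on the scores.

    x:    (n, d) token representations
    bias: (n, n) knowledge-derived bias (hypothetical form)
    lam:  scalar weight on the bias (hypothetical knob; 0 disables it)
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores + lam * bias)
    return weights @ v, weights


def pair_bias(n, i, j):
    # Toy bias: mark tokens i and j as related in some knowledge source.
    b = np.zeros((n, n))
    b[i, j] = b[j, i] = 1.0
    return b


def make_layer(rng, d):
    # Independent query/key/value projections for one dedicated layer.
    return tuple(rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))


rng = np.random.default_rng(0)
n, d = 4, 8
x = rng.normal(size=(n, d))

# One bias matrix and one dedicated layer per (hypothetical) knowledge source.
biases = {"suicide": pair_bias(n, 0, 1), "depression": pair_bias(n, 2, 3)}
layers = {name: make_layer(rng, d) for name in biases}

outputs = {}
for name, bias in biases.items():
    out, _ = biased_self_attention(x, *layers[name], bias, lam=2.0)
    outputs[name] = out

# Placeholder combination (assumption): a plain average over sources.
combined = np.mean(list(outputs.values()), axis=0)

# Compare attention with and without the bias for one source.
_, w_plain = biased_self_attention(x, *layers["suicide"], biases["suicide"], lam=0.0)
_, w_know = biased_self_attention(x, *layers["suicide"], biases["suicide"], lam=2.0)
print(w_know[0, 1] > w_plain[0, 1])
```

With `lam=0.0` the layer reduces to ordinary scaled dot-product attention; raising `lam` shifts attention mass toward token pairs that the knowledge source marks as related, which is one simple way to picture a trade-off between knowledge and data.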

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Residual Connection · Dropout · Softmax · Label Smoothing · Adam · Dense Connections