Efficient Knowledge Feeding to Language Models: A Novel Integrated Encoder-Decoder Architecture

S Santosh Kumar; Rishi Gottimukkala; Supriya Devidutta; Karthikeyan S

arXiv:2502.05233 · cs.CL · February 11, 2025

Open Access

TL;DR

This paper presents a new integrated encoder-decoder architecture using in-context vectors to efficiently feed knowledge into language models, overcoming token limits and improving performance across various tasks.

Contribution

The novel ICV approach recasts in-context learning by embedding task information into latent vectors, enhancing knowledge integration without extensive prompt modifications.

Findings

01

ICV outperforms standard in-context learning and fine-tuning.

02

ICV reduces computational costs and memory usage.

03

ICV overcomes token limits and is easy to control.

Abstract

This paper introduces a novel approach to efficiently feeding knowledge to language models (LLMs) during prediction by integrating retrieval and generation processes within a unified framework. While the Retrieval-Augmented Generation (RAG) model addresses gaps in LLMs' training data and knowledge limits, it is hindered by token limit restrictions and dependency on the retrieval system's accuracy. Our proposed architecture incorporates in-context vectors (ICV) to overcome these challenges. ICV recasts in-context learning by using latent embeddings of LLMs to create a vector that captures essential task information. This vector is then used to shift the latent states of the LLM, enhancing the generation process without adding demonstration examples to the prompt. ICV directly integrates information into the model, enabling it to process this information more effectively. Our extensive…
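The abstract describes two steps: building a vector from latent embeddings that captures task information, then using it to shift the model's latent states. The sketch below illustrates that idea on plain lists of floats. It is not the authors' implementation: the function names `build_icv` and `shift_latent_state`, the use of a mean difference between demonstration target and input states, the scaling factor `alpha`, and the norm-preserving rescale are all assumptions made for illustration.

```python
import math
from typing import List, Sequence

Vector = List[float]


def build_icv(
    input_states: Sequence[Sequence[float]],
    target_states: Sequence[Sequence[float]],
) -> Vector:
    """Hypothetical ICV: mean of (target - input) latent differences.

    Each row is assumed to be the latent state of one demonstration.
    Averaging the differences is an illustrative choice, not a detail
    taken from the paper.
    """
    n = len(input_states)
    dim = len(input_states[0])
    icv = [0.0] * dim
    for x, y in zip(input_states, target_states):
        for i in range(dim):
            icv[i] += (y[i] - x[i]) / n
    return icv


def shift_latent_state(
    hidden: Sequence[float], icv: Sequence[float], alpha: float = 0.1
) -> Vector:
    """Shift one latent state along the ICV, then restore its original norm.

    `alpha` (assumed) controls how strongly the vector steers the state.
    """
    shifted = [h + alpha * v for h, v in zip(hidden, icv)]
    old_norm = math.sqrt(sum(h * h for h in hidden))
    new_norm = math.sqrt(sum(s * s for s in shifted))
    if new_norm == 0.0:
        return shifted  # nothing to rescale
    return [s * old_norm / new_norm for s in shifted]


if __name__ == "__main__":
    # Toy 2-dimensional latent states for two demonstrations.
    inputs = [[1.0, 0.0], [0.0, 1.0]]
    targets = [[2.0, 0.0], [1.0, 1.0]]
    icv = build_icv(inputs, targets)
    print(icv)
    print(shift_latent_state([0.0, 2.0], icv, alpha=1.0))
```

In this sketch no demonstration text is added to the prompt. The task signal lives entirely in `icv`, and `alpha` is the single knob that controls how strongly it steers the latent state.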

Taxonomy

Topics: Natural Language Processing Techniques · Neural Networks and Applications

Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Weight Decay · Attention Dropout · Byte Pair Encoding · WordPiece · Layer Normalization · Residual Connection · Dense Connections