Distributed In-Context Learning under Non-IID Among Clients

Siqi Liang; Sumyeong Ahn; Jiayu Zhou

arXiv:2408.00144·cs.CL·August 2, 2024


Open Access · 3 Reviews

TL;DR

This paper addresses the challenge of in-context learning with distributed, non-IID client data by proposing a data-driven client contribution allocation method, improving performance in multi-client scenarios.

Contribution

It introduces a novel approach for allocating data usage budgets among clients based on query preferences in non-IID distributed settings for in-context learning.

Findings

01

Outperforms baseline methods on diverse datasets

02

Effective client contribution allocation improves ICL performance

03

Addresses non-IID data challenges in distributed environments

Abstract

Advancements in large language models (LLMs) have shown their effectiveness in multiple complicated natural language reasoning tasks. A key challenge remains in adapting these models efficiently to new or unfamiliar tasks. In-context learning (ICL) provides a promising solution for few-shot adaptation by retrieving a set of data points relevant to a query, called in-context examples (ICEs), from a training dataset and providing them as context during inference. Most existing studies use a centralized training dataset, yet many real-world datasets may be distributed among multiple clients, and remote data retrieval can incur costs. Especially when the client data are not independent and identically distributed (non-IID), retrieving a proper set of ICEs for a test query from the clients presents critical challenges. In this paper, we first show that in this challenging…
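The retrieval setting described in the abstract can be illustrated with a small sketch. This is not the paper's allocator: the per-client budgets are simply supplied by hand, the two-dimensional embeddings are toy values, and every name (`cosine`, `retrieve_from_client`, `distributed_retrieve`, `client_a`, `client_b`) is hypothetical. It only shows how a per-client budget changes which in-context examples reach the server when client data are skewed by class.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_from_client(examples, query_vec, budget):
    """Return up to `budget` local examples most similar to the query."""
    ranked = sorted(
        examples, key=lambda ex: cosine(ex["vec"], query_vec), reverse=True
    )
    return ranked[:budget]


def distributed_retrieve(clients, query_vec, budgets):
    """Pool the examples each client returns under its own budget."""
    pooled = []
    for name, examples in clients.items():
        pooled.extend(
            retrieve_from_client(examples, query_vec, budgets.get(name, 0))
        )
    # Order the pooled examples by similarity, most similar first.
    pooled.sort(key=lambda ex: cosine(ex["vec"], query_vec), reverse=True)
    return pooled


# Toy non-IID split (assumed data): each client holds a single class.
clients = {
    "client_a": [
        {"vec": [1.0, 0.0], "text": "great movie", "label": "positive"},
        {"vec": [0.9, 0.1], "text": "loved it", "label": "positive"},
    ],
    "client_b": [
        {"vec": [0.0, 1.0], "text": "terrible plot", "label": "negative"},
        {"vec": [0.1, 0.9], "text": "boring", "label": "negative"},
    ],
}
query_vec = [0.2, 0.8]  # a query that lies close to client_b's data

# Same total budget of 2, split uniformly versus toward the relevant client.
uniform = distributed_retrieve(clients, query_vec, {"client_a": 1, "client_b": 1})
skewed = distributed_retrieve(clients, query_vec, {"client_a": 0, "client_b": 2})

print([ex["text"] for ex in uniform])
print([ex["text"] for ex in skewed])
```

With the uniform split, one of the two retrieved examples comes from a client whose data are far from the query; with the skewed split, both come from the relevant client. Choosing such a split per query is the role the abstract assigns to budget allocation.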

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 5 · Confidence 4

Strengths

1. This paper highlights the important yet underexplored problem of distributed non-IID in-context learning (ICL), offering fresh insights into a new problem.
2. The paper presents some valuable experimental results demonstrating the poor performance of distributed non-IID ICL under simple uniform budget allocation, effectively validating the significance of the problem.
3. The method proposed in this paper improves performance in a simple yet effective way.

Weaknesses

1. The paper does not provide concrete examples of distributed non-IID ICL scenarios. I therefore do not see why, given that the server can request budgeted samples from each client, these samples cannot be used to simulate a comprehensive, unbiased retrieval pool for inference.
2. The training dataset for the allocator is constructed by retrieving k samples per query from each client and combining these 𝐶×𝑘 samples to simulate a unified dataset. However, this simulation raises questions. If the simulation

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. Distributed ICL under non-IID conditions is an interesting problem and aligns well with real-world scenarios. This paper explores the challenges of this setting and provides some meaningful insights.
2. The proposed method is simple yet effective across several benchmarks, with low training overhead.

Weaknesses

1. In real-world scenarios, data distribution differences can manifest in multiple aspects, such as text length and style, but this paper only focuses on non-IIDness at the class level. I strongly recommend that the authors take more aspects into consideration.
2. Since training the allocator does not require the labels of the examples, the experiments should not be limited to classification tasks. The effectiveness of the allocator on generation tasks remains to be validated.
3. The partition for non-I

Reviewer 03 · Rating 5 · Confidence 4

Strengths

Pros:
1) The paper is well structured.
2) The experiments are thorough.

Weaknesses

Cons:
1) The novelty and technical depth are limited. The core idea is that the server gathers the optimal budget statistics using an existing proxy dataset on the server side. This, however, is pretty straightforward.
2) I think the work is closely related to distributed RAG work. The authors are encouraged to discuss how it differs from existing distributed RAG work and to compare with these approaches if possible.
3) The paper only uses small open-sourced LLMs, such as GPT-

Taxonomy

Topics: Context-Aware Activity Recognition Systems · Anomaly Detection Techniques and Applications

Methods: Sparse Evolutionary Training