Distributed In-Context Learning under Non-IID Among Clients
Siqi Liang, Sumyeong Ahn, Jiayu Zhou

TL;DR
This paper addresses the challenge of in-context learning with distributed, non-IID client data by proposing a data-driven client contribution allocation method, improving performance in multi-client scenarios.
Contribution
It introduces a novel approach for allocating data usage budgets among clients based on query preferences in non-IID distributed settings for in-context learning.
Findings
Outperforms baseline methods on diverse datasets
Effective client contribution allocation improves ICL performance
Addresses non-IID data challenges in distributed environments
Abstract
Advancements in large language models (LLMs) have shown their effectiveness in multiple complicated natural language reasoning tasks. A key challenge remains in adapting these models efficiently to new or unfamiliar tasks. In-context learning (ICL) provides a promising solution for few-shot adaptation by retrieving a set of data points relevant to a query, called in-context examples (ICEs), from a training dataset and providing them as context during inference. Most existing studies utilize a centralized training dataset, yet many real-world datasets may be distributed among multiple clients, and remote data retrieval can be associated with costs. Especially when the client data are not independent and identically distributed (non-IID), retrieving from clients a proper set of ICEs needed for a test query presents critical challenges. In this paper, we first show that in this challenging…
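The retrieval setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the paper's allocator is data-driven, whereas the proportional rule below is a hypothetical stand-in, and all names (`uniform_budgets`, `proportional_budgets`, `retrieve`, `gather_ices`, the `emb` field) are assumptions introduced for illustration. The sketch only contrasts a uniform per-client budget with a query-dependent one when collecting ICEs from several clients.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def uniform_budgets(num_clients, total_budget):
    """Split the total ICE budget as evenly as possible across clients."""
    base, extra = divmod(total_budget, num_clients)
    return [base + (1 if i < extra else 0) for i in range(num_clients)]


def proportional_budgets(weights, total_budget):
    """Hypothetical query-dependent allocation: integer budgets proportional
    to non-negative per-client weights (largest-remainder rounding)."""
    weights = [max(0.0, w) for w in weights]
    total = sum(weights)
    if total <= 0:
        return uniform_budgets(len(weights), total_budget)
    shares = [w / total * total_budget for w in weights]
    budgets = [int(s) for s in shares]
    leftover = total_budget - sum(budgets)
    # Hand the remaining units to the clients with the largest fractional parts.
    order = sorted(range(len(weights)),
                   key=lambda i: shares[i] - budgets[i], reverse=True)
    for i in order[:leftover]:
        budgets[i] += 1
    return budgets


def retrieve(client_data, query_emb, k):
    """Client-side step: return the k local examples most similar to the query."""
    ranked = sorted(client_data,
                    key=lambda ex: cosine(ex["emb"], query_emb), reverse=True)
    return ranked[:k]


def gather_ices(clients, query_emb, budgets):
    """Server-side step: request budgets[i] examples from client i and pool them."""
    ices = []
    for data, k in zip(clients, budgets):
        ices.extend(retrieve(data, query_emb, k))
    return ices
```

In this sketch the per-client weights would come from whatever allocator the server uses; with class-skewed (non-IID) clients, a uniform split spends budget on clients that hold little data relevant to the query, which is the failure mode the abstract points to.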
Peer Reviews
Decision·Submitted to ICLR 2025
1. This paper highlights the important yet underexplored problem of distributed non-IID in-context learning (ICL), offering fresh insights into a new problem. 2. The paper presents some valuable experimental results demonstrating the poor performance of distributed non-IID ICL under simple uniform budget allocation, effectively validating the significance of the problem. 3. The method proposed in this paper improves performance in a simple yet effective way.
1. The paper does not provide concrete examples of distributed non-IID ICL scenarios. So I don't understand why, given that the server can request budgeted samples from each client, these samples can't be used to simulate a comprehensive, unbiased retrieval pool for inference. 2. The training dataset for the allocator is constructed by retrieving k samples per query from each client, combining these C × k samples to simulate a unified dataset. However, this simulation raises questions. If the simulation
1. Distributed ICL under non-IID conditions is an interesting problem and aligns well with real-world scenarios. This paper explores the challenges under this setting, providing some meaningful insights. 2. The proposed method is simple yet effective across several benchmarks, with low training overhead.
1. In real-world scenarios, data distribution differences can manifest in multiple aspects, such as text length, style, etc., but this paper only focuses on non-IIDness at the class level. I strongly recommend that the authors take more aspects into consideration. 2. Since the training of the allocator does not require the labels of the examples, the experiments should not be limited to classification tasks. The effectiveness of the allocator on generation tasks remains to be validated. 3. The partition for non-I
Pros: 1). The paper is well structured. 2). The experiments are thorough.
Cons: 1). The novelty and technical depth are limited. The core idea is that the server gathers the optimal budget statistics using an existing proxy dataset on the server side. This, however, is pretty straightforward. 2). I think the work is highly related to distributed RAG work. The authors are encouraged to discuss how it differs from existing distributed RAG work and to compare with these approaches if possible. 3). The paper only uses small open-sourced LLMs, such as GPT-
Taxonomy
Topics: Context-Aware Activity Recognition Systems · Anomaly Detection Techniques and Applications
Methods: Sparse Evolutionary Training
