LoFTI: Localization and Factuality Transfer to Indian Locales
Sona Elza Simon (1), Soumen Kumar Mondal (1), Abhishek Singhania (2), Sayambhu Sen (2), Preethi Jyothi (1) ((1) Indian Institute of Technology, Bombay, (2) Amazon Alexa)

TL;DR
This paper introduces LoFTI, a benchmark for evaluating how well large language models can transfer factual knowledge to Indian locales, revealing biases and localization capabilities in models like GPT-4.
Contribution
The paper presents LoFTI, a novel benchmark for assessing localization and factual transfer in LLMs, and evaluates multiple models on this new dataset.
Findings
Models show bias and skewed results in localized factual accuracy.
LoFTI effectively measures localization and factual transfer capabilities.
GPT-4 and others exhibit varying performance across hyperlocal levels.
Abstract
Large language models (LLMs) encode vast amounts of world knowledge acquired via training on large web-scale datasets crawled from the internet. However, these datasets typically exhibit a geographical bias towards English-speaking Western countries. This results in LLMs producing biased or hallucinated responses to queries that require answers localized to other geographical regions. In this work, we introduce a new benchmark named LoFTI (Localization and Factuality Transfer to Indian Locales) that can be used to evaluate an LLM's localization and factual text transfer capabilities. LoFTI consists of factual statements about entities in source and target locations; the source locations are spread across the globe and the target locations are all within India with varying degrees of hyperlocality (country, states, cities). The entities span a wide variety of categories. We use LoFTI to…
Taxonomy
Topics: Disaster Management and Resilience
Methods: Attention Is All You Need · Residual Connection · Adam · Dropout · Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Softmax · Position-Wise Feed-Forward Layer
