HKGAI-V1: Towards Regional Sovereign Large Language Model for Hong Kong

Sirui Han; Junqi Zhu; Ruiyuan Zhang; Yike Guo

arXiv:2507.11502 · cs.CL · July 16, 2025

Open Access

TL;DR

This paper introduces HKGAI-V1, a region-specific large language model for Hong Kong, designed to handle local languages, cultural norms, and legal standards, with an emphasis on AI safety and governance.

Contribution

The paper presents the development of HKGAI-V1, a sovereign LLM tailored for Hong Kong's multilingual and socio-legal context, with a novel alignment and safety framework.

Findings

01. HKGAI-V1 outperforms general models on local queries.
02. The model demonstrates effective handling of culturally sensitive topics.
03. The proprietary Adversarial HK Value Benchmark assesses alignment with local standards.

Abstract

This paper presents the development of HKGAI-V1, a foundational sovereign large language model (LLM), developed as part of an initiative to establish value-aligned AI infrastructure specifically tailored for Hong Kong. Addressing the region's unique multilingual environment (Cantonese, Mandarin, and English), its distinct socio-legal context under the "one country, two systems" framework, and specific local cultural and value considerations, the model is built upon the DeepSeek architecture and systematically aligned with regional norms through a multifaceted full parameter fine-tuning process. It is further integrated with a retrieval-augmented generation (RAG) system to ensure timely and factually grounded information access. The core contribution lies in the design and implementation of a comprehensive, region-specific AI alignment and safety framework, demonstrated through two key…

Taxonomy

Topics: Natural Language Processing Techniques