# Heterogeneous domain adaptation: An unsupervised approach

**Authors:** Feng Liu, Guanquan Zhang, Jie Lu

arXiv: 1701.02511 · 2020-02-11

## TL;DR

This paper introduces an unsupervised method for heterogeneous domain adaptation, built on a theorem that guarantees the correctness of knowledge transfer and a principal angle-based metric for the distance between domains. The resulting model outperforms existing baselines on ten tasks across three applications.

## Contribution

The paper presents an unsupervised knowledge transfer theorem, a principal angle-based domain distance metric, and the Grassmann-Linear monotonic maps-geodesic flow kernel (GLG) model for heterogeneous unsupervised domain adaptation (HeUDA).

## Key findings

- The GLG model outperforms existing baselines on ten HeUDA tasks built from five public datasets.
- The principal angle-based metric shows how much of the information in the original source and target domains the homogeneous representations preserve.
- The evaluation covers three applications: cancer detection, credit assessment, and text classification.

## Abstract

Domain adaptation leverages the knowledge in one domain - the source domain - to improve learning efficiency in another domain - the target domain. Existing heterogeneous domain adaptation research is relatively well-progressed, but only in situations where the target domain contains at least a few labeled instances. In contrast, heterogeneous domain adaptation with an unlabeled target domain has not been well-studied. To contribute to the research in this emerging field, this paper presents: (1) an unsupervised knowledge transfer theorem that guarantees the correctness of transferring knowledge; and (2) a principal angle-based metric to measure the distance between two pairs of domains: one pair comprises the original source and target domains and the other pair comprises two homogeneous representations of two domains. The theorem and the metric have been implemented in an innovative transfer model, called a Grassmann-Linear monotonic maps-geodesic flow kernel (GLG), that is specifically designed for heterogeneous unsupervised domain adaptation (HeUDA). The linear monotonic maps meet the conditions of the theorem and are used to construct homogeneous representations of the heterogeneous domains. The metric shows the extent to which the homogeneous representations have preserved the information in the original source and target domains. By minimizing the proposed metric, the GLG model learns the homogeneous representations of heterogeneous domains and transfers knowledge through these learned representations via a geodesic flow kernel. To evaluate the model, five public datasets were reorganized into ten HeUDA tasks across three applications: cancer detection, credit assessment, and text classification. The experiments demonstrate that the proposed model delivers superior performance over the existing baselines.
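To make the idea of a principal angle-based distance more concrete, the following is a minimal sketch of how principal angles between two subspaces can be computed. It is not the paper's metric: it assumes both subspaces live in the same ambient space (as the homogeneous representations would) and that each basis matrix has full column rank. The function name `principal_angles` and the example matrices are hypothetical, chosen only for illustration.

```python
import numpy as np


def principal_angles(A, B):
    """Return the principal angles (radians, ascending) between the
    column spaces of A and B.

    Assumption: A and B have the same number of rows and full column rank.
    """
    # Orthonormal bases for each column space (reduced QR).
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip to guard against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))


# Illustrative example: two planes in R^3 that share the first axis.
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # span{e1, e2}
B = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])  # span{e1, e3}
angles = principal_angles(A, B)
print(np.degrees(angles))
```

In this example the two planes share one direction and are otherwise orthogonal, so the angles are 0 and 90 degrees. Smaller angles mean the subspaces are closer, which is the intuition behind using principal angles as a distance between domains.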

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/1701.02511/full.md

---
Source: https://tomesphere.com/paper/1701.02511