Towards Unsupervised Domain Adaptation via Domain-Transformer (2202.13777v2)

Published 24 Feb 2022 in cs.CV

Abstract: As a vital problem in pattern analysis and machine intelligence, Unsupervised Domain Adaptation (UDA) attempts to transfer an effective feature learner from a labeled source domain to an unlabeled target domain. Inspired by the success of the Transformer, several advances in UDA have been achieved by adopting pure transformers as network architectures, but such a direct application can only capture patch-level information and lacks interpretability. To address these issues, we propose the Domain-Transformer (DoT) with a domain-level attention mechanism that captures the long-range correspondence between cross-domain samples. On the theoretical side, we provide a mathematical understanding of DoT: 1) we connect the domain-level attention with optimal transport theory, which provides interpretability from Wasserstein geometry; 2) from the perspective of learning theory, we derive Wasserstein distance-based generalization bounds, which explain the effectiveness of DoT for knowledge transfer. On the methodological side, DoT integrates the domain-level attention with manifold structure regularization, which together characterize the sample-level information and the locality consistency of cross-domain cluster structures. Moreover, the domain-level attention mechanism can be used as a plug-and-play module, so DoT can be implemented under different neural network architectures. Instead of explicitly modeling the distribution discrepancy at the domain level or class level, DoT learns transferable features under the guidance of long-range correspondence, so it is free of pseudo-labels and explicit domain-discrepancy optimization. Extensive experimental results on several benchmark datasets validate the effectiveness of DoT.
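
The abstract ties domain-level attention to optimal transport: each source sample attends over the target batch, and the row-normalized attention matrix acts as a transport-like coupling between the two domains. The sketch below is a minimal PyTorch illustration of such a cross-domain attention module, written only from the abstract's description; the class name, linear projections, and temperature parameter are assumptions for illustration, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DomainLevelAttention(nn.Module):
    """Hypothetical sketch of a domain-level (cross-domain) attention module.

    Each source sample attends over the whole target batch, so it is
    re-expressed as a convex combination of target features, mimicking the
    long-range cross-domain correspondence described in the abstract.
    """

    def __init__(self, feature_dim: int, temperature: float = 1.0):
        super().__init__()
        self.query = nn.Linear(feature_dim, feature_dim, bias=False)
        self.key = nn.Linear(feature_dim, feature_dim, bias=False)
        self.value = nn.Linear(feature_dim, feature_dim, bias=False)
        self.temperature = temperature

    def forward(self, source_feats: torch.Tensor, target_feats: torch.Tensor):
        # source_feats: (n_s, d), target_feats: (n_t, d)
        q = self.query(source_feats)   # (n_s, d)
        k = self.key(target_feats)     # (n_t, d)
        v = self.value(target_feats)   # (n_t, d)

        # Cross-domain affinities; softmax rows give a transport-like
        # coupling from each source sample to the target batch.
        scores = q @ k.t() / (q.size(-1) ** 0.5 * self.temperature)  # (n_s, n_t)
        coupling = F.softmax(scores, dim=-1)

        # Source samples re-represented in the span of target features.
        transported = coupling @ v     # (n_s, d)
        return transported, coupling


if __name__ == "__main__":
    attn = DomainLevelAttention(feature_dim=256)
    src = torch.randn(32, 256)   # labeled source batch
    tgt = torch.randn(48, 256)   # unlabeled target batch
    out, coupling = attn(src, tgt)
    print(out.shape, coupling.shape)  # (32, 256) and (32, 48)
```

Because the module only consumes two batches of features, it can in principle be dropped behind any backbone (CNN or Transformer), which is the plug-and-play property the abstract claims; how the coupling is regularized toward an optimal-transport plan is detailed in the paper itself.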
