Source-Free Domain Adaptation for Question Answering with Masked Self-training (2212.09563v2)
Abstract: Most previous unsupervised domain adaptation (UDA) methods for question answering (QA) require access to source domain data while fine-tuning the model for the target domain. Source domain data may, however, contain sensitive information and may be restricted. In this study, we investigate a more challenging setting, source-free UDA, in which we have only the pretrained source model and target domain data, without access to source domain data. We propose a novel self-training approach for QA models that integrates a unique mask module for domain adaptation. The mask is auto-adjusted to extract key domain knowledge while the model is trained on the source domain. To retain previously learned domain knowledge, certain mask weights are frozen during adaptation, while the remaining weights are adjusted to mitigate domain shift using pseudo-labeled samples generated in the target domain. Our empirical results on four benchmark datasets suggest that our approach significantly enhances the performance of pretrained QA models on the target domain, and even outperforms models that have access to the source data during adaptation.
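To make the mechanism described above concrete, below is a minimal PyTorch sketch of the idea: a sigmoid-gated mask over the QA encoder's hidden states, a step that freezes the largest (source-critical) gate weights before adaptation, and a confidence-thresholded pseudo-labeling helper for the self-training stage. The names (MaskModule, freeze_key_weights, pseudo_label), the freeze ratio, and the 0.8 confidence threshold are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of masked self-training for source-free QA adaptation (PyTorch).
# Assumptions: an element-wise sigmoid gate over hidden states, top-k freezing,
# and confidence-filtered span pseudo-labels; details differ from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskModule(nn.Module):
    """Element-wise gate over transformer hidden states.

    The gate is learned jointly with the QA model on the source domain.
    Before target adaptation, the largest gate weights are frozen so that
    key source-domain knowledge is preserved; the remaining weights stay
    trainable and absorb the domain shift.
    """

    def __init__(self, hidden_size: int, freeze_ratio: float = 0.5):
        super().__init__()
        self.mask_logits = nn.Parameter(torch.zeros(hidden_size))
        self.freeze_ratio = freeze_ratio
        # Buffers marking which gate positions are frozen and their pinned values.
        self.register_buffer("frozen", torch.zeros(hidden_size, dtype=torch.bool))
        self.register_buffer("frozen_values", torch.zeros(hidden_size))

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        gate = torch.sigmoid(self.mask_logits)
        # Frozen positions use the stored source-domain values (no gradient flows to them).
        gate = torch.where(self.frozen, self.frozen_values, gate)
        return hidden_states * gate

    @torch.no_grad()
    def freeze_key_weights(self) -> None:
        """After source training: pin the top-k gate values as fixed source knowledge."""
        gate = torch.sigmoid(self.mask_logits)
        k = int(self.freeze_ratio * gate.numel())
        top = torch.topk(gate, k).indices
        self.frozen[top] = True
        self.frozen_values.copy_(gate)


def pseudo_label(start_logits: torch.Tensor, end_logits: torch.Tensor,
                 threshold: float = 0.8):
    """Turn a confident answer-span prediction into a pseudo-label, else return None.

    Expects per-example logits of shape (seq_len,) from the source-trained QA head.
    """
    start_prob, start = F.softmax(start_logits, dim=-1).max(dim=-1)
    end_prob, end = F.softmax(end_logits, dim=-1).max(dim=-1)
    if min(start_prob.item(), end_prob.item()) < threshold or end < start:
        return None
    return start.item(), end.item()
```

In this reading, freezing the largest gate weights plays the role of retaining source knowledge, while the still-trainable gate positions and the pseudo-labeled target spans drive the adaptation; the exact selection criterion and loss are left to the paper.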