Toward Data Efficient Model Merging between Different Datasets without Performance Degradation (2306.05641v2)

Published 9 Jun 2023 in cs.LG and cs.AI

Abstract: Model merging is attracting attention as a novel method for creating a new model by combining the weights of different trained models. While previous studies reported that model merging works well for models trained on a single dataset with different random seeds, model merging between different datasets remains unsolved. In this paper, we attempt to reveal the difficulty in merging such models trained on different datasets and alleviate it. Our empirical analyses show that, in contrast to the single-dataset scenarios, dataset information needs to be accessed to achieve high accuracy when merging models trained on different datasets. However, the requirement to use full datasets not only incurs significant computational costs but also becomes a major limitation when integrating models developed and shared by others. To address this, we demonstrate that dataset reduction techniques, such as coreset selection and dataset condensation, effectively reduce the data requirement for model merging. In our experiments with SPLIT-CIFAR10 model merging, the accuracy is significantly improved by $31\%$ when using the full dataset and $24\%$ when using the sampled subset compared with not using the dataset.
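
To make "combining the weights of different trained models" concrete, here is a minimal sketch of the simplest data-free baseline: linear interpolation of two weight sets. The abstract does not spell out the merging procedure, so this is not the paper's method; the names `merge_weights` and `sample_subset`, the dict-of-lists weight format, and the `alpha` parameter are all assumptions made for illustration. The uniform random subset is only a stand-in for the coreset selection and dataset condensation techniques the abstract mentions.

```python
import random


def merge_weights(weights_a, weights_b, alpha=0.5):
    """Linearly interpolate two models' weights (hypothetical helper).

    Each model is assumed to be a dict mapping a parameter name to a flat
    list of floats. alpha=0 returns model A, alpha=1 returns model B.
    """
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models must share the same parameter names")
    merged = {}
    for name in weights_a:
        a, b = weights_a[name], weights_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [(1 - alpha) * x + alpha * y for x, y in zip(a, b)]
    return merged


def sample_subset(dataset, fraction, seed=0):
    """Draw a uniform random subset (a simple stand-in, not a real coreset method)."""
    k = max(1, int(len(dataset) * fraction))
    return random.Random(seed).sample(dataset, k)


# Toy usage: two "models" with one parameter each.
model_a = {"w": [0.0, 2.0]}
model_b = {"w": [2.0, 4.0]}
merged = merge_weights(model_a, model_b)

# Keep 10% of a toy dataset for any data-dependent merging step.
subset = sample_subset(list(range(100)), fraction=0.1)
```

Plain interpolation like this uses no data at all, which corresponds to the "not using the dataset" setting that the abstract reports as the weakest option for models trained on different datasets. The subset helper marks where a reduced dataset would enter a data-dependent merging step.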
