Auxiliary Learning for Deep Multi-task Learning (1909.02214v2)

Published 5 Sep 2019 in cs.CV and cs.LG

Abstract: Multi-task learning (MTL) is an efficient way to solve multiple tasks simultaneously, achieving better speed and performance than handling each task separately. Most current methods can be categorized as either: (i) hard parameter sharing, where a subset of the parameters is shared among tasks while the other parameters are task-specific; or (ii) soft parameter sharing, where all parameters are task-specific but jointly regularized. Both approaches have limitations: the shared hidden layers of the former are difficult to optimize due to competing objectives, while the complexity of the latter grows linearly with the number of tasks. To mitigate these drawbacks, this paper proposes an alternative: an explicitly constructed auxiliary module that mimics soft parameter sharing to assist the optimization of the hard-parameter-sharing layers during training. In particular, the auxiliary module takes the outputs of the shared hidden layers as inputs and is supervised by the auxiliary task loss. During training, the auxiliary module is jointly optimized with the MTL network, serving as a regularizer that introduces an inductive bias into the shared layers. In the testing phase, only the original MTL network is kept, so the method avoids the limitations of both categories. We evaluate the proposed auxiliary module on pixel-wise prediction tasks, including semantic segmentation, depth estimation, and surface normal prediction, with different network structures. Extensive experiments over various settings verify the effectiveness of our method.
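
To make the mechanism concrete, below is a minimal PyTorch sketch of the training-time setup the abstract describes: a hard-parameter-sharing trunk with task-specific heads, plus an auxiliary branch on the shared features that is supervised by its own loss and discarded at test time. The layer sizes, the choice of tasks, and the 0.4 auxiliary loss weight are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class MTLWithAuxiliary(nn.Module):
    def __init__(self, feat_dim=64, num_classes=10):
        super().__init__()
        # Hard parameter sharing: one trunk shared by all tasks.
        self.shared = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
        )
        # Task-specific heads (here: segmentation logits and depth).
        self.seg_head = nn.Conv2d(feat_dim, num_classes, 1)
        self.depth_head = nn.Conv2d(feat_dim, 1, 1)
        # Auxiliary module: consumes the shared features, is supervised by
        # its own loss during training, and is dropped at test time.
        self.aux = nn.Sequential(
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, num_classes, 1),
        )

    def forward(self, x):
        f = self.shared(x)
        out = {"seg": self.seg_head(f), "depth": self.depth_head(f)}
        if self.training:  # auxiliary branch exists only during training
            out["aux"] = self.aux(f)
        return out

# Joint optimization: the auxiliary loss regularizes the shared layers.
model = MTLWithAuxiliary()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(2, 3, 32, 32)                 # dummy images
seg_y = torch.randint(0, 10, (2, 32, 32))     # dummy segmentation labels
depth_y = torch.randn(2, 1, 32, 32)           # dummy depth targets

opt.zero_grad()
out = model(x)
loss = (nn.functional.cross_entropy(out["seg"], seg_y)
        + nn.functional.l1_loss(out["depth"], depth_y)
        # 0.4 is an assumed weight for the auxiliary loss, not the paper's value
        + 0.4 * nn.functional.cross_entropy(out["aux"], seg_y))
loss.backward()
opt.step()
```

At inference, calling `model.eval()` disables the auxiliary branch, so only the original MTL network (shared trunk plus task heads) is kept, matching the abstract's claim that the auxiliary module adds no test-time cost.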

Citations (10)
