Why Neural Machine Translation Prefers Empty Outputs (2012.13454v1)

Published 24 Dec 2020 in cs.CL

Abstract: We investigate why neural machine translation (NMT) systems assign high probability to empty translations. We find two explanations. First, label smoothing makes correct-length translations less confident, making it easier for the empty translation to finally outscore them. Second, NMT systems use the same, high-frequency EoS word to end all target sentences, regardless of length. This creates an implicit smoothing that increases zero-length translations. Using different EoS types in target sentences of different lengths exposes and eliminates this implicit smoothing.
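The abstract's second remedy, giving target sentences of different lengths different EoS types, can be pictured as a preprocessing step on the target side. The following is a minimal sketch only, not the authors' implementation: the bucket boundaries, the `</s_i>` token names, and the helper names `eos_for_length` and `append_length_eos` are all hypothetical choices made for illustration.

```python
from typing import List, Sequence

# Hypothetical bucket upper bounds, in tokens. The paper's actual
# length grouping may differ; these values are illustrative only.
BUCKET_BOUNDS = (10, 20, 40)


def eos_for_length(n_tokens: int, bounds: Sequence[int] = BUCKET_BOUNDS) -> str:
    """Return a length-specific EoS token for a target of n_tokens tokens."""
    for i, upper in enumerate(bounds):
        if n_tokens <= upper:
            return f"</s_{i}>"
    # Anything longer than the last bound falls into a final bucket.
    return f"</s_{len(bounds)}>"


def append_length_eos(tokens: List[str]) -> List[str]:
    """Append the length-specific EoS token instead of a single shared one."""
    return tokens + [eos_for_length(len(tokens))]


if __name__ == "__main__":
    print(append_length_eos("das ist ein Test".split()))
```

Under this sketch, the EoS type that ends very short targets no longer shares a vocabulary entry with the one that ends long targets, which is the separation the abstract describes as exposing the implicit smoothing.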

Citations (9)
