2000 character limit reached
Simulating Lexical Semantic Change from Sense-Annotated Data (2001.03216v1)
Published 9 Jan 2020 in cs.CL
Abstract: We present a novel procedure to simulate lexical semantic change from synchronic sense-annotated data, and demonstrate its usefulness for assessing lexical semantic change detection models. The induced dataset represents a stronger correspondence to empirically observed lexical semantic change than previous synthetic datasets, because it exploits the intimate relationship between synchronic polysemy and diachronic change. We publish the data and provide the first large-scale evaluation gold standard for LSC detection models.