Data Augmentation via Dependency Tree Morphing for Low-Resource Languages (1903.09460v1)

Published 22 Mar 2019 in cs.CL

Abstract: Neural NLP systems achieve high scores in the presence of sizable training datasets. The lack of such datasets leads to poor system performance for low-resource languages. We present two simple text augmentation techniques using dependency trees, inspired by image processing. We crop sentences by removing dependency links, and we rotate sentences by moving tree fragments around the root. We apply these techniques to augment the training sets of low-resource languages in the Universal Dependencies project. We implement a character-level sequence tagging model and evaluate the augmented datasets on the part-of-speech tagging task. We show that crop and rotate provide improvements over models trained on non-augmented data for the majority of the languages, especially for languages with rich case-marking systems.
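
The sketch below illustrates, under stated assumptions, what the abstract's crop and rotate operations could look like on a simple dependency-tree representation. The `Token` class, the choice of which dependency labels to keep when cropping, and the reversal used for rotation are all illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of dependency-tree crop/rotate augmentation.
# A token is (index, form, head, deprel); head 0 marks the root.
from dataclasses import dataclass
from typing import List

@dataclass
class Token:
    idx: int     # 1-based position in the sentence
    form: str    # surface word
    head: int    # index of the syntactic head (0 = root)
    deprel: str  # dependency relation label

def subtree(tokens: List[Token], root_idx: int) -> List[Token]:
    """Collect a token and all of its transitive dependents."""
    keep = {root_idx}
    changed = True
    while changed:
        changed = False
        for t in tokens:
            if t.head in keep and t.idx not in keep:
                keep.add(t.idx)
                changed = True
    return [t for t in tokens if t.idx in keep]

def crop(tokens: List[Token], keep_labels=("nsubj", "obj")) -> List[str]:
    """Cropping: keep the root plus selected argument subtrees,
    dropping the remaining dependency links (illustrative label set)."""
    root = next(t for t in tokens if t.head == 0)
    kept = {root.idx}
    for t in tokens:
        if t.head == root.idx and t.deprel in keep_labels:
            kept.update(x.idx for x in subtree(tokens, t.idx))
    return [t.form for t in tokens if t.idx in kept]

def rotate(tokens: List[Token]) -> List[str]:
    """Rotation: move the subtrees attached to the root into a new
    order (here simply reversed) to produce an alternative word order."""
    root = next(t for t in tokens if t.head == 0)
    fragments = [subtree(tokens, t.idx) for t in tokens if t.head == root.idx]
    out: List[str] = [root.form]
    for frag in reversed(fragments):
        out.extend(t.form for t in sorted(frag, key=lambda x: x.idx))
    return out

# Example: "She wrote me a letter" (simplified UD-style annotation)
sent = [
    Token(1, "She", 2, "nsubj"),
    Token(2, "wrote", 0, "root"),
    Token(3, "me", 2, "iobj"),
    Token(4, "a", 5, "det"),
    Token(5, "letter", 2, "obj"),
]
print(crop(sent))    # ['She', 'wrote', 'a', 'letter']
print(rotate(sent))  # ['wrote', 'a', 'letter', 'me', 'She']
```

In the paper these operations are applied to gold dependency trees from Universal Dependencies treebanks, and the morphed sentences are added to the training set for a character-level POS tagger; the sketch only shows the tree manipulation itself.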

Citations (112)
