
Detect Language of Transliterated Texts (2004.13521v1)

Published 26 Apr 2020 in eess.AS, cs.CL, cs.LG, cs.SD, and stat.ML

Abstract: Informal transliteration from other languages to English is prevalent in social media threads, instant messaging, and discussion forums. Without identifying the language of such transliterated text, users who do not speak that language cannot understand its content using translation tools. We propose a Language Identification (LID) system, with an approach for feature extraction, which can detect the language of transliterated texts reasonably well even with limited training data and computational resources. We tokenize the words into phonetic syllables and use a simple Long Short-term Memory (LSTM) network architecture to detect the language of transliterated texts. With intensive experiments, we show that the tokenization of transliterated words as phonetic syllables effectively represents their causal sound patterns. Phonetic syllable tokenization, therefore, makes it easier for even simpler model architectures to learn the characteristic patterns to identify any language.
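As a rough illustration of what tokenizing a transliterated word into syllable-like units can look like, here is a minimal Python sketch. It is not the paper's tokenizer: the abstract does not specify the segmentation rules, so the rule used here (a run of consonants followed by a run of vowels, with any trailing consonants attached to the last chunk) and the names `syllabify` and `VOWELS` are assumptions made purely for illustration.

```python
import re

# Assumption: a plain a/e/i/o/u vowel set; the paper's actual phonetic
# inventory is not given in the abstract.
VOWELS = "aeiou"

# One chunk = optional run of non-vowels followed by one or more vowels.
_SYLLABLE = re.compile(rf"[^{VOWELS}]*[{VOWELS}]+")


def syllabify(word):
    """Split a single lowercase-able word into rough syllable-like chunks.

    Hypothetical helper: a naive onset+vowel segmentation, not the
    method described in the paper.
    """
    word = word.lower()
    chunks = _SYLLABLE.findall(word)
    if not chunks:
        # No vowels at all: keep the word as one token (or nothing if empty).
        return [word] if word else []
    # Characters left over after the last vowel run are trailing consonants.
    tail = word[sum(len(c) for c in chunks):]
    if tail:
        chunks[-1] += tail
    return chunks


if __name__ == "__main__":
    for w in ["namaste", "dhanyavad", "shukriya"]:
        print(w, syllabify(w))
```

Under these assumptions, the resulting chunk sequence (rather than raw characters or whole words) would be what gets mapped to integer ids and fed to a sequence model such as an LSTM.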


Authors (1)
