
A Continuous Space Neural Language Model for Bengali Language (2001.05315v1)

Published 11 Jan 2020 in cs.CL and cs.LG

Abstract: Language models are generally employed to estimate the probability distribution of various linguistic units, making them one of the fundamental parts of natural language processing. Applications of language models include a wide spectrum of tasks such as text summarization, translation and classification. For a low-resource language like Bengali, research in this area has so far been narrow, with only some traditional count-based models being proposed. This paper attempts to address the issue and proposes a continuous-space neural language model, or more specifically an ASGD weight-dropped LSTM language model, along with techniques to train it efficiently for the Bengali language. A performance comparison with some existing count-based models shows that the proposed architecture outperforms its counterparts, achieving an inference perplexity as low as 51.2 on the held-out data set for Bengali.
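
For readers unfamiliar with the metric, the perplexity quoted in the abstract is the exponential of the average negative log-probability a model assigns to each held-out token, so lower is better. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `perplexity` and the toy probabilities are assumptions made for illustration.

```python
import math


def perplexity(token_probs):
    """Return exp of the mean negative log-probability per token.

    `token_probs` is a hypothetical list holding the probability the
    model assigned to each actual token in a held-out sequence.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


# Toy check: a model that spreads its mass uniformly over 50 candidate
# tokens at every step has a perplexity of about 50.
uniform_probs = [1 / 50] * 10
print(round(perplexity(uniform_probs), 6))
```

Under this reading, a perplexity of 51.2 means the model is, on average, about as uncertain as if it were choosing uniformly among roughly 51 tokens at each position.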

Authors (5)
  1. Hemayet Ahmed Chowdhury (4 papers)
  2. Md. Azizul Haque Imon (1 paper)
  3. Anisur Rahman (5 papers)
  4. Aisha Khatun (9 papers)
  5. Md. Saiful Islam (57 papers)
Citations (2)
