
Effectiveness of Text, Acoustic, and Lattice-based representations in Spoken Language Understanding tasks (2212.08489v2)

Published 16 Dec 2022 in cs.CL, cs.AI, cs.SD, and eess.AS

Abstract: In this paper, we perform an exhaustive evaluation of different representations to address the intent classification problem in a Spoken Language Understanding (SLU) setup. We benchmark three types of systems for the SLU intent detection task: 1) text-based, 2) lattice-based, and 3) a novel multimodal approach. Our work provides a comprehensive analysis of the achievable performance of different state-of-the-art SLU systems under different conditions, e.g., automatically vs. manually generated transcripts. We evaluate the systems on the publicly available SLURP spoken language resource corpus. Our results indicate that using richer forms of Automatic Speech Recognition (ASR) output, namely word-consensus-networks, allows the SLU system to improve over the 1-best setup (5.5% relative improvement). However, crossmodal approaches, i.e., learning from both acoustic and text embeddings, obtain performance close to the oracle setup, a 17.8% relative improvement over the 1-best configuration, making them a recommended alternative for overcoming the limitations of working with automatically generated transcripts.
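The crossmodal setup described in the abstract (learning from acoustic and text embeddings jointly) can be pictured with a minimal late-fusion sketch. This is an illustrative assumption, not the paper's actual architecture: the embedding dimensions, the concatenation-plus-MLP fusion, and the number of intent classes are placeholders chosen for the example.

```python
# Minimal sketch of a crossmodal (text + acoustic) intent classifier.
# Assumptions, not the paper's system: precomputed utterance-level embeddings,
# concatenation + MLP fusion, and a placeholder intent count.
import torch
import torch.nn as nn


class CrossmodalIntentClassifier(nn.Module):
    def __init__(self, text_dim=768, audio_dim=512, hidden_dim=256, num_intents=60):
        super().__init__()
        self.fusion = nn.Sequential(
            nn.Linear(text_dim + audio_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden_dim, num_intents),
        )

    def forward(self, text_emb, audio_emb):
        # Late fusion: concatenate utterance-level embeddings from both modalities.
        fused = torch.cat([text_emb, audio_emb], dim=-1)
        return self.fusion(fused)  # unnormalized intent logits


if __name__ == "__main__":
    model = CrossmodalIntentClassifier()
    text_emb = torch.randn(4, 768)   # e.g., pooled encoder outputs over an ASR transcript
    audio_emb = torch.randn(4, 512)  # e.g., pooled acoustic encoder outputs
    logits = model(text_emb, audio_emb)
    print(logits.shape)  # torch.Size([4, 60])
```

The idea behind such a fusion is that the acoustic branch can compensate for ASR errors in the automatically generated transcript, which is consistent with the abstract's finding that the crossmodal system approaches oracle (manual-transcript) performance.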

Authors (9)
  1. Srikanth Madikeri (19 papers)
  2. Juan Zuluaga-Gomez (27 papers)
  3. Bidisha Sharma (11 papers)
  4. Seyyed Saeed Sarfjoo (8 papers)
  5. Iuliia Nigmatulina (14 papers)
  6. Petr Motlicek (40 papers)
  7. Alexei V. Ivanov (1 paper)
  8. Aravind Ganapathiraju (13 papers)
  9. Esaú Villatoro-Tello (19 papers)
Citations (3)
