
All-for-One and One-For-All: Deep learning-based feature fusion for Synthetic Speech Detection (2307.15555v1)

Published 28 Jul 2023 in cs.SD, cs.CL, cs.CR, and eess.AS

Abstract: Recent advances in deep learning and computer vision have made the synthesis and counterfeiting of multimedia content more accessible than ever, creating potential threats from malicious users. In the audio field, speech deepfake generation techniques are proliferating, which calls for synthetic speech detection algorithms to counter malicious uses such as fraud or identity theft. In this paper, we consider three different feature sets proposed in the literature for the synthetic speech detection task and present a model that fuses them, achieving better overall performance than state-of-the-art solutions. The system was tested on different scenarios and datasets to assess its robustness to anti-forensic attacks and its generalization capabilities.

Authors (4)
  1. Daniele Mari (7 papers)
  2. Davide Salvi (15 papers)
  3. Paolo Bestagini (61 papers)
  4. Simone Milani (19 papers)
Citations (3)
