Papers
Topics
Authors
Recent
Search
2000 character limit reached

Acoustic Scene Classification Using Fusion of Attentive Convolutional Neural Networks for DCASE2019 Challenge

Published 13 Jul 2019 in eess.AS and cs.SD | (1907.07127v1)

Abstract: In this report, the Brno University of Technology (BUT) team submissions for Task 1 (Acoustic Scene Classification, ASC) of the DCASE-2019 challenge are described. Also, the analysis of different methods is provided. The proposed approach is a fusion of three different Convolutional Neural Network (CNN) topologies. The first one is a VGG like two-dimensional CNNs. The second one is again a two-dimensional CNN network which uses Max-Feature-Map activation and called Light-CNN (LCNN). The third network is a one-dimensional CNN which mainly used for speaker verification and called x-vector topology. All proposed networks use self-attention mechanism for statistic pooling. As a feature, we use a 256-dimensional log Mel-spectrogram. Our submissions are a fusion of several networks trained on 4-folds generated evaluation setup using different fusion strategies.

Citations (6)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.