Interpretable Text Classification Using CNN and Max-pooling (1910.11236v1)

Published 14 Oct 2019 in cs.CL

Abstract: Deep neural networks have been widely used in text classification. However, it is hard to interpret neural models due to their complicated mechanisms. In this work, we study the interpretability of a variant of the typical text classification model, which is based on a convolutional operation and a max-pooling layer. Two mechanisms, convolution attribution and n-gram feature analysis, are proposed to analyze the processing procedure of the CNN model. The interpretability of the model is reflected by providing posterior interpretation for neural network predictions. In addition, a multi-sentence strategy is proposed to enable the model to be used in multi-sentence situations without loss of performance or interpretability. We evaluate the performance of the model on several classification tasks and justify its interpretability with case studies.

Citations (5)

Summary

  • The paper introduces convolution attribution to pinpoint influential text segments driving classification decisions.
  • It employs n-gram feature analysis to reveal the key input patterns that significantly affect the CNN’s predictions.
  • A multi-sentence strategy is proposed to maintain both high performance and interpretability for longer text inputs.

The paper "Interpretable Text Classification Using CNN and Max-pooling" addresses the challenge of interpretability in deep neural networks for text classification. While these models are effective, understanding their predictions can be difficult due to their complex mechanisms. This work explores interpretability within a specific model architecture that utilizes convolutional neural networks (CNNs) combined with a max-pooling layer.

Key Contributions:

  1. Convolution Attribution:
    • The authors introduce a method called convolution attribution, which aims to provide insights into which parts of the input text contribute most significantly to the output classification. This is achieved by analyzing the activation patterns within the convolutional layers, identifying critical regions that influence the model’s predictions.
  2. N-gram Feature Analysis:
    • This mechanism focuses on analyzing n-grams (contiguous sequences of n items from a given sample of text) to determine which features the CNN leverages for classification. By understanding which n-grams are most impactful, the model’s decision-making process becomes more transparent. A sketch after this list illustrates how this analysis and convolution attribution might work together.
  3. Multi-Sentence Strategy:
    • The paper proposes a strategy to extend the model’s applicability to multi-sentence inputs. This approach ensures that performance and interpretability are maintained even when dealing with longer texts. A separate sketch after this list shows one possible realization.
  4. Evaluation and Case Studies:
    • The effectiveness of the proposed mechanisms is validated through various text classification tasks. The authors provide case studies to demonstrate how their approach enhances interpretability. These studies offer concrete examples of how attributions and feature analyses can be used to dissect and understand CNN predictions.
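Because each pooled feature comes from exactly one window (the max-pool winner), a filter's contribution to the predicted class can be credited to that window, and repeated windows can be aggregated into n-gram scores. The function below is a hedged sketch of how convolution attribution and n-gram feature analysis might be combined for the classifier sketched earlier; the scoring rule (pooled activation times classifier weight) is an assumption in the spirit of the paper, not its exact formulation.

```python
import torch

def attribute_ngrams(model, token_ids, id2word):
    """Rank the n-grams in a batch by their contribution to the predicted class.

    `id2word` is an assumed mapping from token id to word string."""
    k = model.conv.kernel_size[0]
    model.eval()
    with torch.no_grad():
        logits, pooled, argmax = model(token_ids)       # see CNNTextClassifier above
        pred = logits.argmax(dim=1)                     # predicted class per example

        scores = {}
        for b in range(token_ids.size(0)):
            cls = pred[b].item()
            # Contribution of filter f to the predicted class: its pooled
            # activation times the corresponding classifier weight.
            contrib = pooled[b] * model.fc.weight[cls]  # (num_filters,)
            for f in range(contrib.size(0)):
                start = argmax[b, f].item()             # window that won the max-pool
                ngram = " ".join(id2word[t.item()]
                                 for t in token_ids[b, start:start + k])
                scores[ngram] = scores.get(ngram, 0.0) + contrib[f].item()

    # Highest-scoring n-grams are the input segments most responsible
    # for the model's decision.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Printing the top few entries for a test sentence yields the kind of posterior, per-prediction explanation the case studies describe.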
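The summary does not spell out the multi-sentence strategy itself. One plausible realization that keeps interpretability intact is to convolve each sentence separately and max-pool across sentences as well as positions, so every pooled feature still traces back to a single n-gram in a single sentence. The sketch below illustrates that assumption only; the helper name and padding scheme are illustrative.

```python
import torch

def classify_document(model, sentences, pad_id=0):
    """sentences: list of 1-D LongTensors of token ids, one per sentence."""
    k = model.conv.kernel_size[0]
    per_sentence = []
    with torch.no_grad():
        for sent in sentences:
            if sent.numel() < k:                        # pad sentences shorter than one window
                sent = torch.cat([sent, sent.new_full((k - sent.numel(),), pad_id)])
            x = model.embedding(sent.unsqueeze(0)).transpose(1, 2)
            fmap = torch.relu(model.conv(x))            # (1, num_filters, L)
            per_sentence.append(fmap.max(dim=2).values) # (1, num_filters)
        # Max over sentences keeps, per filter, the single strongest n-gram
        # anywhere in the document, so attribution stays at the n-gram level.
        pooled = torch.cat(per_sentence, dim=0).max(dim=0, keepdim=True).values
        return model.fc(pooled)                         # (1, num_classes)
```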

Results:

  • The authors report that their interpretability techniques help in better understanding the model's behavior without sacrificing performance. By providing posterior interpretation for predictions, they aim to make neural network models more transparent and trustworthy in practical applications.

This paper contributes significantly to the field by balancing the trade-off between performance and interpretability, a crucial consideration for deploying neural networks in real-world scenarios where explainability is important.