- The paper introduces a novel classifier fusion method using Dempster-Shafer theory to adaptively minimize mean square error.
- It employs gradient descent to calibrate evidence assignments by integrating complete output vectors for more precise predictions.
- Evaluated on texture, speech, and speaker recognition tasks, the method improved error reduction rates by 2-7% over conventional techniques.
A Novel Approach for Classifier Combination Utilizing the Dempster-Shafer Theory of Evidence
The paper "A New Technique for Combining Multiple Classifiers using The Dempster-Shafer Theory of Evidence" by Ahmed Al-Ani and Mohamed Deriche introduces a methodological advance in pattern recognition. Rooted in the Dempster-Shafer (D-S) theory, the approach combines multiple classifiers to improve classification accuracy. It proposes an adaptive framework that minimizes the overall mean square error (MSE), addressing the limitations of existing evidence estimation techniques used with the D-S theory.
Overview of Dempster-Shafer Theory
The D-S theory provides a robust mathematical framework for reasoning under uncertainty, distinguishing it from other statistical approaches such as Bayesian inference. A major advantage of the theory is its ability to represent both ignorance and uncertainty without forcing precise probabilistic estimates. In this paper, the authors leverage the transferable belief model from the D-S theory to build a mechanism that efficiently combines evidence from several classifiers.
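At the core of this evidence-combination machinery is Dempster's rule, which fuses two basic belief assignments (mass functions) by multiplying masses on intersecting hypotheses and renormalizing away the conflicting mass. The sketch below illustrates the generic rule for two classifiers over a two-class frame; it is not the paper's specific evidence-estimation scheme, and the mass values are invented for illustration.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two basic belief assignments with Dempster's rule.
    Masses are dicts mapping frozenset hypotheses to mass values."""
    combined = {}
    conflict = 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb  # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("Total conflict: sources are incompatible")
    k = 1.0 - conflict  # normalization constant
    return {h: m / k for h, m in combined.items()}

# Two classifiers expressing belief over classes {A, B};
# mass on the full set {A, B} represents ignorance.
A, B, AB = frozenset("A"), frozenset("B"), frozenset("AB")
m1 = {A: 0.6, B: 0.1, AB: 0.3}
m2 = {A: 0.5, B: 0.2, AB: 0.3}
fused = dempster_combine(m1, m2)
```

Because both sources lean toward class A, the fused assignment concentrates belief on A while the residual mass on the full set {A, B} shrinks, reflecting reduced ignorance after combination.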
Existing Methods and Their Challenges
Prior approaches, such as the weighted sum rule or Rogova's application of D-S theory, exhibit notable drawbacks. They often rely on an arbitrary choice of parameters, or on secondary measures such as proximity functions for evidence estimation, neither of which consistently yields optimal performance. These limitations motivate an evidence estimation method that adapts to the training data and delivers scalable gains in classification accuracy without introducing such approximation errors.
Proposed Technique
Central to the authors' method is an iterative optimization procedure that minimizes the MSE between the combined classifier output and the target output over a training set. Using gradient descent, the technique adaptively calibrates the evidence assignments of each classifier. Unlike other adaptive methods, this approach considers the complete output vector of each classifier rather than a single scalar confidence value, which markedly improves the precision of the combined output and makes fuller use of the information each classifier provides.
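The optimization step can be sketched as follows. This is a simplified stand-in, not the paper's exact formulation: plain per-classifier combination weights replace the D-S evidence assignments, but the core idea is the same, namely descending the gradient of the MSE between the combined output vectors and one-hot targets.

```python
import numpy as np

def train_combiner(outputs, targets, lr=0.1, epochs=500):
    """Tune per-classifier weights by gradient descent so that the
    combined output vector minimizes MSE against one-hot targets.
    outputs: (n_classifiers, n_samples, n_classes); targets: (n_samples, n_classes).
    Hypothetical sketch; the paper optimizes D-S evidence assignments instead."""
    n_clf = outputs.shape[0]
    w = np.full(n_clf, 1.0 / n_clf)  # start from uniform weights
    for _ in range(epochs):
        combined = np.tensordot(w, outputs, axes=1)  # (n_samples, n_classes)
        err = combined - targets
        # dMSE/dw_k = 2 * mean over samples and classes of err * output_k
        grad = 2 * np.einsum('sc,ksc->k', err, outputs) / err.size
        w -= lr * grad
    return w

# Synthetic usage: one accurate and one noisy classifier over 3 classes.
rng = np.random.default_rng(0)
targets = np.eye(3)[rng.integers(0, 3, 40)]          # one-hot targets
good = targets + 0.05 * rng.normal(size=targets.shape)
bad = targets + 0.60 * rng.normal(size=targets.shape)
outputs = np.stack([good, bad])
w = train_combiner(outputs, targets)
```

Because the loss is computed over the full output vectors, the learned weights reflect how each classifier behaves across all classes at once, rather than only on its top-scoring class; here the accurate classifier ends up with the larger weight.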
Performance Evaluation
The evaluation covered three domains: texture recognition, speech segment classification, and speaker identification. Across all three, the proposed method outperformed conventional and alternative evidence combination methods, improving the error reduction rate (ERR) by 2-7% over baseline methods, with the gap widening in scenarios with more complex class arrangements or more diverse classifier ensembles.
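For concreteness, the error reduction rate is commonly defined as the relative drop in error versus a baseline such as the best single classifier; the exact formula used in the paper is assumed here.

```python
def error_reduction_rate(e_base, e_comb):
    """Relative reduction in classification error versus a baseline.
    One common definition; assumed, not taken from the paper."""
    return (e_base - e_comb) / e_base

# Example: baseline error 10%, combined error 9.5% gives an ERR of 5%.
err = error_reduction_rate(0.10, 0.095)
```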
Implications and Future Work
Beyond its immediate practical value for automated classification, this research advances the theoretical discussion of evidence combination and uncertainty management. Although training is computationally intensive, owing to the thorough parameter tuning over the entire output space, real-time classification remains resource-efficient because training is performed offline.
Future developments could aim at refining the optimization algorithms to further reduce computational complexity, thereby extending applicability to larger datasets or real-time adaptive systems. Additionally, expanding the framework to incorporate dynamic weight recalibration might enhance real-time accuracy capabilities for adaptive learning systems.
In conclusion, this paper demonstrates the effectiveness of the D-S theory in classifier combination tasks, setting a potential benchmark for future work on classifier performance. This methodological advance marks a promising avenue for broadening both the scope and the efficacy of pattern recognition systems.