Learning Confidence Bounds for Classification with Imbalanced Data (2407.11878v2)

Published 16 Jul 2024 in cs.LG

Abstract: Class imbalance poses a significant challenge in classification tasks, where traditional approaches often lead to biased models and unreliable predictions. Undersampling and oversampling techniques are commonly employed to address this issue, yet both suffer from inherent limitations stemming from their simplistic approach: undersampling loses information, and oversampling introduces additional biases. In this paper, we propose a novel framework that leverages learning theory and concentration inequalities to overcome the shortcomings of traditional solutions. We focus on understanding the uncertainty in a class-dependent manner, as captured by confidence bounds that we directly embed into the learning process. By incorporating class-dependent estimates, our method can effectively adapt to the varying degrees of imbalance across different classes, resulting in more robust and reliable classification outcomes. We empirically show how our framework provides a promising direction for handling imbalanced data in classification tasks, offering practitioners a valuable tool for building more accurate and trustworthy models.
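
To make the idea of class-dependent confidence bounds concrete, the following is a minimal sketch and not the paper's method. It assumes a standard two-sided Hoeffding inequality for a quantity bounded in [0, 1] and computes one confidence radius per class from that class's sample count. The function names `hoeffding_radius` and `class_radii` and the 950/50 label split are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def hoeffding_radius(n, delta=0.05):
    """Two-sided Hoeffding radius for the mean of n i.i.d. samples in [0, 1].

    With probability at least 1 - delta, the empirical mean lies within
    this radius of the true mean. (Illustrative assumption: the paper may
    use a different concentration inequality.)
    """
    if n <= 0:
        return float("inf")  # no samples, no guarantee
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def class_radii(labels, delta=0.05):
    """Compute one confidence radius per class from its sample count."""
    counts = Counter(labels)
    return {c: hoeffding_radius(n, delta) for c, n in counts.items()}


# Hypothetical imbalanced dataset: 950 majority vs. 50 minority samples.
labels = [0] * 950 + [1] * 50
radii = class_radii(labels)
for c in sorted(radii):
    print(f"class {c}: radius = {radii[c]:.4f}")
```

Under these assumptions the minority class receives a wider radius than the majority class, which is the kind of class-dependent uncertainty the abstract describes embedding into the learning process.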

Authors
  1. Matt Clifford
  2. Jonathan Erskine
  3. Alexander Hepburn
  4. Dario Garcia-Garcia
  5. Raúl Santos-Rodríguez
