Emergent Mind

Abstract

In chest X-ray (CXR) image analysis, rule-based systems are usually employed to extract labels from reports, but concerns exist about label quality. These datasets typically offer only presence labels, sometimes with binary uncertainty indicators, which limits their usefulness. In this work, we present MAPLEZ (Medical report Annotations with Privacy-preserving Large language model using Expeditious Zero shot answers), a novel approach leveraging a locally executable Large Language Model (LLM) to extract and enhance findings labels on CXR reports. MAPLEZ extracts not only binary labels indicating the presence or absence of a finding but also the location, severity, and radiologists' uncertainty about the finding. Over eight abnormalities from five test sets, we show that our method can extract these annotations with an increase of 5 percentage points (pp) in F1 score for categorical presence annotations and more than 30 pp increase in F1 score for the location annotations over competing labelers. Additionally, using these improved annotations in classification supervision, we demonstrate substantial advancements in model quality, with an increase of 1.7 pp in AUROC over models trained with annotations from the state-of-the-art approach. We share code and annotations.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.