A Novel Data-Driven Framework for Risk Characterization and Prediction from Electronic Medical Records: A Case Study of Renal Failure (1711.11022v1)

Published 29 Nov 2017 in cs.LG and stat.AP

Abstract: Electronic medical records (EMR) contain longitudinal information about patients that can be used to analyze outcomes. Typically, studies on EMR data have worked with established variables that are already acknowledged to be associated with certain outcomes. However, EMR data may also contain hitherto unrecognized factors for risk association and prediction of outcomes for a disease. In this paper, we present a scalable data-driven framework that analyzes an EMR data corpus in a disease-agnostic way and systematically uncovers important factors influencing outcomes in patients, as supported by the data and without expert guidance. We validate the importance of such factors by using the framework to predict the relevant outcomes. Specifically, we analyze EMR data covering approximately 47 million unique patients to characterize renal failure (RF) among type 2 diabetic (T2DM) patients. We propose a specialized L1 regularized Cox Proportional Hazards (CoxPH) survival model to identify the important factors among those available in the patient encounter history. To validate the identified factors, we use a specialized generalized linear model (GLM) to predict the probability of renal failure for individual patients within a specified time window. Our experiments indicate that the factors identified via our data-driven method overlap with the patient characteristics recognized by experts. Our approach allows for scalable, repeatable and efficient utilization of the data available in EMRs, confirms prior medical knowledge, and can generate new hypotheses without expert supervision.
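
To make the factor-selection idea concrete, the following is a minimal sketch of a generic L1 regularized CoxPH fit, not the specialized model the paper proposes (the abstract does not describe its details). It assumes right-censored data with Breslow-style risk sets and uses plain proximal gradient descent; the function names, the penalty weight `lam`, the step size, and the iteration count are all illustrative choices rather than anything taken from the paper. The L1 penalty drives some coefficients to exactly zero, and the factors whose coefficients remain nonzero are the ones a procedure of this kind would flag as important.

```python
import numpy as np

def cox_neg_log_partial_likelihood(beta, X, time, event):
    """Mean negative Cox partial log-likelihood (Breslow-style risk sets)."""
    eta = X @ beta
    # at_risk[i, j] is True when subject j is still under observation at time[i]
    at_risk = time[None, :] >= time[:, None]
    m = eta.max()  # shift for numerical stability of the log-sum-exp
    log_risk = m + np.log((at_risk * np.exp(eta - m)[None, :]).sum(axis=1))
    return -np.sum(event * (eta - log_risk)) / len(time)

def cox_gradient(beta, X, time, event):
    """Gradient of the mean negative partial log-likelihood w.r.t. beta."""
    eta = X @ beta
    at_risk = time[None, :] >= time[:, None]
    w = at_risk * np.exp(eta - eta.max())[None, :]
    w = w / w.sum(axis=1, keepdims=True)  # row i: weights over risk set i
    expected_x = w @ X                    # risk-set weighted mean covariates
    return -(event[:, None] * (X - expected_x)).sum(axis=0) / len(time)

def soft_threshold(v, thresh):
    """Proximal operator of the L1 norm: shrink toward zero by thresh."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)

def fit_l1_cox(X, time, event, lam=0.05, step=0.1, n_iter=1000):
    """Proximal gradient descent on partial likelihood + lam * ||beta||_1.

    lam, step and n_iter are assumed defaults for illustration only.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = cox_gradient(beta, X, time, event)
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta
```

Here `X` is an `(n, p)` array of (ideally standardized) covariates, `time` holds follow-up times, and `event` is 1 where the outcome was observed and 0 where the record was censored. The dense risk-set matrix costs O(n²) memory, so this form suits only small examples; a corpus at the scale described in the abstract would need a sorted, cumulative-sum formulation or a distributed solver.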
