Predicting septic shock outcomes in a database with missing data using fuzzy modeling: Influence of pre-processing techniques on real-world data-based classification
Real-world databases often contain missing data, and existing correction algorithms deliver varying performance. Moreover, most modeling techniques cannot handle missing values automatically. In this study, we examine different approaches to predicting septic shock in the presence of missing data. Common pre-processing techniques for managing missing data either discard data or replace it with information that, by design, introduces bias. We show that predictive performance improves by employing a minimal pre-processing technique, the Zero-Order-Hold (ZOH) method, by applying Fuzzy C-Means clustering based on the partial distance calculation strategy (FCM-PDS), and by computing the final classification over the samples from each patient. Performance improvements persist when up to approximately 60% of the data is missing; at higher percentages, the classification performance is still statistically improved. We further validate this approach through comparisons with previous studies.
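To make the two named techniques concrete, the following is a minimal sketch rather than the implementation used in the study. It assumes that missing values are encoded as NaN, that ZOH carries the last observed value of a variable forward in time, and that the partial distance is the squared Euclidean distance over the observed features, rescaled by the ratio of total to observed features. The function names `zero_order_hold` and `partial_distance_sq` are hypothetical.

```python
import numpy as np


def zero_order_hold(series):
    """Fill each NaN with the last observed value (assumed ZOH behavior).

    Leading NaNs have no earlier observation and are left as NaN.
    """
    filled = np.array(series, dtype=float)
    for i in range(1, len(filled)):
        if np.isnan(filled[i]):
            filled[i] = filled[i - 1]
    return filled


def partial_distance_sq(x, center):
    """Squared partial distance between a sample and a cluster center.

    Only features observed in x contribute; the sum is rescaled by
    (total features / observed features) so that samples with different
    amounts of missing data remain comparable.
    """
    x = np.asarray(x, dtype=float)
    center = np.asarray(center, dtype=float)
    observed = ~np.isnan(x)
    n_observed = observed.sum()
    if n_observed == 0:
        raise ValueError("sample has no observed features")
    diff = x[observed] - center[observed]
    return (x.size / n_observed) * np.sum(diff ** 2)
```

In a Fuzzy C-Means loop, a distance of this kind would stand in for the usual Euclidean distance when memberships are updated, so that incomplete samples can be clustered without imputation.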