MIMIC-IV-Ext-CEKG

The MIMIC-IV-Ext-CEKG dataset is a curated extension of the MIMIC-IV dataset, comprising 19 integrated tables. MIMIC-IV contains clinical data collected from 2008 to 2019 for patients admitted to intensive care units (ICUs) at Beth Israel Deaconess Medical Center in Boston, Massachusetts.

Key Characteristics of MIMIC-IV-Ext-CEKG

  • No preprocessing required for process mining, machine learning, or data mining tasks.
  • Object-centric dataset built using five distinct objects, suitable for object-centric and multi-dimensional process mining.
  • Time-series-compatible data structure, suited to time series analysis and machine learning applications.
  • Supports construction of Event Knowledge Graphs.
  • National Drug Codes (NDC) from the original dataset are corrected for improved accuracy.
  • Includes 95 distinct clinical activities, covering a wide range from lab tests and diagnoses to monitoring and discharge activities, each with rich features and values.
  • ICD-9 codes from the original dataset are corrected and mapped to ICD-10 for better interpretability and global compatibility.
  • Integrates SNOMED CT codes, which are not available in the original MIMIC-IV dataset, enabling the use of standardized terminology.
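
To illustrate what object-centric events and an Event Knowledge Graph mean in practice, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the field names (event_id, activity, timestamp, objects), the object types (patient, admission), and the sample values are hypothetical and are not taken from the MIMIC-IV-Ext-CEKG tables. The sketch derives directly-follows edges separately for each object, which is one common way to connect event nodes in an event knowledge graph.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical events; field names and values are illustrative only and
# do NOT reflect the actual MIMIC-IV-Ext-CEKG schema.
events = [
    {"event_id": "e1", "activity": "Admit", "timestamp": "2150-01-01T08:00:00",
     "objects": {"patient": "p1", "admission": "a1"}},
    {"event_id": "e2", "activity": "Lab test", "timestamp": "2150-01-01T09:00:00",
     "objects": {"patient": "p1", "admission": "a1"}},
    {"event_id": "e3", "activity": "Discharge", "timestamp": "2150-01-01T12:00:00",
     "objects": {"patient": "p1", "admission": "a1"}},
    {"event_id": "e4", "activity": "Admit", "timestamp": "2150-03-01T10:00:00",
     "objects": {"patient": "p1", "admission": "a2"}},
]


def build_df_edges(events):
    """Return directly-follows edges as (source, target, object_type, object_id)."""
    # Group events by every object they refer to; one event may belong
    # to several objects at once (the object-centric view).
    by_object = defaultdict(list)
    for ev in events:
        for obj_type, obj_id in ev["objects"].items():
            by_object[(obj_type, obj_id)].append(ev)

    edges = []
    for (obj_type, obj_id), evs in by_object.items():
        # Order each object's events by time, then link consecutive pairs.
        ordered = sorted(evs, key=lambda e: datetime.fromisoformat(e["timestamp"]))
        for prev, nxt in zip(ordered, ordered[1:]):
            edges.append((prev["event_id"], nxt["event_id"], obj_type, obj_id))
    return edges


edges = build_df_edges(events)
for edge in edges:
    print(edge)
```

In this toy example the patient object links all four events across both stays, while each admission object links only the events within that admission. A real graph would typically be stored in a graph database rather than a Python list; the list here only keeps the sketch self-contained.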