MIMIC-IV-Ext-CEKG
The MIMIC-IV-Ext-CEKG dataset is a curated extension of the MIMIC-IV dataset, comprising 19 integrated tables. MIMIC-IV contains clinical data collected from 2008 to 2019 for patients admitted to intensive care units (ICUs) at Beth Israel Deaconess Medical Center in Boston, Massachusetts.
Key Characteristics of MIMIC-IV-Ext-CEKG
- No preprocessing required for process mining, machine learning, or data mining tasks.
- Object-centric dataset built using five distinct objects, suitable for object-centric and multi-dimensional process mining.
- Time-series-compatible data structure, suited to time series analysis and machine learning applications.
- Supports construction of Event Knowledge Graphs.
- National Drug Codes (NDC) from the original dataset are corrected for improved accuracy.
- Includes 95 distinct clinical activities, covering a wide range from lab tests and diagnoses to monitoring and discharge activities, each with rich features and values.
- ICD-9 codes from the original dataset are corrected and mapped to ICD-10 for better interpretability and global compatibility.
- Integration of SNOMED CT codes, which are not available in the original MIMIC dataset, enabling standardized terminology use.
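
To illustrate what object-centric events and an Event Knowledge Graph mean in practice, here is a minimal Python sketch. It does not use the dataset's actual schema: the event records, field names (`id`, `activity`, `time`, `objects`), and object types (`patient`, `admission`, `prescription`) are all hypothetical and invented for illustration. It assumes a common Event Knowledge Graph construction in which each event is correlated with the objects it refers to, and a directly-follows edge links consecutive events of the same object.

```python
from collections import defaultdict
from datetime import datetime

# Toy events, invented for illustration; these are NOT records from the dataset,
# and the field names and object types are assumptions.
EVENTS = [
    {"id": "e1", "activity": "Admit", "time": "2020-01-01T08:00",
     "objects": [("patient", "p1"), ("admission", "a1")]},
    {"id": "e2", "activity": "Lab test", "time": "2020-01-01T09:30",
     "objects": [("patient", "p1"), ("admission", "a1")]},
    {"id": "e3", "activity": "Prescribe", "time": "2020-01-01T10:00",
     "objects": [("admission", "a1"), ("prescription", "rx1")]},
    {"id": "e4", "activity": "Discharge", "time": "2020-01-02T12:00",
     "objects": [("patient", "p1"), ("admission", "a1")]},
]


def build_ekg(events):
    """Return (corr_edges, df_edges) for a list of event dicts.

    corr_edges: (event_id, object) pairs linking each event to its objects.
    df_edges:   (event_id, next_event_id, object) triples linking consecutive
                events of the same object, ordered by timestamp.
    """
    corr_edges = []
    events_per_object = defaultdict(list)
    for event in events:
        timestamp = datetime.fromisoformat(event["time"])
        for obj in event["objects"]:
            corr_edges.append((event["id"], obj))
            events_per_object[obj].append((timestamp, event["id"]))

    df_edges = []
    for obj, timeline in events_per_object.items():
        timeline.sort()  # order by timestamp, then by event id
        for (_, current), (_, following) in zip(timeline, timeline[1:]):
            df_edges.append((current, following, obj))
    return corr_edges, df_edges


corr_edges, df_edges = build_ekg(EVENTS)
```

In this toy example the prescription object has a single event and therefore no directly-follows edge, while the patient and the admission each get their own event sequence. Keeping one sequence per object is what makes the graph object-centric. The resulting edge lists could then be loaded into a graph database or a graph library of your choice.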