To validate a set of deep learning algorithms for automated detection of key findings from non-contrast head-CT scans: intracranial hemorrhage and its subtypes, calvarial fractures, midline shift, and mass effect.
We retrospectively collected a dataset containing 313,318 head-CT scans, of which a random subset (Qure25k dataset) was used to validate the algorithms and the rest to develop them. An additional dataset (CQ500 dataset) was collected from different centers to validate the algorithms. Patients with post-operative defects or age < 7 were excluded from all datasets. Three independent radiologists read each scan in the CQ500 dataset. The original clinical radiology report and the consensus of the readers were considered the gold standards for the Qure25k and CQ500 datasets, respectively. Areas under the receiver operating characteristic curves (AUCs) were used to evaluate the algorithms.
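As an illustration of the evaluation metric only, the following minimal sketch computes an AUC from binary gold-standard labels and algorithm confidence scores using the rank-based (Mann-Whitney) formulation. The function name `roc_auc` and the example labels and scores are hypothetical and are not taken from the study; the study's own evaluation code is not described here.

```python
def roc_auc(labels, scores):
    """Return the area under the ROC curve.

    Uses the Mann-Whitney formulation: the AUC equals the probability
    that a randomly chosen positive scan receives a higher score than a
    randomly chosen negative scan, with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = finding present per the gold standard, 0 = absent.
example_labels = [1, 0, 1, 0, 0]
example_scores = [0.9, 0.2, 0.4, 0.6, 0.1]
example_auc = roc_auc(example_labels, example_scores)
```

The pairwise loop is quadratic in the number of scans, which is acceptable for a sketch; a sort-based implementation would be preferable for datasets of the size reported here.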
After exclusion, the Qure25k dataset contained 21,095 scans (mean age 43; 43% female), while the CQ500 dataset consisted of 491 scans (mean age 48; 36% female). On the Qure25k dataset, the algorithms achieved an AUC of 0.92 for detecting intracranial hemorrhage (0.90 for intraparenchymal, 0.96 for intraventricular, 0.92 for subdural, 0.93 for extradural, and 0.90 for subarachnoid hemorrhage). On the CQ500 dataset, the AUC was 0.94 for intracranial hemorrhage (0.95, 0.93, 0.95, 0.97, and 0.96 for the same subtypes, respectively). AUCs on the Qure25k dataset were 0.92 for calvarial fractures, 0.93 for midline shift, and 0.86 for mass effect, while AUCs on the CQ500 dataset were 0.96, 0.97, and 0.92, respectively.
This study demonstrates that deep learning algorithms can identify head-CT scan abnormalities requiring urgent attention, achieving high AUCs on both validation datasets.