The majority of chest X-rays (CXRs) performed globally are normal, and radiologists spend significant time ruling out abnormality on these scans. We present a deep learning (DL) model trained specifically to classify CXRs as normal or abnormal, potentially reducing the time and cost associated with reporting normal studies.
A DL algorithm was developed by training on 1,150,084 CXRs and their corresponding reports. A retrospectively acquired independent test set of 430 CXRs (285 abnormal, 145 normal) was analyzed by the algorithm, which classified each CXR as normal or abnormal. Ground truth for the independent test set was established by a sub-specialist chest radiologist with 8 years’ experience, who reviewed every CXR image with reference to the existing report. Algorithm output was compared against the ground truth, and summary statistics were calculated.
The algorithm correctly classified 376 (87.44%) CXRs, with a sensitivity of 97.19% (95% CI: 94.54% to 98.78%) and a specificity of 68.28% (95% CI: 60.04% to 75.75%). There were 46 (10.70%) false positives and 8 (1.86%) false negatives (FNs). Of the 8 FNs, 3 were designated clinically insignificant (mild, inactive fibrosis) and 5 clinically significant (rib fractures, pneumothorax).
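The summary statistics follow directly from the reported counts, and the short sketch below recomputes them as a consistency check. The abstract does not state how the confidence intervals were obtained; the exact (Clopper-Pearson) binomial interval used here is an assumption, and the function names `binom_cdf` and `clopper_pearson` are illustrative only.

```python
from math import comb

# Counts reported in the abstract
n_abnormal, n_normal = 285, 145
false_negatives, false_positives = 8, 46

true_positives = n_abnormal - false_negatives  # 277
true_negatives = n_normal - false_positives    # 99
total = n_abnormal + n_normal                  # 430

sensitivity = true_positives / n_abnormal
specificity = true_negatives / n_normal
accuracy = (true_positives + true_negatives) / total


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial interval for k successes in n trials.

    Assumption: the abstract does not name its interval method.
    """

    def solve(f, target):
        # f decreases as p increases, so bisect on [0, 1] for f(p) = target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


sens_ci = clopper_pearson(true_positives, n_abnormal)
spec_ci = clopper_pearson(true_negatives, n_normal)

print(f"accuracy    {accuracy:.2%}")
print(f"sensitivity {sensitivity:.2%}  95% CI {sens_ci[0]:.2%} to {sens_ci[1]:.2%}")
print(f"specificity {specificity:.2%}  95% CI {spec_ci[0]:.2%} to {spec_ci[1]:.2%}")
```

The point estimates depend only on the four counts, so they can be checked regardless of which interval method the authors used; a different method (for example a Wilson score interval) would give slightly different bounds.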
High-sensitivity DL algorithms can potentially be deployed for the primary read of CXRs, enabling radiologists to spend appropriate time on abnormal cases and thereby saving time and the cost of reporting CXRs, especially in non-emergency situations. More in-depth prospective trials are required to ascertain the overall impact of such algorithms.