Deep learning-based radiological image analysis could facilitate use of chest x-rays as triage tests for pulmonary tuberculosis in resource-limited settings. We sought to determine whether commercially available chest x-ray analysis software meets WHO recommendations for minimal sensitivity and specificity as a pulmonary tuberculosis triage test.
We recruited symptomatic adults at the Indus Hospital, Karachi, Pakistan. We compared two software packages, qXR version 2.0 (qXRv2) and CAD4TB version 6.0 (CAD4TBv6), against a reference standard of mycobacterial culture of two sputa. We assessed qXRv2 using its manufacturer-prespecified threshold score for classifying chest x-rays as tuberculosis present versus not present. For CAD4TBv6, we used a data-derived threshold, because it does not have a prespecified one. We tested for non-inferiority to preset WHO recommendations (0·90 for sensitivity, 0·70 for specificity) using a non-inferiority limit of 0·05. We identified factors associated with accuracy by stratification and logistic regression.
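As an illustration of the non-inferiority comparison described above, the following minimal sketch tests whether an observed proportion (sensitivity or specificity) is non-inferior to a target value minus a margin. It assumes a one-sided exact binomial test; the abstract does not state which test was actually used, and the function name `noninferiority_p_value` and the counts in the usage note are hypothetical, not study data.

```python
from math import comb


def noninferiority_p_value(successes, n, target, margin=0.05):
    """One-sided exact binomial p-value for non-inferiority (illustrative sketch).

    Null hypothesis: the true proportion is at or below (target - margin).
    A small p-value supports non-inferiority to the target.
    This is an assumed test, not necessarily the one used in the study.
    """
    p0 = target - margin  # non-inferiority boundary, e.g. 0.90 - 0.05 = 0.85
    # Upper-tail probability P(X >= successes) under Binomial(n, p0)
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )
```

For example, with hypothetical counts of 186 correctly flagged cases among 200 culture-positive participants, `noninferiority_p_value(186, 200, target=0.90)` tests sensitivity against the boundary of 0·85 and returns a p-value below 0·05. The same function applies to specificity with `target=0.70`.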
We included 2198 (92·7%) of 2370 enrolled participants. 2187 (99·5%) of 2198 were HIV-negative, and 272 (12·4%) had culture-confirmed pulmonary tuberculosis. For both software packages, accuracy was non-inferior to WHO-recommended minimum values (qXRv2 sensitivity 0·93 [95% CI 0·89–0·95], non-inferiority p=0·0002; CAD4TBv6 sensitivity 0·93 [95% CI 0·90–0·96], p<0·0001; qXRv2 specificity 0·75 [95% CI 0·73–0·77], p<0·0001; CAD4TBv6 specificity 0·69 [95% CI 0·67–0·71], p=0·0003). Sensitivity was lower in smear-negative pulmonary tuberculosis for both software packages, and in women for CAD4TBv6. Specificity was lower in men and in those with previous tuberculosis, and decreased with increasing age and decreasing body-mass index. Smoking and diabetes did not affect accuracy.
In an HIV-negative population, both software packages met WHO-recommended minimal accuracy for pulmonary tuberculosis triage tests. Sensitivity will be lower when smear-negative pulmonary tuberculosis is more prevalent.