Chest x-rays are widely used to identify pulmonary consolidation because they are highly accessible, inexpensive, and sensitive. Automating this diagnosis from chest x-rays can reduce diagnostic delay, especially in resource-limited settings.
An anonymised dataset of 423,218 chest x-rays with corresponding reports (collected from 166 centres across India, spanning 22 x-ray machine variants from 9 manufacturers) is used for training and validation. X-rays with consolidation are identified from their reports using natural language processing techniques. Images are preprocessed to a standard size and normalised to remove source dependency. Deep residual neural networks are trained on these images. Multiple models are trained on various selected subsets of the dataset, along with one model trained on the entire dataset. The scores yielded by these models are passed through a 2-layer neural network to generate the final probability that consolidation is present in an x-ray.
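The final combining step can be illustrated with a minimal sketch. The text states only that the per-model scores pass through a 2-layer neural network; the ReLU hidden layer, the sigmoid output, the hidden-layer width, and the names `init_combiner` and `combine_scores` are all assumptions made for illustration. The weights here are random placeholders that would need to be fitted on labelled data before the output meant anything.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def init_combiner(n_models, n_hidden, seed=0):
    """Create placeholder (untrained) weights for a 2-layer combiner.

    Hypothetical helper: layer sizes and initialisation are assumptions,
    not details taken from the source.
    """
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_models)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def combine_scores(scores, params):
    """Map one x-ray's per-model scores to a single probability."""
    if len(scores) != len(params["w1"][0]):
        raise ValueError("expected one score per base model")
    # Layer 1: affine transform followed by ReLU (assumed activation).
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, scores)) + b)
        for row, b in zip(params["w1"], params["b1"])
    ]
    # Layer 2: affine transform followed by a sigmoid, giving a probability.
    logit = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    return sigmoid(logit)


# Made-up scores from four hypothetical base models for a single x-ray.
params = init_combiner(n_models=4, n_hidden=3, seed=1)
probability = combine_scores([0.9, 0.8, 0.2, 0.7], params)
```

A learned combiner of this kind lets the final probability weight each base model differently, rather than averaging their scores uniformly.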
The model is validated and tested on a test dataset that is uniformly sampled from the parent dataset without any exclusion criteria. Sensitivity and specificity for the consolidation tag were 0.81 and 0.80, respectively, and the area under the receiver operating characteristic curve (AUC-ROC) was 0.88.
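For reference, the three reported metrics can be computed from binary labels and predicted probabilities as in the sketch below. The function names, the 0.5 threshold, and the toy data are illustrative assumptions; the operating threshold behind the reported sensitivity and specificity is not stated in the text. AUC-ROC is computed here through its pairwise-ranking interpretation: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half.

```python
def sensitivity_specificity(labels, probs, threshold=0.5):
    """Return (sensitivity, specificity) at a given probability threshold."""
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probs):
        predicted_positive = p >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(labels, probs):
    """AUC-ROC as the probability that a positive outranks a negative."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data, made up purely for illustration (1 = consolidation present).
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
sens, spec = sensitivity_specificity(labels, probs)
auc = auc_roc(labels, probs)
```

The pairwise loop is quadratic in the number of samples, which is fine for a demonstration; a rank-based computation would be preferable for a test set of realistic size.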
Deep learning can be used to diagnose pulmonary consolidation in chest x-rays using models trained on a generalised dataset that includes samples from multiple demographics. This model performs better than a model trained on a controlled dataset and is suited to real-world settings where x-ray quality may not be consistent.