Cross-site validation of lung cancer diagnosis by electronic nose with deep learning: a multicenter prospective study

Abstract/Objectives

This study investigates electronic nose (eNose) technology for diagnosing lung cancer and addresses cross-site validation, which had not previously been studied. Patients with lung cancer and healthy or diseased controls were recruited from two centers between 2019 and 2022. Deep learning models were developed on data from one site and tested on data from the other. The study enrolled 231 participants. Applied directly to the test cohort, the initial model performed poorly (AUC 0.61). Performance improved after data augmentation, with AUCs of 0.89 and 0.90 for the two methods tested, and improved further with fine-tuning, reaching an AUC of 0.95 for both. The results indicate that eNose breathprints can support lung cancer diagnosis across sites, offering a non-invasive and potentially generalizable option for detection. The study is not a clinical trial and was therefore not registered.
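AUC values such as those quoted above are usually reported with a confidence interval. The following is a minimal sketch of one common way to obtain both, assuming a rank-based AUC and a percentile bootstrap; the source does not say how its intervals were computed, and the function names, labels, and scores here are hypothetical toy values, not study data.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class, so AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    low = stats[int((alpha / 2) * n_boot)]
    high = stats[int((1 - alpha / 2) * n_boot) - 1]
    return low, high


# Toy example: 1 = lung cancer, 0 = control (made-up scores).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]
point = auc(labels, scores)
low, high = bootstrap_ci(labels, scores)
print(f"AUC {point:.2f} [{low:.2f}, {high:.2f}]")
```

In a cross-site setting, `labels` and `scores` would come from the external test cohort only, so that the interval reflects performance at the unseen site.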

Results/Contributions

Background: Although the electronic nose (eNose) has been intensively investigated for diagnosing lung cancer, cross-site validation remains a major obstacle, and no studies have yet addressed it.

Methods: Patients with lung cancer, as well as healthy control and diseased control groups, were prospectively recruited from two referral centers between 2019 and 2022. Deep learning models for detecting lung cancer from eNose breathprints were developed using a training cohort from one site and then tested on a cohort from the other site. Semi-Supervised Domain-Generalized (Semi-DG) Augmentation (SDA) and Noise-Shift Augmentation (NSA) methods, with or without fine-tuning, were applied to improve performance.

Results: In this study, 231 participants were enrolled, comprising a training/validation cohort of 168 individuals (90 with lung cancer, 16 healthy controls, and 62 diseased controls) and a test cohort of 63 individuals (28 with lung cancer, 10 healthy controls, and 25 diseased controls). The model achieved satisfactory results in the validation cohort from the same hospital, whereas directly applying the trained model to the test cohort yielded suboptimal results (AUC: 0.61, 95% CI: 0.47–0.76). Performance improved after applying data augmentation methods in the training cohort (SDA, AUC: 0.89 [0.81–0.97]; NSA, AUC: 0.90 [0.89–1.00]). Performance improved further after applying fine-tuning methods (SDA plus fine-tuning, AUC: 0.95 [0.89–1.00]; NSA plus fine-tuning, AUC: 0.95 [0.90–1.00]).

Conclusion: Our study revealed that deep learning models developed for eNose breathprints can achieve cross-site validation with data augmentation and fine-tuning. Accordingly, eNose breathprints emerge as a convenient, non-invasive, and potentially generalizable solution for lung cancer detection.

Clinical trial registration: This study is not a clinical trial and was therefore not registered.

© The Author(s) 2024.
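The abstract names Noise-Shift Augmentation but does not describe how it works. The sketch below is therefore only a generic illustration of the idea the name suggests, not the authors' method: each training breathprint is copied several times, and each copy receives a random per-sensor baseline offset plus small Gaussian noise. The data layout (a list of per-sensor time series), the function names, and all parameter values are assumptions.

```python
import random


def noise_shift_augment(breathprint, noise_sd=0.02, shift_sd=0.05, rng=None):
    """Return one perturbed copy of a breathprint.

    breathprint: list of sensor channels, each a list of readings (assumed layout).
    Each channel gets one random baseline offset plus independent noise per reading.
    """
    rng = rng or random.Random()
    augmented = []
    for channel in breathprint:
        offset = rng.gauss(0.0, shift_sd)  # hypothetical sensor baseline drift
        augmented.append([x + offset + rng.gauss(0.0, noise_sd) for x in channel])
    return augmented


def augment_cohort(samples, labels, copies=3, seed=0):
    """Keep the originals and append `copies` perturbed versions of each sample."""
    rng = random.Random(seed)
    out_x, out_y = list(samples), list(labels)
    for x, y in zip(samples, labels):
        for _ in range(copies):
            out_x.append(noise_shift_augment(x, rng=rng))
            out_y.append(y)  # augmentation does not change the diagnosis label
    return out_x, out_y


# Toy training set: two breathprints, two sensors, three readings each (made up).
train_x = [
    [[0.1, 0.2, 0.3], [0.5, 0.4, 0.3]],
    [[0.0, 0.1, 0.1], [0.9, 0.8, 0.7]],
]
train_y = [1, 0]
aug_x, aug_y = augment_cohort(train_x, train_y)
print(len(aug_x), "training samples after augmentation")
```

Augmentation of this kind is applied to the training cohort only; the test cohort from the other site stays untouched so that the cross-site evaluation remains meaningful.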

Keywords

eNose; lung cancer; diagnosis; cross-site validation; deep learning model; Semi-DG; data augmentation; performance; fine-tuning; healthy control; disease control; participants; training cohort; test cohort; clinical trial

Contact Information

鄭桂忠

kttang@ee.nthu.edu.tw