Abstract:
With advances in deep learning, including Convolutional Neural Networks (CNNs), automated diagnosis from medical images has received considerable attention in medical science. In ultrasound imaging in particular, a CNN learns the features of organs from a large amount of image data, so that expert-level automatic diagnosis becomes possible using only images of actual patients. However, CNN models also learn features that reflect the inherent bias of the imaging machine used for image acquisition. In other words, when the domain of the training data differs from that of the data encountered in actual diagnosis, it is unclear whether performance remains consistent in the presence of domain bias. We therefore investigate the effect of domain bias on a model using liver ultrasound imaging data obtained from multiple domains. We constructed a dataset that accounts for the manufacturer and year of manufacture of 8 ultrasound imaging machines. First, we trained and tested the model by splitting the entire dataset, as is commonly done. Second, we trained models on training sets constructed according to the number of domains they include, and then measured and compared performance on internal-domain and external-domain data. Through these experiments, we analyzed the effect of data domains on model performance. We show that performance scores evaluated on internal-domain data and on external-domain data do not match. In particular, performance measured on evaluation data that includes the internal domain was much higher than performance measured on evaluation data consisting of the external domain. We also show that 3-level classification performs slightly better than 5-level classification, because merging similar classes mitigates class imbalance.
These results highlight the need for a new methodology that mitigates the machine bias problem so that a model works correctly even on external-domain data, in contrast to the usual approach, in which evaluation data are constructed from the same domain as the training data.
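
The internal versus external evaluation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_by_domain`, the machine identifiers, the 20% internal test fraction, and the choice of which machines to hold out are all hypothetical. It also splits at the image level for simplicity, whereas a real study would likely need to split by patient.

```python
import random
from collections import defaultdict


def split_by_domain(samples, external_domains, internal_test_fraction=0.2, seed=0):
    """Split samples into train, internal-test, and external-test sets.

    samples: list of (sample_id, domain) pairs, where `domain` is a
        hypothetical identifier for the imaging machine.
    external_domains: domains held out entirely from training.
    """
    rng = random.Random(seed)

    # Group sample ids by the machine (domain) that produced them.
    by_domain = defaultdict(list)
    for sample_id, domain in samples:
        by_domain[domain].append(sample_id)

    train, internal_test, external_test = [], [], []
    for domain in sorted(by_domain):
        ids = list(by_domain[domain])
        if domain in external_domains:
            # External domains never contribute to training.
            external_test.extend(ids)
            continue
        # Internal domains are shuffled and split into train / internal test.
        rng.shuffle(ids)
        n_test = max(1, round(len(ids) * internal_test_fraction))
        internal_test.extend(ids[:n_test])
        train.extend(ids[n_test:])
    return train, internal_test, external_test


# Toy data: 8 hypothetical machines with 10 images each.
samples = [(f"img{m}_{i}", f"machine{m}") for m in range(8) for i in range(10)]
train, internal_test, external_test = split_by_domain(
    samples, external_domains={"machine6", "machine7"}
)
print(len(train), len(internal_test), len(external_test))
```

Evaluating the same trained model separately on `internal_test` and `external_test` gives the two scores whose gap the abstract reports. Holding out whole machines, rather than random images, is what makes the second score a measure of performance on an unseen domain.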