Lung Cancer Subtype Diagnosis by Fusing Image Genomics Data and Hybrid Deep Networks

Lung Cancer Subtype Diagnosis by Fusing Image Genomics Data and Hybrid Deep Networks

Abstract:

Accurate diagnosis of cancer subtypes is crucial for precise treatment, because different cancer subtypes are involved with different pathology and require different therapies. Although deep learning techniques have made great success in computer vision and other fields, they do not work well on Lung cancer subtype diagnosis, due to the distinction of slide images between different cancer subtypes is ambiguous. Furthermore, they often over-fit to high-dimensional genomics data with limited samples, and do not fuse the image and genomics data in a sensible way. In this paper, we propose a hybrid deep network based approach LungDIG for Lung cancer subtype D iagnosis by fusing I mage- G enomics data. LungDIG first tiles the tissue slide image into small patches and extracts the patch-level features by fine-tuning an Inception-V3 model. Since the patches may contain some false positives in non-diagnostic regions, it further designs a patch-level feature combination strategy to integrate the extracted patch features and maintain the diversity between different cancer subtypes. At the same time, it extracts the genomics features from Copy Number Variation data by an attention based nonlinear extractor. Next, it fuses the image and genomics features by an attention based multilayer perceptron (MLP) to diagnose cancer subtype. Experiments on TCGA lung cancer data show that LungDIG can not only achieve higher accuracy for cancer subtype diagnosis than state-of-the-art methods, but also have a high authenticity and good interpretability.