An Efficient Association Rule Mining From Distributed Medical Databases for Predicting Heart Diseases

Abstract:

Electronic Health Records (EHRs) are aggregated, combined and analyzed to support suitable treatment planning and safe therapeutic procedures for patients. Integrated EHRs facilitate the examination, diagnosis and treatment of diseases. However, existing EHR models are centralized, and several obstacles limit the proliferation of centralized EHRs, such as data size, privacy and data ownership considerations. In this paper, we propose a novel methodology and algorithm for mining distributed medical data sources at different sites (hospitals and clinics) using association rules. These medical data resources cannot be moved to other network sites. Therefore, the desired global computation must be decomposed into local computations that match the distribution of data across the network. The capability to decompose computations must be general enough to handle different distributions of data and different participating nodes in each instance of the global computation. In the proposed methodology, each distributed data source is represented by an agent. The global association rule computation is then performed by the agents, either by exchanging minimal summaries with other agents or by travelling to all the sites and performing the tasks that can be done locally at each site. The objective is to perform global tasks with a minimum of communication or travel by participating agents across the network, which preserves the privacy and security of the local data. The proposed association rule mining methodology is applied to heart disease prediction using real heart disease data. These data reside at different clinics and cannot be moved to a central site. The proposed model protects patient data privacy and achieves the same results as if the data were moved and joined at a central site. We also validate the association rules extracted from all the data providers using an independent test dataset.
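
The decomposition of a global association rule computation into local computations can be illustrated with a minimal Python sketch. This is not the paper's algorithm. It only shows the general idea, assuming each site shares itemset counts and its record count rather than raw records. The function names (`local_summary`, `merge_summaries`, `rule_stats`), the item labels and the toy records are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_summary(records, max_size=2):
    """Run at one site: count every itemset of up to max_size items.

    Only the counts and the number of records leave the site.
    """
    counts = Counter()
    for record in records:
        items = sorted(record)
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts, len(records)


def merge_summaries(summaries):
    """Combine the per-site summaries by adding counts and record totals."""
    total_counts = Counter()
    total_n = 0
    for counts, n in summaries:
        total_counts.update(counts)  # Counter.update adds counts
        total_n += n
    return total_counts, total_n


def rule_stats(counts, n, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    a = tuple(sorted(antecedent))
    both = tuple(sorted(set(antecedent) | set(consequent)))
    support = counts[both] / n
    confidence = counts[both] / counts[a] if counts[a] else 0.0
    return support, confidence


# Hypothetical toy records held at two clinics (not real patient data).
clinic_a = [
    {"high_bp", "smoker", "heart_disease"},
    {"high_bp", "heart_disease"},
    {"smoker"},
]
clinic_b = [
    {"high_bp", "smoker", "heart_disease"},
    {"high_bp"},
    {"smoker", "heart_disease"},
]

# Distributed: each clinic summarises locally, then summaries are merged.
distributed = merge_summaries([local_summary(clinic_a), local_summary(clinic_b)])

# Centralized baseline, shown only to compare results.
centralized = local_summary(clinic_a + clinic_b)

support, confidence = rule_stats(
    *distributed, antecedent={"high_bp"}, consequent={"heart_disease"}
)
```

Because support counts are additive across disjoint sets of records, the merged summaries equal the counts obtained from the pooled data, which is why a decomposed computation can match the centralized result in this simplified setting.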