Secure Distributed Deduplication Systems with Improved Reliability

Secure Distributed Deduplication Systems with Improved Reliability

Secure Distributed Deduplication Systems with Improved Reliability
Secure Distributed Deduplication Systems with Improved Reliability


Data deduplication is a technique for eliminating duplicate copies of data, and has been widely used in cloud storage to reduce storage space and upload bandwidth. However, there is only one copy for each file stored in cloud even if such a file is owned by a huge number of users. As a result, deduplication system improves storage utilization while reducing reliability. Furthermore, the challenge of privacy for sensitive data also arises when they are outsourced by users to cloud. Aiming to address the above security challenges, this paper makes the first attempt to formalize the notion of distributed reliable deduplication system. We propose new distributed deduplication systems with higher reliability in which the data chunks are distributed across multiple cloud servers. The security requirements of data confidentiality and tag consistency are also achieved by introducing a deterministic secret sharing scheme in distributed storage systems, instead of using convergent encryption as in previous deduplication systems. Security analysis demonstrates that our deduplication systems are secure in terms of the definitions specified in the proposed security model. As a proof of concept, we implement the proposed systems and demonstrate that the incurred overhead is very limited in realistic environments.


  • A number of deduplication systems have been proposed based on various deduplication strategies such as client-side or server-side deduplications, file-level or block-level deduplications.
  • Bellare et al. formalized this primitive as message-locked encryption, and explored its application in space efficient secure outsourced storage. There are also several implementations of convergent implementations of different convergent encryption variants for secure deduplication.
  • Li addressed the key-management issue in block-level deduplication by distributing these keys across multiple servers after encrypting the files.
  • Bellare et al. showed how to protect data confidentiality by transforming the predicatable message into a unpredicatable message.


  • Data reliability is actually a very critical issue in a deduplication storage system because there is only one copy for each file stored in the server shared by all the owners.
  • Most of the previous deduplication systems have only been considered in a single-server setting.
  • The traditional deduplication methods cannot be directly extended and applied in distributed and multi-server systems.


  • In this paper, we show how to design secure deduplication systems with higher reliability in cloud computing. We introduce the distributed cloud storage servers into deduplication systems to provide better fault tolerance.
  • To further protect data confidentiality, the secret sharing technique is utilized, which is also compatible with the distributed storage systems. In more details, a file is first split and encoded into fragments by using the technique of secret sharing, instead of encryption mechanisms. These shares will be distributed across multiple independent storage servers.
  • Furthermore, to support deduplication, a short cryptographic hash value of the content will also be computed and sent to each storage server as the fingerprint of the fragment stored at each server.
  • Only the data owner who first uploads the data is required to compute and distribute such secret shares, while all following users who own the same data copy do not need to compute and store these shares any more.
  • To recover data copies, users must access a minimum number of storage servers through authentication and obtain the secret shares to reconstruct the data. In other words, the secret shares of data will only be accessible by the authorized users who own the corresponding data copy.
  • Four new secure deduplication systems are proposed to provide efficient deduplication with high reliability for file-level and block-level deduplication, respectively. The secret splitting technique, instead of traditional encryption methods, is utilized to protect data confidentiality. Specifically, data are split into fragments by using secure secret sharing schemes and stored at different servers.


  • Distinguishing feature of our proposal is that data integrity, including tag consistency, can be achieved.
  • To our knowledge, no existing work on secure deduplication can properly address the reliability and tag consistency problem in distributed storage systems.
  • Our proposed constructions support both file-level and block-level deduplications.
  • Security analysis demonstrates that the proposed deduplication systems are secure in terms of the definitions specified in the proposed security model. In more details, confidentiality, reliability and integrity can be achieved in our proposed system. Two kinds of collusion attacks are considered in our solutions. These are the collusion attack on the data and the collusion attack against servers. In particular, the data remains secure even if the adversary controls a limited number of storage servers.
  • We implement our deduplication systems using the Ramp secret sharing scheme that enables high reliability and confidentiality levels. Our evaluation results demonstrate that the new proposed constructions are efficient and the redundancies are optimized and comparable with the other storage system supporting the same level of reliability.



  • System :         Pentium IV 2.4 GHz.
  • Hard Disk :         40 GB.
  • Floppy Drive : 44 Mb.
  • Monitor : 15 VGA Colour.
  • Mouse :
  • Ram : 512 Mb.


  • Operating system : Windows XP/7.
  • Coding Language : JAVA/J2EE
  • IDE : Netbeans 7.4
  • Database : MYSQL


Jin Li, Xiaofeng Chen, Xinyi Huang, Shaohua Tang and Yang Xiang Senior Member, IEEE and Mohammad Mehedi Hassan Member, IEEE and Abdulhameed Alelaiwi Member, IEEE,“Secure Distributed Deduplication Systems with Improved Reliability”, IEEE Transactions on Computers, 2015.