Review
ARTICLE INFO

Keywords: Cloud computing; Privacy preserving; Data integrity verification; Public auditing

ABSTRACT

With the explosive growth of data and the rapid development of science and technology, big data analysis has attracted increasing attention. Due to the limited performance of traditional devices, cloud computing has emerged as a convenient storage and computing platform for big data analysis. Driven by their own interests, cloud servers may intentionally delete or modify outsourced big data. Therefore, users need to make sure that the servers correctly store the outsourced big data prior to deploying cloud computing applications in practice. To resolve the issue, many researchers have concentrated on enabling users to check the completeness of data with data integrity verification (DIV) techniques. We have therefore collated a summary of the existing literature, aiming to present a solid and stimulating review of current academic achievements for interested readers. Firstly, we present a fundamental introduction by defining seven major topics in order to offer a summary of the existing research domain for DIV study. Secondly, we classify the state-of-the-art DIV solutions into four categories, and then we parse each category based on dynamics, providing a clear and hierarchical classification of forthcoming DIV efforts. Thirdly, we discuss the principal topics and technical means utilized to equip DIV schemes with different requirements. Finally, we discuss the issues and challenges anticipated in future work, thus suggesting possible directions for follow-up research.
1. Introduction

With the rapid development of science and technology, big data has brought many conveniences to people's lives (Yu et al., 2017a; Yu, 2016). However, with the explosion in the volume of data, the traditional computing model cannot satisfy the updating and sharing requirements due to its restricted storage resources and computing power. Cloud computing (Nachiappan et al., 2017) has been regarded as an attractive computation paradigm, for it enables users to delegate big data processing tasks to a Cloud Storage Server (CSS). With the popularity of cloud computing, users are more and more willing to move their data from traditional devices into the cloud (Yang et al., 2015; Hu et al., 2017; Cai et al., 2017). Cloud storage is essentially a large data center (Aujla et al., 2018), providing users with on-demand paid storage services with high flexibility, scalability, and tolerance features. Users can get the desired services on a pay-as-you-go model (Lu et al., 2014; Subashini and Kavitha, 2011; Armbrust et al., 2010). This reduces the data maintenance and management costs for the client, which explains the enormous popularity of cloud storage. However, outsourcing big data into the cloud results in the separation of data ownership and management (Khan, 2016). Users cannot always trust a CSS, because it may abuse data management rights and make data vulnerable to security threats (Huang et al., 2017a; Li et al., 2017a). Therefore, many problems must be resolved prior to outsourcing big data into a CSS (Fu et al., 2018a; Zhu et al., 2018). For example, how do users fully trust a CSS and make sure that the outsourced big data is complete? How can they prevent attackers or malicious clouds from tampering with data without their permission? How can users effectively audit cloud data without retrieving the entire data set (Chen et al., 2018)? Is there an appropriate way to enable users to update the stored data effectively? All of the above problems can be addressed by data integrity verification techniques (Fu et al., 2018b); that is, designing a mechanism that allows users to detect the integrity of the
☆ This work is supported by National Science Foundation of China (61572255, 61702266), Six Talent Peaks Project in Jiangsu Province of China (XYDXXJS-032), The Open Project Program of the Guizhou Provincial Key Laboratory of Public Big Data (2017BDKFJJ031), CERNET Innovation Project (NGII20170205).
* Corresponding author. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China.
E-mail address: [email protected] (A. Fu).
https://doi.org/10.1016/j.jnca.2018.08.003
Received 26 April 2018; Received in revised form 12 July 2018; Accepted 14 August 2018
Available online 17 August 2018
1084-8045/© 2018 Elsevier Ltd. All rights reserved.
L. Zhou et al. Journal of Network and Computer Applications 122 (2018) 1–15
outsourced big data in the cloud environment.

Traditional techniques for verifying data integrity, such as RSA (Rivest et al., 1978) and MD5 (The MD5 Message-Digest Algorithm, RFC 1321), require users to first download their entire data set from the cloud and then compare the signatures or hash values of the downloaded data with those kept locally. In reality, the outsourced data is usually massive, such as medical records, financial reports, and scientific material; hence downloading all these files from the cloud is not only time-consuming but also a heavy burden for limited-capability users. Of course, checking data integrity by using a hash tree resolves this defect, but it only verifies the correctness of the hash tree corresponding to the stored data rather than the data itself. In this case, a CSS may conceal the fact that it stores false or incomplete data for its own benefit. Therefore, it is essential to design a mechanism that allows users to check the integrity of cloud storage data efficiently without having to retrieve the entire data.

To solve these problems, many researchers have focused on enabling public auditing for data stored in a CSS by employing data integrity verification (DIV) technologies (Singh et al., 2016; Sookhak et al., 2014; Liu et al., 2013). DIV is actually an interactive protocol running between users and CSSs (Liu et al., 2015a). The initial participating entities only involve users and CSSs, which is called private auditing (Liu et al., 2016). In this model, users need to compute some authentication data locally and interact with a CSS frequently during the entire data storage period, which places a tremendous burden on them. To liberate users from this heavy computing task, an entity called a Third Party Auditor (TPA), which has more ample resources and stronger professional capabilities than users, is introduced to interact with a CSS. The model in which the auditor can be either the actual user or a TPA is called public auditing, and it allows users to delegate their auditing tasks to a TPA in order to enhance auditing efficiency and decrease the computational cost. As a result, devising a proper DIV scheme for public auditing is deemed an urgent need.

DIV refers to a challenge/response protocol that provides assured integrity for data managed by an untrustworthy CSS. It offers a probabilistic guarantee for users' data by employing a random sampling strategy, which means the auditor only needs to verify the evidence of requested data blocks returned by the CSS rather than challenging all the blocks. The first classic DIV scheme for untrusted servers was proposed in 2007 (Ateniese et al., 2007); since then, many DIV solutions (Singh et al., 2011; Zhu et al., 2010; Wang et al., 2011, 2016) have been proposed for auditing outsourced data.

In this paper, we aim to review the state-of-the-art DIV efforts that ensure the integrity of data stored in a CSS. First, we study the research content of DIV through seven topics, namely audit model, user mode, cloud structure, storage type, dynamic pattern, security requirements, and performance metrics, in order to provide a basic understanding of DIV study. Next, we classify the existing DIV approaches into four types according to the user mode and storage type, i.e., single user with single copy, multiple users with single copy, single user with multiple copies, and multiple users with multiple copies. Compared with the current classification methods based on signature or functionality, our custom taxonomy approach performs better in terms of classifying quality, for each DIV method can quickly find its own category according to our criteria. We also comment on the merits and drawbacks of some typical schemes for each category. Furthermore, we raise some major problems that should be considered to enhance the safety and efficiency of DIV, and further discuss the existing technical methods that can address these issues. Finally, we list some open problems and challenges to point out some feasible paths for future researchers.

The rest of the paper is organized as follows. Section 2 provides the relevant knowledge of DIV in cloud computing. It also explains the system structure of the DIV model and its critical building requirements. Section 3 classifies the state-of-the-art DIV approaches into four categories, and analyzes the merits and limitations of several typical references for each category. Section 4 focuses on some principal topics that we have not been able to deepen in the literature analysis and the corresponding technical means. Section 5 presents some open problems and challenges for future research. Finally, our conclusion is given in Section 6.

2. Overview

In this section, we illustrate the system model of DIV, describe the study content of DIV on seven topics, and then present our custom taxonomy of the state-of-the-art DIV schemes.

2.1. The system model

The common system model defined in public auditing contains four entities, and the system model of DIV incorporating a TPA is shown in Fig. 1.

● Users: There are two types of users, data owners (DOs) and authorized users (AUs). A DO is an entity with large amounts of data that need to be stored in the cloud. In order to maintain the accuracy of the uploaded data, a DO is able to perform dynamic updates of data in the CSS via insertion, modification, and deletion operations. AUs, by contrast, are granted access to read the stored data after authorization by a DO.
● TPA: This is a third entity that can audit the integrity of cloud data after it is recognized by a DO. The purpose of its introduction is to enable users to maintain a low overhead while verifying data integrity.
● CSS: An entity that provides the users with sufficient storage space and computational resources. It is required to respond to challenges from a TPA.

Fig. 1. The system model (entities shown: DOs, AUs, CSS, TPA).

DIV is a challenge-response protocol and its detailed procedure is as follows. (1) A DO first preprocesses her/his files, using e.g. encryption, coding, or blocking, and then generates some metadata for all data. Then she/he uploads the data and the corresponding metadata to the CSS. When checking the integrity of the uploaded data, she/he sends a request to the TPA and waits for the results notification. (2) On receiving the request, the TPA generates a random challenge and sends it to the CSS. Of course, we often assume that the TPA is authorized by a DO prior to its interaction with the CSS. (3) After receiving the challenge message, the CSS generates a corresponding proof related to the challenge and returns it to the TPA. (4) To check the integrity of the data, the TPA verifies the
validity of the received proof. If the verification fails, the TPA returns a message of “failure” to the DO. Otherwise, the TPA returns a message of “success” to the DO, which means the stored data is intact.

2.2. The study topics

To portray the research contents of existing DIV efforts, Fig. 2 presents a comprehensive description of DIV study through seven essential criteria, namely Audit Model, User Mode, Cloud Structure, Storage Type, Dynamic Pattern, Security Requirements, and Performance Metrics.

The audit model contains private auditing and public auditing. Private auditing only allows users to check data integrity themselves. Public auditing enables a TPA to verify the integrity of stored data without involving private information in the response checking.

The user mode contains two types, namely single user mode and multiple users mode. Single user mode means that only one user owns the uploaded data involved in the system model, and no other entity is allowed to access her/his data. In multiple users mode, we encounter two different scenarios. (1) There are two types of users with different status: DOs and AUs. In general, a DO uploads her/his files to a CSS with the purpose of sharing her/his data with many AUs. The group is administered by a group manager (GM) who determines the entry of new members and the revocation of misbehaving members, which means a GM has greater power than other persons. There are two ways to share data in cloud storage. One is a one-to-many pattern, which means that many AUs gain access to the stored data of one DO in an authorized manner. The other is a many-to-many pattern, which means that many DOs share their data with a number of AUs simultaneously in the form of a group. (2) There are many users of the same status who manage the stored data together in a group. They have to negotiate changes in membership, as they have equal rights. More factors have to be addressed in multiple users mode than in single user mode, such as group dynamics (Huang et al., 2017b; Jiang et al., 2016), anonymity (Huang et al., 2018) and traceability (Shen et al., 2017a).

The cloud structure used in a DIV scheme can be either a single cloud server or multiple cloud servers. If all the data is stored on the same server, there is a single cloud server structure. Under some circumstances, users are willing to distribute their files across multiple cloud servers due to the risk of service availability failure in a single cloud. A hybrid cloud is a special case of the multiple cloud servers structure that emphasizes the existence of both private and open clouds (Singh et al., 2011), thus we do not describe the hybrid cloud (Zhu et al., 2010) independently in this paper.

With single copy storage, users only store one copy of their files in CSSs. To improve the availability and durability of data, users may need to store multiple copies of data across many servers, which is called multiple copies storage (Curtmola et al., 2008).

There are two kinds of dynamic patterns to provide update service for users. The first is the static pattern, which refers to the situation where the uploaded files remain unchanged throughout the storage period. The second is the dynamic pattern, which means that some or all classes of dynamic update operations may be achieved, such as modification, deletion, and insertion.

The security requirements refer to a number of attributes that should be taken into consideration when designing an appropriate DIV approach, such as: (1) Privacy preserving ensures that data content will not be leaked to the TPA. (2) Batch auditing enables a TPA to perform parallel verification of multiple verification tasks, thus improving verification efficiency. (3) Availability protects data against all kinds of attacks from inside and outside, such as forgery, replay, and replace attacks launched by a malicious CSS, and collusion attacks among servers. (4) Correctness ensures that a CSS cannot pass the verification if it fails to store multiple copies of the data as required. (5) Accuracy equips a DIV scheme with the ability to detect damaged data with a high probability. (6) Data recovery allows users to recover their data when the corrupted data does not exceed a certain proportion. Before uploading data, users often process data with error-correcting code (Hamming, 1950) or network code (Jaggi et al., 2005) to achieve data recovery.

The performance metrics include three important evaluation criteria, namely storage, computation and communication costs. DIV approaches devote much effort to minimizing the storage, computation and communication overhead for users, and the costs of a TPA and CSS also need to be reduced as much as possible in developing a high-performance DIV construction.

2.3. Custom taxonomy

The above section offers a detailed description of DIV study content. However, not all of these seven characteristics will be selected, as the classification criteria for a DIV scheme usually cover more than one topic. Unlike the existing classification according to adopted signatures or functionality, we first choose the user mode and the storage type to divide the existing DIV schemes into four categories, and then each category is divided into two or three classes according to the dynamic pattern (with the exception of multiple users with a single copy, which is classified by the GM). Such hierarchical progressive research delineates a comprehensive and clear DIV study. For each class, we collect several classic texts, and point out the advantages and drawbacks according to our evaluation criteria listed in Fig. 3. All criterion elements can be attributed to three aspects, namely functionality, security and efficiency,
illustrated in Fig. 3. Here, we first introduce some common elements for designing an appropriate DIV; others that are only applicable to a particular DIV scheme will be described in the next section. To design a proper DIV, the following fundamental criteria must be taken into account: (1) Public auditing. Any TPA can remotely verify the integrity of the stored data without having to obtain private information from users. In the process of checking the data integrity, users only need to send the requests to the TPA, and then they can receive the audit results from the TPA. (2) Frequency. Auditors are able to launch numerous tests by generating a random challenge for each verification. (3) Batch auditing. A TPA may receive a number of requests from one or multiple users simultaneously. Batch auditing allows a TPA to handle multiple requests simultaneously, thus effectively improving efficiency. (4) Data privacy. A TPA cannot obtain any knowledge about users' data during the entire checking procedure. (5) Data dynamics. Users may perform data dynamic operations under certain applications, including modification, insertion and deletion; hence the files stored in the CSS should be updated to be consistent with users' operations. (6) Data sharing. In some cases, one or more data owners are likely to share their data with other users with the intention of making full use of resources or reducing duplication of efforts. For instance, each member belonging to the same research group is willing to upload and share their results with others.

3. The taxonomy of the state-of-the-art schemes

In this section, we classify the existing DIV approaches into four types, as shown in Fig. 4. For each category, several typical schemes are reviewed along with their peculiarities deriving from functionality, security and efficiency, which are the core objectives of DIV schemes and can be used to portray any one DIV solution. To highlight the similarities and differences among these schemes, we tabulate the comparison results of performance and functionality in Tables 1–4.

3.1. Single user with single copy

A DO who is willing to upload the data into the cloud does not allow any other party to access the stored data in the CSS. In this case, the existing DIV approaches (Ateniese et al., 2007, 2008; Wang, 2013; Wang et al., 2011, 2014a, 2016; Erway et al., 2009; Zhu et al., 2013; Yang and Jia, 2013; Liu et al., 2014; Chen et al., 2014a; Sookhak et al., 2018)
Table 1
Performance comparison of auditing schemes for single user with single copy.
Scheme | Cloud structure | Dynamic pattern | Dynamic technique | Communication overhead | Computation costs (User/TPA) | Computation costs (CSS)
Note: n is the number of blocks in a file; a block is divided into s sectors; c is the number of challenged blocks; m is the total number of files in the CSS; and “-” means “no demand.”
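The role of c in the note above can be made concrete with a short calculation. The sketch below estimates how likely a random-sampling challenge is to hit at least one corrupted block. It reuses n and c as defined in the note; t (the number of corrupted blocks) and the function name are assumptions introduced here for illustration, and blocks are assumed to be sampled uniformly without replacement. It is a back-of-the-envelope aid, not part of any surveyed scheme.

```python
from fractions import Fraction


def detection_probability(n: int, t: int, c: int) -> float:
    """Chance that c sampled blocks include at least one of t corrupted ones.

    n: total number of blocks in the file (as in the Table 1 note)
    t: number of corrupted blocks (hypothetical symbol, not in the table)
    c: number of distinct challenged blocks, drawn uniformly without replacement
    """
    if not (0 <= t <= n and 0 <= c <= n):
        raise ValueError("require 0 <= t <= n and 0 <= c <= n")
    miss = Fraction(1)  # probability that every sampled block is intact
    for i in range(c):
        intact_left = n - t - i
        if intact_left <= 0:
            return 1.0  # more samples than intact blocks: a hit is certain
        # i intact blocks already drawn; next draw is intact with this ratio
        miss *= Fraction(intact_left, n - i)
    return float(1 - miss)
```

For example, with n = 10000 and t = 100 (1% of the blocks corrupted), `detection_probability(10_000, 100, 460)` is above 0.99, which is why a challenge can stay small even when the file is large.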
Table 2
Functionality comparison of auditing schemes for single user with single copy.
Scheme | Public auditing | Data privacy | Data dynamics | Batch auditing | Expanding to multiple users
Ateniese et al. (2007) | Yes | No | No | No | No
Wang (2013) | No | No | No | Yes | Null
Wang et al. (2014a) | Yes | No | No | Yes | Yes
Wang et al. (2016) | Yes | Yes | No | No | No
Erway et al. (Ateniese et al., 2008) | No | - | Yes | No | No
Wang et al. (Erway et al., 2009) | Yes | Yes | Yes | Yes | Yes
Zhu et al. (Wang et al., 2011) | Yes | Yes | Yes | Null | No
Zhu et al. (2013) | Yes | Yes | No | Yes | No
Yang and Jia (2013) | Yes | Yes | Yes | Yes | Yes
Liu et al. (2014) | Yes | No | Yes | Yes | Yes
Chen et al. (2014a) | Yes | Yes | Yes | Yes | Yes
Sookhak et al. (2018) | Yes | Yes | No | Yes | Yes
Note: “-” means “no demand”; and “Null” means “no mention.”

designed for a single user with a single copy seek to pursue full dynamic services and improve the update efficiency while implementing the secure audit of cloud storage data. The specific review of the literature is as follows.

Static. Ateniese et al. (2007) first proposed the notion of provable data possession (PDP), which allows a TPA to check the integrity of data at untrusted stores. This method makes use of the homomorphic verification tag (HVT) based on RSA so that the generated tags can be aggregated into a single value, which greatly reduces storage and communication overhead during the auditing process. A random sampling strategy is also adopted to improve audit efficiency. However, this scheme is only suitable for verifying static (or append-only) files, because inserting or deleting a data block requires regenerating the corresponding validation metadata of the other data blocks located after the operated block, thus incurring too much computation cost.

Under certain circumstances, such as in prisons or in remote locations without network coverage, users do not have the ability to audit their own cloud data. Therefore, users need to hand over data integrity audit tasks to a proxy. In 2013, Wang (2013) proposed a proxy PDP. The checker can be either a client (data owner) or a proxy. In some specific situations, dynamic updates are not needed for some data, like archive files, medical records, historical data, etc. To achieve access to medical data across hospitals, Wang et al. (2014a) proposed a remote data retrieval solution with privacy protection for electronic health networks. Throughout the auditing process, medical records are fixed once uploaded, and users do not need to update the records on the cloud. Another similar scenario without requiring dynamics was raised by Wang et al. (2016). In order to encourage individual reporting and disclosure of criminal facts, the judiciary will lease space on the cloud for users to submit evidence and establish incentives for this purpose. Based on bilinear pairings and identity-based public key cryptography, an incentive and unconditionally anonymous identity-based public auditing protocol is designed. Once the evidence is submitted, the user does not need to update and maintain it. Therefore, the data block number directly participates in the label calculation.

Dynamic. To support data dynamics, Ateniese et al. (2008) implemented another data integrity auditing protocol with a symmetric encryption technique based on their previous work. However, although the paper provides some dynamic operations, it cannot support data insertion operations.

Erway et al. (2009) extended the model of PDP, and first presented a fully dynamic PDP solution, called DPDP. A Rank-based Authenticated Skip List (RASL) is utilized for supporting provable updates to stored data. Before each validation, the tags of queried or updated blocks must be verified with RASL, because the index information is removed from the tag computation in Ateniese's PDP model. Hence, the execution efficiency of this scheme remains in question. Later, Wang et al. (2011) designed another dynamic PDP approach. A Merkle Hash Tree (MHT) is included to provide data updates. However, the mechanism incurs large computation overhead on the TPA and heavy communication costs during the updating and verification processes, and the lack of verifying data block location makes the scheme vulnerable to replace attacks launched by a malicious CSS. Further, Zhu et al. (2013) developed a public auditing realization of full data dynamics. The computation and communication costs are greatly reduced by storing an Index-Hash Table (IHT) in the TPA. IHT is efficient for modifying, but not for inserting and deleting. This is because these two operations lead to the readjustment of the table structure and thus result in partially recalculating the tags of the affected data blocks. After this research, almost all implementation schemes of dynamic PDP have been built on RASL, MHT, IHT and their variants (Yang and Jia, 2013), and all their efforts pursue the security and efficiency of full dynamic updates.

While some of the above work (Erway et al., 2009; Wang et al., 2011; Zhu et al., 2013; Yang and Jia, 2013) has been able to support full dynamic updates on data blocks, they only support updates with fixed-size blocks
Table 3
Functionality comparison of auditing schemes for multiple users with single copy.
Scheme | Signature technique | Data dynamics | Data traceability | Identity privacy | Identity traceability | Member dynamics | Large groups | No sharing keys
Note: “No sharing keys” means that the specified scheme does not require all group members to share any secrets.
Table 4
Functionality comparison of auditing schemes for multiple copies storage.
Scheme | Owner mode | Public auditing | Authorized auditing | Data dynamics | Transparency | User revocation | Locating the destroyed copies quickly | Verifying all copies simultaneously
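Several of the dynamic schemes reviewed in Section 3.1 authenticate block updates with a Merkle Hash Tree (MHT). The Python sketch below shows only the general idea: a root hash commits to all blocks, and a logarithmic-size path lets a verifier check one block against that root. It is not the construction of Wang et al. (2011) or Liu et al. (2014); the function names, the use of SHA-256, the 0x00/0x01 prefixes, and the duplicate-last-node padding are all assumptions made for illustration. The verifier here also trusts the left/right flags supplied with the proof, so block position is not authenticated.

```python
import hashlib
from typing import List, Tuple

# One proof step: (sibling hash, True if the sibling sits on the left)
Proof = List[Tuple[bytes, bool]]


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks: List[bytes]) -> List[List[bytes]]:
    """Return all tree levels, leaves first and the root level last."""
    if not blocks:
        raise ValueError("at least one block is required")
    levels = [[_h(b"\x00" + b) for b in blocks]]  # 0x00 marks leaf hashes
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:  # assumed padding rule: duplicate the last node
            cur = cur + [cur[-1]]
            levels[-1] = cur
        # 0x01 marks internal nodes
        levels.append([_h(b"\x01" + cur[i] + cur[i + 1])
                       for i in range(0, len(cur), 2)])
    return levels


def root_of(levels: List[List[bytes]]) -> bytes:
    return levels[-1][0]


def prove(levels: List[List[bytes]], index: int) -> Proof:
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    if not 0 <= index < len(levels[0]):
        raise IndexError("block index out of range")
    path: Proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # neighbour within the same pair
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(root: bytes, block: bytes, proof: Proof) -> bool:
    """Recompute the root from one block and its sibling path."""
    node = _h(b"\x00" + block)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == root
```

In this sketch, modifying one block changes only the hashes on its path to the root, which is the property that makes tree-based structures attractive for dynamic updates.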
as operating unit. As such, each small update results in a recalculation and update of the entire file block, resulting in higher storage and communication overhead. A suitable solution was proposed by Liu et al. (2014) to enable efficient verifiable fine-grained updates of big data in cloud storage. The scheme supports fine-grained updates by improving the classic MHT. In addition, an authorization process between the auditor and the cloud server is introduced to check the correctness of the challenge, thereby preventing malicious attackers from deliberately challenging the cloud server. Another similar scheme to support fine-grained updates of big data was presented by Chen et al. (2014a). Different from the scheme of Liu et al. (2014), the solution employs a technology called balanced update tree to support fine-grained updates. The nature of the balanced update tree eliminates the update validation after each dynamic update operation, reducing computational and communication resources. In order to achieve public verification and privacy protection, the scheme adopts RSA-based homomorphic tags in the proposed construction. There is no need to send a combination of data blocks during the verification process, which can prevent the auditor from gaining any knowledge of the outsourced big data.

While some methods (Liu et al., 2014; Chen et al., 2014a) have been designed to support dynamic updates over big data, frequent update operations still impose high computational and communication costs on auditors. Based on the algebraic nature of outsourced documents, Sookhak et al. (2018) proposed an effective DIV method, which has the lowest computational and communication costs. The main contribution of that paper is a new data structure called Divide and Conquer Table, which is proficient in supporting dynamic data of common file sizes. In addition, the data structure allows the approach to be applied to large-scale data storage with minimal computational cost. Experimental results show that the scheme is superior to the existing methods in terms of computational and communication costs.

Comparison: Table 1 depicts the performance comparison of auditing schemes for single user with single copy, mainly on two different criteria: communication overhead and computation costs. For a DIV model, the communication overhead refers to the size of the random challenge and the audit evidence, and the computation costs mainly involve the overhead of verifying the evidence on the user/TPA side and the cost of the evidence generation at the CSS side. Comparing three techniques for data dynamics, we conclude that IHT has the smallest cost compared to RASL and MHT. Additionally, it is clear that static and semi-dynamic DIV schemes cost less than dynamic ones. Table 2 presents the functionality comparison of auditing schemes for single user with single copy in terms of five perspectives: public auditing, data privacy, data dynamics, batch auditing, and expanding to multiple users settings. As
seen from the table, some functions are omitted when meeting other functional requirements, as public auditing is not satisfied in Erway et al. (2009). A good solution often tries to meet multifunctional requirements, like Zhu et al. (2013).

Discussion: As for devising a DIV approach for single user with single copy, researchers have focused on improving efficiency and enriching functionality, and the core objective they have pursued is achieving full dynamic data efficiently. Previous literature shows that RASL, MHT, and IHT are major dynamic technologies whose suitability can be improved for various scenarios. Although IHT costs the least when employed for dynamics, MHT is more popular with scholars, for it can provide strict verification and its simple structures are easily adapted for various scenarios. Besides, new structures for dynamics, such as balanced update tree and Divide and Conquer Table, are also put forward for updating the outsourced big data.

3.2. Multiple users with single copy

With data sharing services supported by a CSS, DOs can easily share data with other members as a group. The DIV solutions for multiple users with single copy can be divided into two categories, namely shared data with a GM and shared data without a GM. (1) For shared data with a GM, a manager is generally considered to be the DO of shared data, while other common members are AUs. Thus a manager can revoke members who are guilty of misconduct or approve newcomers to join, and can even disclose the identity of a designated member in necessary situations. As we introduced in Section 2.2, there are two ways to share data in cloud storage: one-to-many pattern and many-to-many pattern. When there is only one administrator in the group, it is necessary to consider the abuse of concentrated rights, e.g. the possibility of framing members. When multiple administrators exist in a group, they should revoke existing members or introduce new persons together in a negotiated way. (2) For shared data without a GM, each member holds the same right, and membership changes need to be negotiated by all persons in the group because they share the same status.

In order to support user revocation effectively, Wang et al. (2013a) developed Panda, a novel integrity scheme for shared data. By employing proxy re-signature technology, a semi-trusted CSS is allowed to transfer the signature generated by a revoked user to another signature under the

To solve this problem, Wang et al. (2012b) proposed Knox, a novel privacy-preserving approach for shared data with large groups. The group signature is modified for designing homomorphic authenticators, so the signer's identity is not exposed to the TPA. The amount of verification information and the auditing time are independent of the number of users in the group. Identity privacy and traceability are achieved in this scheme, but member dynamics are not covered.

Given that outsourced databases involve a large number of multi-user modifications, existing solutions are not practical from an efficiency perspective. Taking this problem into consideration, Song et al. (2017) constructed an efficient signature scheme, characterized by additive homomorphism. In addition, they have proposed a new and practical mechanism for verifying the correctness and integrity of multi-user modifications, which does not require the data owner to be always online. Nevertheless, the communication costs are not ideal because the tag in previous proposals contains only one element and that of the scheme contains three elements.

To eliminate the disclosure of existing users' secrets caused by collusion attacks, Yuan and Yu (2014) designed a data integrity auditing mechanism characterized by collusion resistance and multi-user modification. The scheme improves algebraic signature and proxy signature techniques to provide effective user revocation without the assumption that no collusion occurs between the CSS and revoked users. Compared with schemes (Wang et al., 2012a; Wang et al., 2012b; Song et al., 2017; Yuan and Yu, 2014; Wu et al., 2016; Yang et al., 2016; Fu et al., 2017; Zhang et al., 2018; Zuo et al., 2018; Shen et al., 2018; Blaze et al., 1998; Brickell et al., 2004; Barsoum and Hasan, 2010; Barsoum and Hasan, 2011; Etemad and Küpçü, 2013; Liu et al., 2015b; Zhang et al., 2016; Li et al., 2017b; Peng et al., 2017; Chen et al., 2014b; Zha et al., 2015; Abo-alian et al., 2017; Shamir, 1984; Al-Riyami and Paterson, 2003; Wang et al., 2014b; Wang, 2015; Yu et al., 2017b; Wang et al., 2013b; He et al., 2015; He et al., 2017; Tian et al., 2015; Shen et al., 2017b; Wang et al., 2017a), the computation overhead on the user side is constant rather than affected by the number of users. Nevertheless, the CSS may modify the signatures intentionally when updating tags of data blocks updated last by revoked users.

By resorting to ring signature or group signature to satisfy identity anonymity, the authentication tag on a block contains numbers of elements, which cause a greater increase on storage and communication
target member's private key. In this scheme, a CSS cannot replace the costs during the verifying process. Wu et al. (2016) devised a
existing members to produce a valid signature on any data block, and a privacy-preserving cloud auditing approach for multiple uploaders, in
revoked user can no longer generate valid signatures for shared data, which the tag of each block involves only one element regardless of the
which ensures that the completeness of shared data can still be checked number of uploaders. A TPA is able to audit shared data without learning
only with the existing users' public keys. Data dynamics are also achieved any information about a uploader's identity in this mechanism.
by leveraging index hash tables, and a unique identifier is computed for Another solution to achieve the goal of supporting identity privacy
each block to enable a user to update a block efficiently without altering and traceability for group members simultaneously was put forward by
the identifiers of other blocks. Although Panda outclasses previous work Yang et al. (2016). There are two tables maintained by each GM. One
on user revocation, it cannot resist a collusion attack between invalid records the identities and keys of existing users for effective member
users and the CSS. Moreover, the number of inserted blocks between two management, and the other records the identifier of a data block and the
specified data blocks is restricted due to the characteristics of the IHT identity of the user who performs the latest modification on the block to
design. attain identity traceability. Blind signature technique is employed to
To achieve the privacy of user identity, Wang et al. (2012a) presented protect data privacy. However, data dynamics are taken into account and
the first privacy-preserving public auditing scheme for shared data, data privacy is only imposed on a GM where managers are not the owners
called Oruta. This scheme provides public auditing for data-sharing of the data.
group members without a manager. Ring signature and HVT are com- The above scheme with GM only considers the case where there is just
bined to aggregate the signatures of the challenged blocks, which realizes one manager in the group, but in practice, some people may create a file
the purpose of keeping the signer's identity private from the TPA. Data at the same time, and they can share the data with other authorized users.
dynamics are also realized by leveraging index hash tables, and it As the original owners of the data, they manage the data and change the
removes the signer identity from the block identifier in Panda (Wang membership authorization together. Fu et al. (2017) was the first to
et al., 2013a). However, the members of a group are always fixed; group present a public auditing solution for shared data with multiple man-
dynamics are not considered in this scheme. All the authentication in- agers. A homomorphic verifiable group signature is constructed by
formation needs to be recalculated when a new user joins the group, and combining group signature with (t, s) secure sharing technique, which
the signer's identity is unconditionally protected, which impedes the achieves multi-level privacy-preservation abilities, including identity
identity traceability of any misbehaving users. Panda cannot be extended privacy, traceability and non-frameability. A revocation option not only
to a large group for the size of the verification information increases for managers but also for ordinary members is also offered by the
linearly with the number of users. mechanism. Nevertheless, a new authorized key pair must be
L. Zhou et al. Journal of Network and Computer Applications 122 (2018) 1–15
regenerated and distributed to all members once a newcomer joins or a member leaves. Data updates remain an open problem in this scheme.
All of the above proposals that support user revocation suffer from low efficiency, because the computational cost introduced by user revocation is linear in the number of blocks operated on by the revoked user. To narrow this gap, Zhang et al. (2018) proposed an efficient user revocation solution for auditing shared big data by employing identity-based cryptography, which eliminates the certificate management of Public Key Infrastructure (PKI) systems. To realize user revocation, the scheme updates the non-revoked group users' private keys rather than the authenticators possessed by the revoked user. However, the scheme assumes that both the public and private key of the user in the group are the same, which limits its application.
In existing DIV schemes, the key is used to generate the signature of the data block. Once the key is leaked, the attacker can forge the signature and pass the verification. To fill this gap, Zuo et al. (2018) presented a new data protection mechanism for cloud storage, which achieves privacy preservation in the following two ways: (1) The key is divided into two parts, and the security of the key is maintained only when one of these two factors works. (2) Once a key is exposed or expires, the user is willing to update the key by smashing the encryption and key separation techniques.
In addition, Shen et al. (2018) proposed a data sharing scheme based on identity-based cryptography, which simplifies complex certificate management. In this scheme, a sanitizer is introduced to sanitize the data blocks corresponding to sensitive information in the file, and the tags of these data blocks are converted into valid tags for the sanitized file. The solution therefore enables files stored in the cloud to be shared and used by others with the sensitive information hidden, while remote data integrity auditing can still be performed efficiently.
Comparison: Great efforts have been made to design a public auditing scheme for shared data, and Table 3 offers comparisons among some existing DIV schemes (Wang et al., 2012a, 2012b, 2013a; Song et al., 2017; Yuan and Yu, 2014; Wu et al., 2016; Yang et al., 2016; Fu et al., 2017; Zhang et al., 2018; Zuo et al., 2018; Shen et al., 2018) in terms of signature technique, data dynamics, data traceability, identity privacy, identity traceability, member dynamics, large groups, and no sharing keys.
Discussion: Compared with the single user mode, two important requirements need to be considered in the multiple users mode: (1) Member dynamics, especially user revocation. For security reasons, a user can no longer access or modify shared data after being revoked from the group because of misbehavior or active departure, and the signatures generated by the revoked user are no longer valid for the group. As a result, the blocks signed by the revoked user should be re-signed by an existing user of the group, so that the existing members can still verify the integrity of all data blocks. Generally, a member of the group needs to download, re-sign and upload the blocks previously signed by the revoked user, which cannot be regarded as the best alternative since extra computation and communication costs are imposed on this user. Proxy re-signature (PRS) (Blaze et al., 1998), which transfers signing rights on one message to a semi-trusted proxy without allowing the proxy to sign any other messages, can be considered an optimal candidate to resolve the above issue. (2) Identity privacy. By looking at the signature of a data block, an entity can know that the signature is generated by one member in the group, but cannot identify the specific member. A signer's identity should be protected from the TPA because it may indicate that one user in the group is more valuable than others. There are usually four ways to implement identity privacy for group users. 1) All group users share a global private key, and each user uses the private key to sign the data block. An unavoidable drawback of this method is that a new key needs to be generated and shared with all users once a member joins or leaves the group, which brings great overhead to the management and distribution of the key. 2) A trusted proxy is introduced between users and the CSS. All shared data is signed and uploaded by the proxy, so the CSS will verify and know the data is coming from the proxy rather than the group members. However, it is not feasible in reality to find a proxy that can be trusted by all users, and shared data will suffer a massive threat once the proxy fails or is seduced by other interests. 3) Anonymous device authentication (Brickell et al., 2004) offers users a possible alternative to achieve the goal of identity privacy. But users are required to move all data services to a trusted environment, and thus data migration will bring extra overhead. 4) Utilizing group signature or ring signature is also a proper solution to realize the anonymity of group members. However, the efficiency of these schemes needs to be improved.

3.3. Single user with multiple copies

Note that the term “user” with multiple copies refers to the DO who sends his/her files to the CSS for replicated storage, which differs from single copy storage. The reason is that most existing multiple replication schemes allow authorized users to access stored data, and thus should be classified by the number of owners rather than that of users. In other words, although there may exist several users (a DO and some AUs), a DIV scheme for replicated data is called “single user” if there is only one DO.
Generally speaking, cloud storage files often suffer damage due to hardware failures or human errors. Multiple-replica storage is an effective way to improve the reliability and availability of data. A damaged copy can be fully restored if one copy stored in the servers remains intact. However, servers may store only one or two copies of the original data, while claiming that they are storing the number of copies specified by the user. Therefore, users need to verify that servers actually store the specified number of copies. Single replica schemes can be extended to multiple-replica scenarios, but the auditing costs would increase exponentially with the number of copies, thus limiting their application. Many cryptography schemes (Curtmola et al., 2008; Barsoum and Hasan, 2010, 2011; Etemad and Küpçü, 2013; Liu et al., 2015b; Zhang et al., 2016) have been proposed to address this problem.
Curtmola et al. (2008) presented a multiple-replica auditing scheme based on RSA signature, which is the first attempt to create multiple copies and check them. In this paper, multiple copies of a file are generated by first encrypting and then randomizing them to achieve the distinguishability of the replicas along with data privacy. Unfortunately, data dynamics cannot be supported since the block number is involved in computing verification tags for data. What is worse, the verifier needs to share a secret key with the original data owner, which limits the application of the scheme.
Barsoum and Hasan (2010) developed two solutions for checking the integrity of multiple copies for a data owner, comprising a deterministic approach and a probabilistic method. The latter is considered more practical, as the number of challenged blocks is fixed, with high detection probability for corrupted data. This is a more economical alternative for a user than challenging all blocks in a deterministic selection. However, the scheme is only designed for static files.
To solve this issue, Barsoum and Hasan (2011) further proposed two dynamic plans, one of which is based on an improved MHT, while the other is designed on a map-version table. The table indicates the mapping between the physical position and the logical number of a block, which is much superior to MHT with regard to computing overhead. Both schemes enable full dynamic operations on the outsourced data. Besides, the proposal employs a recursive divide-and-conquer approach, attempting to allow the verifier to identify the number of damaged copies. Nevertheless, the challenge/response mechanism is transparent to the DO, and she/he must know that the new structure is already in place.
Etemad and Küpçü (2013) considered a transparent, distributed, dynamic PDP for replicated data, which was the first attempt to help a user get rid of the pre-computation imposed by the cloud structure, and then check the data integrity toward an appointed architecture. CSP can manage resources flexibly and execute its own load-balancing and
replication scheme in the background, while providing provable storage for customers. This makes it easier to deploy in practice.
To offer data dynamics, Liu et al. (2015b) presented a multi-replica public auditing scheme named MuR-DPA, which incorporated a novel authenticated data structure based on MHT. One distinguishing feature of the devised data structure is that the level values of nodes in MR-MHT are assigned in a top-down order. But MuR-DPA incurs much communication overhead for data integrity verification because it verifies replicas one by one. The following year, Zhang et al. (2016) designed a multi-copy dynamic scheme by using an improved MHT. The scheme can support variable-length data blocks. Most recent studies on multiple-replica provable data possession use a homomorphic linear authenticator to generate the aggregated tags for blocks at the same index in each copy, but this approach cannot verify a single copy to identify a damaged replica. Motivated by this, Li et al. (2017b) devised a flexible multiple-replica data auditing proposal for cloud storage. It computes tags based on vector dot products rather than expensive exponentiations, and generates only one tag for all blocks of the same index in each copy. The biggest highlight of this scheme is its support for flexible data possession verification, that is, the auditor can choose to verify an arbitrary number of copies in one validation.
All of the above-mentioned multiple copy schemes are based on the traditional PKI system. The issuance and revocation of certificates make these schemes more expensive. Recently, Peng et al. (2017) put forward a novel identity-based public multi-replica provable data possession scheme. The scheme generates a label for each copy instead of each block, so block-level dynamic data operations are not possible.
Comparison: Note that there are a few DIV schemes for multiple users with multiple copies. For convenience, we put them and those for single user with multiple copies together in one table, but distinguish them by the user mode to depict the functionality of the aforementioned schemes. Table 4 describes the comparison results from eight different perspectives. As shown in the table, we can come to the conclusion that multiple-replica schemes are still lacking in functionality, especially as only a few solutions have been presented for the multiple owners model.
Discussion: When devising a DIV approach for single user with multiple copies, the priority is to address the following considerations: (1) Efficiency. The cost of auditing t copies should be far less than that of a single copy scheme running t times. In other words, auditing overhead should be independent of the number of copies, except in particular scenarios; for instance, one may check replicas one by one to find the corrupted copy. In general, HVT is regarded as an efficient way to aggregate the signatures of data blocks, which enables a TPA to verify the requested blocks of all replicas. (2) Data dynamics. Verifiable data structures designed for single copy auditing are usually modified to offer data updates, especially MHT. Supporting the verification of block indices and improving the efficiency of data dynamics are two fundamental purposes when designing a verifiable dynamic structure for a multiple-replica construction. (3) Locating the destroyed copies quickly. While the verification of all copies simultaneously can greatly improve audit efficiency, once the verification fails, the verifier needs to verify all copies one by one to find the damaged copy. Therefore, a fast method for identifying damaged copies is a worthwhile area of study. (4) Single user/Multiple users. In a multiple-replica auditing scheme for a single user, there is only one data owner who outsources data to the cloud. In contrast, copies are created when multiple users upload their files into the cloud and audit them. The aggregation of authenticators and evidence during the audit process can be more complex than that in a single-user setting. Another obvious difference is that member dynamics need to be considered in the multiple users mode. (5) Transparency. In most multiple-replica schemes, users need to take the cloud structure into account when preprocessing files, such as how many servers are used for storing files and which server stores data. Nevertheless, a CSS often adjusts its structure according to the progress of science and technology, and users are required to perform many recalculations to fit the cloud structure, which is a burden on users with limited resources. Therefore, the cloud structure should be transparent from the users' point of view.

3.4. Multiple users with multiple copies

Chen et al. (2014b) devised a data integrity scheme for replicated data in a multiple users mode. A TPA is introduced based on disclosure of users' privacy, and batch auditing is supported to deal with query requirements from multiple users, which reduces the number of interactions and the communication overhead. The proposed scheme can resist replay attacks, conspiracy attacks, and substitution attacks launched by a malicious CSS. Unfortunately, it fails to support data dynamics. Zha et al. (2015) developed a proposal for multiple-replica DIV. The multi-branch authentication tree is manipulated to achieve dynamic data updating. Batch auditing of multiple replicas of many users' data simultaneously is also realized with the help of the TPA. Moreover, data privacy is ensured in this paper to prevent a TPA from learning the content. Abo-alian et al. (2017) put forward another fully dynamic public auditing scheme based on RASL. Two conspicuous ideas in this paper are applying multi-owner cloud storage and supporting variable-size data blocks. In addition, revoked users' signatures can be transferred by the CSS with the proxy re-signature key. However, there are some drawbacks, such as the large auxiliary information needed for updating.
Comparison: In contrast to schemes for single user with multiple copies, those for multiple users should consider member dynamics, but existing solutions focus on data dynamics and ignore this issue. Actually, most solutions for the former can be extended to the latter scenarios, and there is no new question of value arising in the extension process, so it seems unnecessary to devise a dedicated DIV for multiple users with multiple copies.

3.5. Discussion

Through a comprehensive survey of DIV schemes, we reach the conclusion that most efforts have been made to devise a proper solution from the perspective of a user rather than that of a CSS. The requirements of three objects of the user are taken into consideration, namely data, identity and key, as shown in Fig. 5. Data is the most important object in the entire audit process, and dynamics, privacy and traceability should be satisfied in order to tally with reality. Dynamics guarantee that users can update cloud data at any time, privacy ensures that user data cannot be obtained by unauthorized entities, and traceability allows users to track changes in data. In regard to user identifiers, the identity of a user should be kept confidential, member dynamics allow new members to join and old members to withdraw, and identity traceability enables reward and punishment mechanisms to be implemented. As for the key, the expiration of the certificate requires users to actively update their keys, and key-exposure resistance must be satisfied when a key is leaked due to hardware vulnerabilities or improper behavior by members or software.

Fig. 5. The requirements of three objects in DIV research.
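
The probabilistic auditing discussed in Section 3.3 rests on the observation that challenging a fixed number of randomly chosen blocks already detects corruption with high probability. The following minimal sketch illustrates that relationship under a simple sampling-without-replacement model. The function name `detection_probability` and the example parameters (10,000 blocks, 1% of them corrupted) are illustrative assumptions and are not taken from any of the surveyed schemes.

```python
def detection_probability(n_blocks: int, n_corrupted: int, n_challenged: int) -> float:
    """Probability that a random challenge hits at least one corrupted block.

    Assumption: the verifier samples n_challenged distinct blocks uniformly at
    random (without replacement) out of n_blocks, of which n_corrupted are bad.
    """
    if not 0 <= n_corrupted <= n_blocks or not 0 <= n_challenged <= n_blocks:
        raise ValueError("counts must lie between 0 and n_blocks")
    miss = 1.0  # probability that every challenged block is intact
    for i in range(n_challenged):
        intact_left = n_blocks - n_corrupted - i
        if intact_left <= 0:
            return 1.0  # more challenges than intact blocks: a hit is certain
        miss *= intact_left / (n_blocks - i)
    return 1.0 - miss


if __name__ == "__main__":
    # Hypothetical file: 10,000 blocks, 1% of them corrupted.
    for challenged in (100, 300, 460):
        print(challenged, round(detection_probability(10_000, 100, challenged), 4))
```

Under this model the detection probability is driven mainly by the fraction of corrupted blocks and the number of challenged blocks rather than by the file size, which is why a fixed challenge size can be more economical than checking every block.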
4. Principal topics and technical means

In this section, we review the principal topics and the corresponding technical means that must be taken into consideration to implement a proper DIV construction from different perspectives, including signature construction for public auditing, verifiable structures for data dynamics, access control mechanisms for shared data, resistance methods for key exposure and blockchain technique for cloud data auditing, as shown in Fig. 6.

4.1. Signature construction for public auditing

HVT is the cornerstone of the existing DIV schemes because it can aggregate the tags of multiple data blocks into a single value, saving a lot of communication overhead. To improve audit efficiency, combining a signature technique with HVT to construct a homomorphic signature is a common method of auditing outsourced big data in current DIV schemes. According to our investigation of the existing DIV literature, almost all solutions for auditing cloud storage data adopt one of the following four construction methodologies, namely RSA-based homomorphic methods, BLS-based homomorphic methods, identity-based (ID-based) (Shamir, 1984) homomorphic methods and certificateless-based (CL-based) (Al-Riyami and Paterson, 2003) homomorphic methods.

(1) RSA-based homomorphic methods: For public auditing, Ateniese et al. (2007) proposed a sampling provable data possession scheme which combines the RSA signature with HVT, and the subsequent RSA-based solutions are mostly improvements on this solution. In the proposal, the necessary parameters are generated in the same way as in the RSA signature; the public key is computed as {N, g} and the private key as {e, d, v}. The user divides the file $F$ into $n$ blocks $F = (m_1, m_2, \ldots, m_n)$, where $m_i \in \mathbb{Z}_q^*$. For each block $m_i$, the user generates a tag as $\sigma_i = (h(w_i) \cdot u^{m_i})^d \bmod N$, where $h: \{0,1\}^* \to QR_N$ is a cryptographic hash function that maps strings uniformly to $QR_N$, $u$ is a generator of $QR_N$, and $w_i$ is generated by concatenating the index $i$ with the secret value. Here $w_i$ is different and unpredictable for each tag. Then the CSS stores the file and its tags. Later, the client can verify that the server possesses the file by generating a random challenge on a randomly selected set of file blocks. Using the queried blocks and their corresponding tags, the CSS generates the proof of possession. The client is thus convinced that the data is stored correctly without retrieving the data blocks. For replicated storage, Curtmola et al. (2008) also generated a tag for a block in the same way, $\sigma_{ij} = (h(w_{ij}) \cdot u^{m_{ij}})^d \bmod N$, where $m_{ij}$ represents the j-th block of the i-th replica. Some recent improvements based on RSA differ only in the construction of $h()$, such as $h(\mathit{Filename})$ (Zhang et al., 2016) and $h(\mathit{Filename} \| n \| i \| j \| g)$ (Abo-alian et al., 2017).
(2) BLS-based homomorphic methods: BLS has a shorter signature length than the RSA signature at the same level of security, so more auditing schemes adopt BLS-based homomorphic methods. Wang et al. (2011) proposed the first public auditability scheme that enables data dynamics based on BLS. In the construction, let $G, G_T$ be two multiplicative cyclic groups of order $p$, $e: G \times G \to G_T$ be a bilinear map, $h: \{0,1\}^* \to G$ be a cryptographic hash function that maps strings uniformly to $G$, and $g$ be a generator of $G$. The client selects $x \in \mathbb{Z}_p$ as its private key and computes $y = g^x \in G$ as its public key. Meanwhile a random element $u \in G$ is selected. For each block $m_i$, the user generates a tag as $\sigma_i = (h(m_i) \cdot u^{m_i})^x$. The elimination of the index $i$ enables the proposal to support data dynamics. Some other achievements based on BLS differ only in the construction of $h()$, such as $h(w_i)$ (Wang, 2013; Wang et al., 2014a), $h(id)$ (Wang et al., 2013a), $h(i)$ (Wu et al., 2016), and $h(V_i \| T_i)$ (Shen et al., 2017b). To optimize the audit performance, a general fragment structure is introduced for auditing cloud data (Yang and Jia, 2013). A file $F$ is first divided into $n$ blocks, and each block is further divided into $s$ sectors. The user still generates a tag for each block by computing $\sigma_i = (h(x, Fid \| i) \cdot \prod_{j=1}^{s} u_j^{m_{ij}})^x$, where $Fid$ is the data identifier, $m_{ij}$ represents the j-th sector of the i-th block in the file and $(u_1, u_2, \ldots, u_s)$ are randomly chosen by the user. Another similar solution is constructed as $\sigma_i = (h(m_i) \cdot \prod_{j=1}^{s} u_j^{m_{ij}})^x$ (Liu et al., 2014). With such a signature construction extended to the multiple-replica setting, Liu et al. (2015b) designed a multi-copy scheme for auditing dynamic big data storage on cloud by computing $\sigma_{ij} = (h(m_{ij}) \cdot u^{m_{ij}})^x$ as a block tag, where $m_{ij}$ represents the j-th block of the i-th replica. The adoption of the fragment structure and different constructions of $h()$ led to the emergence of other schemes for auditing replicated data based on the BLS signature, namely $\sigma_{ij} = (h(\mathit{Filename} \| n \| m \| u) \cdot u^{m_{ij}})^x$ (Barsoum and Hasan, 2010) and $\sigma_{ij} = (h(m_{ij}) \cdot \prod_{k=1}^{s} u_k^{m_{ijk}})^x$ (Barsoum and Hasan, 2011).
(3) ID-based homomorphic methods: Unlike RSA and BLS signatures, an ID-based signature requires no certificate. By exploiting ID-based cryptography to design PDP solutions, Wang et al. (2014b) converted an ID-based aggregate signature into a PDP scheme. A system model and a security model are defined in this scheme. In the initial stage
of the system, the private key generator (PKG) selects $x \in \mathbb{Z}_p$ as its private key and computes $y = g^x \in G$ as its public key. After receiving ID from the user, the PKG computes $R = g^r$ and $\sigma = r + x \cdot h(ID, R)$ and sends the private key $(R, \sigma)$ to the user, where $r \in \mathbb{Z}_p$ is selected randomly. For a block $m_i$, the user generates a tag as $\sigma_i = (h(i) \cdot u^{m_i})^{\sigma}$. In fact, the signature design process is the same as the solution based on BLS (Wu et al., 2016), except that the user's private key is generated by the PKG. Subsequently, Wang (2015) proposed another PDP protocol based on ID signature for multi-public cloud storage data integrity. The private key generation in this construction is the same as above (Wang et al., 2014b), but the signature structure is slightly different due to the adoption of partitioning technology and the collaboration among multiple servers. The user computes $\sigma_i = (h(N_i, CS_{l_i}, i) \cdot \prod_{j=1}^{s} u_j^{m_{ij}})^{\sigma}$ as a tag, since the tuple $(N_i, CS_{l_i}, i)$ keeps the blocks distinct from each other. However, the security of the scheme is not strong enough. In their security model, the challenged blocks are not provided for TagGen queries; in other words, the tags of these blocks cannot be accessed by an attacker, which is inconsistent with the actual capability of a cloud server. Yu et al. (2017b) put forward an ID-based public auditing implementation with perfect privacy preservation. A concrete ID-based auditing solution is presented by making use of the concept of asymmetric group key agreement. The PKG computes a private key $\sigma = h(ID)^x$ for the user. For each block $m_i$, the data owner computes $\sigma_i = \sigma^{m_i} \cdot h(\mathit{Filename} \| i)^{\eta}$, where $\eta \in \mathbb{Z}_p$ is randomly picked by the user. Embedding the serial number in the tag calculation makes data dynamics impossible.
(4) CL-based homomorphic methods: An ID-based scheme always requires a trusted entity to generate a private key for the user, thus the data signature can be forged easily once the entity is compromised, and this defect is resolved by CL cryptography. Wang et al. (2013b) proposed the first certificateless public auditing method of verifying the integrity of data in a cloud. A homomorphic authenticable certificateless signature is designed to enable a user/TPA to verify integrity without downloading the entire data set, which is not possible with traditional certificateless signatures. In the setup stage of the scheme, the Key Generation Center (KGC) selects $X \in \mathbb{Z}_p$ as its private key and computes $Y = g^X \in G$ as its public key. After receiving ID from the user, the KGC computes the partial key $D = h(ID)^X$ and sends it to the user. To generate the complete secret key, the user selects $x \in \mathbb{Z}_p$ as the other partial private key and computes $y = g^x \in G$ as its public key. Given a block $m_i$ with identifier $id$, the data owner computes $\sigma_i = (h_2(ID \| y \| id) \cdot u^{m_i})^x \cdot D$, which uses the complete secret key of the user. Although this solution solves the problem of user private key escrow by enabling the KGC to compute the partial key rather than the complete key, it fails to resist replacement attacks on users' public keys.

4.2. Verifiable structures for data dynamics

Cloud storage data is not immutable, and it is always updated according to the commands of its owner, such as insert, delete, and modify operations. For instance, data uploaded for the first time may be incomplete, with its owner hoping to complete it after uploading. On the other hand, data may become obsolete or useless over time, in which case its owner will want to delete it from the cloud, which is the inevitable result of using the pay-as-you-go model. In brief, data dynamics for secure cloud storage are of great necessity, and many efforts have been made in this area. In general, the following three structures are often used to implement dynamic auditing of outsourced data.

(1) MHT: MHT can efficiently and safely prove whether a series of elements are damaged or replaced. The leaf nodes of the tree store the hash values of the corresponding data blocks. An MHT is formed by calculating the pairwise hash values from the bottom up until a unique root hash value is obtained. The MHT is a verifiable data structure that has been extensively researched and later used to support dynamic data updates. The first utilization of MHT in secure cloud auditing was proposed by Wang et al. (2011). However, without proper block index verification, when the attacked block is corrupted, the malicious server can deceive the client by computing a valid proof from other blocks. Work by Liu et al. (2014) investigated fine-grained updates over outsourced data, but the scheme depends on the strong assumption that the CSS remains honest when answering queries on file blocks. Barsoum and Hasan (2011) applied the MHT to replicated data auditing: an MHT is constructed for each file copy, and the roots of these trees are then used to build a two-level hash tree. Such a simple application of MHT does not solve the serial number verification problem. To eliminate this defect, Liu et al. (2015b) designed nodes that store not only the hash value but also additional information, including the level of the node, the maximum number of leaf nodes reachable from this node, and a boolean value that indicates whether this node is to the right or left of its parent node on the verification path. Thus, a multi-copy MHT is constructed for efficient verification of data updates and block index authentication.
(2) RASL: RASL is a validation model that checks the integrity of file blocks and supports dynamic data updates. It is a tree-like hierarchical key-value store whose nodes are sorted according to their keys. Each node v in the skip list stores two pointers (right and down), represented by rgt(v) and dwn(v), so that a specific target in the leaf node can be located in the search process. Erway et al. (2009) designed the first dynamic PDP scheme based on RASL. In this data structure, each node v stores rgt(v), dwn(v), and f(v), where f(v) is generated by recursively applying a hash function to f(rgt(v)) and f(dwn(v)). The hash value of the root is the authentication information stored by the client to verify the response from the server.
Another certificateless solution for data integrity The main drawback of the scheme is the lack of verifying the
verification was invented by He et al. (2015). The KGC of the integrity of a single block. Abo-alian et al. (2017) improved the
scheme generates the partial key in a relatively complex way as RSAL to support replication data updates. The improvement
follows: 1) KGC selects a random number η 2 Zp and computes combines all the data blocks of all replicas in the same data
T ¼ η P, S ¼ η þ X hðID;TÞmodp, where P is a generator of G. 2) structure, thus enabling efficient auditing of updates for all rep-
KGC sends the partial key ðT; SÞ. Then the user produces a tag for licas simultaneously. Furthermore, the use of MRASL can signifi-
block mi with identifier id by computing σ i ¼ ðS þ h2 ðIDkykYÞ xÞ cantly reduce the communication overhead for update verification
ðhðidÞ þ mi hðYÞÞ, which is provably secure against two types of of cloud data with multiple copies.
adversaries (replacing the user's public key and accessing the (3) IHT: IHT records the changes in data blocks and helps to generate
master key). Unfortunately, a TPA can obtain user data by solving hash values for each block during validation. The structure of IHT
the linear equation, thus data privacy is not ensured in this is similar to one-dimensional array, which contains index number
scheme. Subsequently, He et al. (2017) presented a NO, block number Bi, version number Vi and random value Ri. It
privacy-preserving certificateless PDP scheme by computing the was first introduced into cloud storage auditing by Zhu et al.
user's public key using the user's two part private key rather than a (2013), which reduces computation and communication overhead
part of the private key (Wang et al., 2013b; He et al., 2015). Se- by storing the IHT in TPA instead of CSS. Insert and delete oper-
curity analysis showed that the scheme is provably secure. ations cause the adjustment of Bi, thus incurring the tag
re-computation of the affected blocks. A follow-up plan is to
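
The cost of IHT insertions noted above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not the construction of Zhu et al. (2013): the names `Record` and `IndexHashTable` are hypothetical, SHA-256 stands in for the scheme's hash function, and block numbers are assumed to equal table positions, so an insertion renumbers every later record.

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    no: int    # index number NO (position in the table)
    b: int     # block number B_i
    v: int     # version number V_i
    r: bytes   # random value R_i


class IndexHashTable:
    """Toy index hash table; assumes B_i always equals the table position."""

    def __init__(self, n_blocks: int) -> None:
        self.rows = [Record(i, i, 1, secrets.token_bytes(16))
                     for i in range(1, n_blocks + 1)]

    @staticmethod
    def block_hash(row: Record) -> str:
        # Assumed binding of (B_i, V_i, R_i); a real tag would also cover the data.
        payload = f"{row.b}|{row.v}|".encode() + row.r
        return hashlib.sha256(payload).hexdigest()

    def modify(self, no: int) -> List[int]:
        # Modification only bumps the version and refreshes the random value.
        row = self.rows[no - 1]
        row.v += 1
        row.r = secrets.token_bytes(16)
        return [row.no]

    def insert(self, no: int) -> List[int]:
        # Insertion shifts every later record, so all of their hashes change.
        self.rows.insert(no - 1, Record(no, no, 1, secrets.token_bytes(16)))
        affected = [no]
        for pos in range(no, len(self.rows)):
            row = self.rows[pos]
            row.no = row.b = pos + 1
            affected.append(row.no)
        return affected


table = IndexHashTable(5)
touched_by_modify = table.modify(4)   # one record affected
touched_by_insert = table.insert(2)   # the new record and every later one
print(touched_by_modify, touched_by_insert)
```

In this model a modification touches a single record, whereas an insertion near the front of the table touches almost all of them, which is the overhead attributed to IHT-based dynamics in the text.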
L. Zhou et al. Journal of Network and Computer Applications 122 (2018) 1–15
technology can support efficient key updating.

4.5. Blockchain technique for cloud data auditing

Blockchain is a data structure that combines data blocks in chronological order (Yang et al., 2018; Bhaskaran et al., 2018). It is cryptographically guaranteed to be irrevocable and unforgeable, and it can store simple data that can be verified within the system in a sequential manner. Using blockchain for cloud data sharing is an attractive research topic due to the remarkable advantages of blockchain, such as distributed fault tolerance, non-tamperability and privacy protection. Inspired by this, Liang et al. (2017) used blockchain for data sharing and collaboration in mobile healthcare applications. A tree-based data processing and batching method is adopted to deal with the large amounts of data coming from mobile devices. However, this solution lacks a concrete implementation process for the framework. Combining blockchain and the (p, t)-threshold secret sharing technique, Zheng et al. (2018) realized a privacy-preserving data sharing scheme, which achieves data privacy by integrating the Paillier cryptosystem. Before uploading files into the cloud, the user encrypts the data, and the key is allocated by the trusted CA. The hash value of the data is recorded in the blockchain to ensure that the data is not corrupted. Unfortunately, the encryption of cloud data and the recording of data hash values make it difficult to support data dynamics. Recently, Huang et al. (2017c) proposed a customized data sharing scheme based on blockchain. In the scheme, the blockchain is adopted to ensure fairness in incentives, and customization enables the classification of rights over shared data.

4.6. Discussion

Based on the above analysis, the main way to study DIV is to explore how to apply and improve existing cryptographic techniques to meet different functional requirements or to enhance security or efficiency. In almost all methodologies, signature construction is the foundation for ensuring data integrity verification. Recently, combining ID/CL signatures with HVT has become a promising alternative because it eliminates the need for certificate management, thereby enhancing audit efficiency. With the pursuit of dynamic data, several verifiable data structures have been designed to ensure that users can effectively update and verify the uploaded data. In addition, access control exploiting IBE/ABE/PRE, key boycott methods, and cloud data sharing using blockchain are valuable topics in the field of DIV research.

5. Open problems and challenges

In this section, we present some open problems and challenges that may be encountered in future research. These challenges have been discussed in the preceding literature analysis, but not in depth; they may suggest some possible directions of DIV research for interested readers.

5.1. Data dynamics and traceability

The CSS must update data strictly according to users' requirements to ensure the correctness and timeliness of data. However, the main existing methods that support dynamics, such as MHT, RASL, and IHT, each have their own inherent flaws. Both MHT and RASL require large amounts of supplementary validation information to ensure the validity of data updates. IHT is only effective for modification, because insertion and deletion operations destroy the sequence structure of the original table, which adds extra overhead. In addition, the cloud only stores the most recent version of the data, while historic versions are abandoned. However, under certain circumstances, users not only require the CSS to return the latest data block, but also hope to access historic versions, which necessitates data traceability. Existing DIV solutions cannot achieve traceability efficiently. Therefore, supporting efficient data updates while realizing data traceability is a valuable issue for future development.

5.2. Supporting flexible storage selection

In most cases, valuable data accounts for only a small fraction of the total. Despite the necessity of protecting data privacy and security, it is uneconomical for a user to encrypt entire files or to store many replicas in a cloud server for all data. With competition in the cloud services market growing increasingly fierce, one way to attract customers to a CSS is to furnish cloud clients with selectable encrypted storage and selectable multiple-replica storage. This means that users are allowed to encrypt only the sensitive part of their information rather than a whole data set, and to select the number of copies stored in the CSS according to the value of each part of the data.

5.3. Supporting data deduplication

Deduplication is a technique for eliminating duplicate data, which has been widely used in cloud storage to reduce storage space and upload bandwidth. For cloud service providers, it is necessary to delete redundant data by means of data deduplication while providing users with data integrity assurance (Hashem et al., 2015). There are several public auditing schemes supporting deduplication (Liu et al., 2017; Daniel and Vasanthi, 2017); however, their efficiency and security remain to be improved.

5.4. Supporting data recovery

When users discover that their cloud storage data is corrupted, they expect the damaged data to be fully recovered. Existing integrity techniques achieving data recovery (Yuan and Yu, 2013; Ren et al., 2015) are based on encoding methods, such as error correcting codes or network coding, but they are suitable only for a low percentage of damage and incur a large computational overhead.

5.5. Supporting fog-based public cloud auditing

By using fog computing (Bonomi et al., 2012), cloud computing can be extended to the edge of the network, which means that the user can quickly and easily obtain the corresponding service. A cloud user may have multiple devices that can be used as fog nodes, and may wish to use the idle resources of these devices to assist cloud servers, thus reducing the fees paid to cloud service providers. Recently, Wang et al. (2017b) proposed an anonymous and secure aggregation scheme in fog-based public cloud computing, but it relies on the strong assumption that cloud servers are safe and reliable, which does not match reality. Hence, a secure public auditing scheme remains an open problem for fog-based cloud computing.

6. Conclusions

Due to the emergence of cloud computing, many users are moving their big data from local storage to the cloud for the convenient and flexible services provided by a CSS. However, users do not trust a CSS completely because they lack direct control over their big data. DIV technology has received increasing attention because it is an essential requirement for ensuring that users can trust a CSS. So far, there has been much research on DIV. All schemes can be divided into private auditing and public auditing, and the literature focuses on the latter as it releases resource-constrained users from a heavy burden. The purpose of DIV is to enable users or TPAs to audit the integrity of data in the CSS with low overhead on the users' side. This paper introduces the concept and system model of DIV. The research content of DIV is outlined by defining seven topics. Further, a clear and progressive classification of DIV approaches is presented based on user mode and storage type. In addition, some principal topics and the corresponding technical means are identified. Finally, we discuss open problems and challenges to offer some valuable ideas for further investigation.
References

Abo-alian, A., Badr, N.L., Tolba, M.F., 2017. Integrity as a service for replicated data on the cloud. Concurr. Comput. Pract. Ex. 29 (4).
Al-Riyami, S.S., Paterson, K.G., 2003. Certificateless public key cryptography. Asiacrypt 2894, 452–473.
Armbrust, M., Fox, A., Griffith, R., et al., 2010. A view of cloud computing. Commun. ACM 53 (4), 50–58.
Ateniese, G., Burns, R., Curtmola, R., et al., 2007. Provable data possession at untrusted stores. In: Proceedings of ACM CCS 2007. ACM, pp. 598–609.
Ateniese, G., Pietro, R.D., Mancini, L.V., et al., 2008. Scalable and efficient provable data possession. In: Proceedings of the 4th ICST Secure Comm. ACM.
Aujla, G.S., Chaudhary, R., Kumar, N., et al., 2018. SecSVA: secure storage, verification, and auditing of big data in the cloud environment. IEEE Commun. Mag. 56 (1), 78–85.
Barsoum, A.F., Hasan, M.A., 2010. Provable Possession and Replication of Data over Cloud Servers, vol. 32. Centre For Applied Cryptographic Research (CACR), University of Waterloo. Report.
Barsoum, A.F., Hasan, M.A., 2011. On verifying dynamic multiple data copies over cloud servers. IACR Cryptol. ePrint Arch. 447.
Bhaskaran, K., Ilfrich, P., Liffman, D., 2018. Double-blind consent-driven data sharing on blockchain. In: Cloud Engineering (IC2E), 2018 IEEE International Conference on. IEEE, pp. 385–439.
Blaze, M., Bleumer, G., Strauss, M., 1998. Divertible protocols and atomic proxy cryptography. In: Advances in Cryptology-EUROCRYPT’98. Springer, pp. 127–144.
Boneh, D., Franklin, M., 2001. Identity-based encryption from the Weil pairing. In: Advances in Cryptology-CRYPTO 2001. Springer, pp. 213–229.
Bonomi, F., Milito, R., Zhu, J., et al., 2012. Fog computing and its role in the Internet of things. In: Proceedings of MCC’12, Helsinki, Finland, Aug. 13-17, pp. 13–16.
Brickell, E., Camenisch, J., Chen, L., 2004. Direct anonymous attestation. In: Proceedings of the 11th ACM Conference on Computer and Communications Security. ACM, pp. 132–145.
Cai, H., Xu, B., Jiang, L., et al., 2017. IoT-based big data storage systems in cloud computing: perspectives and challenges. IEEE Internet Things J. 4 (1), 75–87.
Chen, X., Shang, T., Kim, I., et al., 2014. A remote data integrity checking scheme for big data storage. IEEE Trans. Parallel Distr. Syst. 25 (9), 2234–2244.
Chen, H.F., Lin, B.G., Yang, Y., et al., 2014. Public batch auditing for 2M-PDP based on BLS in cloud storage. J. Cryptol. Res. 1 (4), 368–378.
Chen, Z., Fu, A., Xiao, K., et al., 2018. Secure and verifiable outsourcing of large-scale matrix inversion without precondition in cloud computing. In: Communications (ICC), 2018 IEEE International Conference on. IEEE, pp. 1–6.
Curtmola, R., Khan, O., Burns, R., et al., 2008. MR-PDP: multiple-replica provable data possession. In: The 28th International Conference on Distributed Computing Systems. IEEE, pp. 411–420.
Daniel, E., Vasanthi, N.A., 2017. LDAP: a lightweight deduplication and auditing protocol for secure data storage in cloud environment. Cluster Comput. 1–12. https://doi.org/10.1007/s10586-017-1382-6.
Erway, C.C., Küpçü, A., Papamanthou, C., et al., 2009. Dynamic provable data possession. In: Proceedings of the 16th Conf. Computer and Comm. Security. ACM, pp. 213–222.
Etemad, M., Küpçü, A., 2013. Transparent, distributed, and replicated dynamic provable data possession. In: International Conference on Applied Cryptography and Network Security. Springer, Berlin, Heidelberg, pp. 1–18.
Fu, A., Yu, S., Zhang, Y., et al., 2017. NPP: a new privacy-aware public auditing scheme for cloud data sharing with group users. IEEE Trans. Big Data. https://doi.org/10.1109/TBDATA.2017.2701347.
Fu, A., Li, Y., Yu, S., et al., 2018. DIPOR: an IDA-based dynamic proof of retrievability scheme for cloud storage systems. J. Netw. Comput. Appl. 104, 97–106.
Fu, A., Li, S., Yu, S., et al., 2018. Privacy-preserving composite modular exponentiation outsourcing with optimal checkability in single untrusted cloud server. J. Netw. Comput. Appl. 118, 102–112.
Goyal, V., Pandey, O., Sahai, A., et al., 2006. Attribute-based encryption for fine-grained access control of encrypted data. In: Proceedings of the 13th ACM Conference on Computer and Communications Security. ACM, pp. 89–98.
Hamming, R.W., 1950. Error detecting and error correcting codes. Bell Labs Tech. J. 29 (2), 147–160.
Hashem, I.A.T., Yaqoob, I., Anuar, N.B., et al., 2015. The rise of “big data” on cloud computing: review and open research issues. Inf. Syst. 47, 98–115.
He, D., Zeadally, S., Wu, L., 2015. Certificateless public auditing scheme for cloud-assisted wireless body area networks. IEEE Syst. J. 1–10. https://doi.org/10.1109/JSYST.2015.2428620.
He, D., Kumar, N., Wang, H., et al., 2017. Privacy-preserving certificateless provable data possession scheme for big data storage on cloud. Appl. Math. Comput. 314, 31–43.
Hu, C., Li, W., Cheng, X., et al., 2017. A secure and verifiable access control scheme for big data storage in clouds. IEEE Trans. Big Data. https://doi.org/10.1109/TBDATA.2016.2621106.
Huang, L., Zhang, G., Fu, A., 2017. Privacy-preserving public auditing for non-manager group. In: Communications (ICC), 2017 IEEE International Conference on. IEEE, pp. 1–6.
Huang, L., Zhang, G., Fu, A., 2017. Certificateless public verification scheme with privacy-preserving and message recovery for dynamic group. In: Proceedings of the Australasian Computer Science Week, vol. 76. ACM.
Huang, L., Zhang, G., Yu, S., et al., 2017. SeShare: secure cloud data sharing based on blockchain and public auditing. Concurrency Comput. Pract. Ex. https://doi.org/10.1002/cpe.4359.
Huang, L., Zhang, G., Fu, A., 2018. Privacy-preserving public auditing for non-manager group shared data. Wireless Pers. Commun. 100 (4), 1277–1294.
Jaggi, S., Sanders, P., Chou, P.A., et al., 2005. Polynomial time algorithms for multicast network code construction. IEEE Trans. Inf. Theor. 51 (6), 1973–1982.
Jiang, T., Chen, X., Ma, J., 2016. Public integrity auditing for shared dynamic cloud data with group user revocation. IEEE Trans. Comput. 65 (8), 2363–2373.
Khan, M.A., 2016. A survey of security issues for cloud computing. J. Netw. Comput. Appl. 71, 11–29.
Li, Y., Fu, A., Yu, Y., et al., 2017. IPOR: an efficient IDA-based proof of retrievability scheme for cloud storage systems. In: Communications (ICC), 2017 IEEE International Conference on. IEEE, pp. 1–6.
Li, L., Yang, Y., Wu, Z., 2017. FMR-PDP: flexible multiple-replica provable data possession in cloud storage. In: Computers and Communications (ISCC), 2017 IEEE Symposium on. IEEE, pp. 1115–1121.
Liang, X., Zhao, J., Shetty, S., et al., 2017. Integrating blockchain for data sharing and collaboration in mobile healthcare applications. In: Personal, Indoor, and Mobile Radio Communications (PIMRC), 2017 IEEE 28th Annual International Symposium on. IEEE, pp. 1–5.
Liu, C., Ranjan, R., Zhang, X., et al., 2013. Public auditing for big data storage in cloud computing–A survey. In: Computational Science and Engineering (CSE), 2013 IEEE 16th International Conference on. IEEE, pp. 1128–1135.
Liu, C., Chen, J., Yang, L.T., et al., 2014. Authorized public auditing of dynamic big data storage on cloud with efficient verifiable fine-grained updates. IEEE Trans. Parallel Distr. Syst. 25 (9), 2234–2244.
Liu, C., Yang, C., Zhang, X., et al., 2015. External integrity verification for outsourced big data in cloud and IoT: a big picture. Future Generat. Comput. Syst. 49, 58–67.
Liu, C., Ranjan, R., Yang, C., et al., 2015. MuR-DPA: top-down levelled multi-replica merkle hash tree based secure public auditing for dynamic big data storage on cloud. IEEE Trans. Comput. 64 (9), 2609–2622.
Liu, C.W., Hsien, W.F., Yang, C.C., et al., 2016. A survey of public auditing for shared data storage with user revocation in cloud computing. IJ Netw. Secur. 18 (4), 650–666.
Liu, X., Sun, W., Lou, W., et al., 2017. One-tag checker: message-locked integrity auditing on encrypted cloud deduplication storage. IEEE Conf. Comput. Commun., IEEE 1–9.
Lu, R., Zhu, H., Liu, X., et al., 2014. Toward efficient and privacy-preserving computing in big data era. IEEE Netw. 28 (4), 46–50.
Nachiappan, R., Javadi, B., Calheiros, R.N., et al., 2017. Cloud storage reliability for Big Data applications: a state of the art survey. J. Netw. Comput. Appl. 97, 35–47.
Peng, S., Zhou, F., Wang, Q., 2017. Identity-based public multi-replica provable data possession. IEEE Access 5, 26990–27001.
Ren, Z., Wang, L., Wang, Q., et al., 2015. Dynamic proofs of retrievability for coded cloud storage systems. IEEE Trans. Serv. Comput. https://doi.org/10.1109/TSC.2015.2481880.
Rivest, R., Shamir, A., Adleman, L., 1978. A method for obtaining digital signatures and public key cryptosystems. Commun. ACM 21 (2), 120–126.
Shamir, A., 1984. Identity-based cryptosystems and signature schemes. ICrypto 84, 47–53.
Shen, J., Zhou, T., Chen, X., et al., 2017. Anonymous and traceable group data sharing in cloud computing. IEEE Trans. Inf. Forensics Secur. https://doi.org/10.1109/TIFS.2017.2774439.
Shen, J., Shen, J., Chen, X., et al., 2017. An efficient public auditing protocol with novel dynamic structure for cloud data. IEEE Trans. Inf. Forensics Secur. 12 (10), 2402–2415.
Shen, W., Qin, J., Yu, J., et al., 2018. Enabling identity-based integrity auditing and data sharing with sensitive information hiding for secure cloud storage. IEEE Trans. Inf. Forensics Secur. https://doi.org/10.1109/TIFS.2018.2850312.
Singh, Y., Kandah, F., Zhang, W., 2011. A secured cost-effective multi-cloud storage in cloud computing. In: 2010 IEEE Conference on INFOCOM WKSHPS. IEEE, pp. 619–624.
Singh, S., Jeong, Y.S., Park, J.H., 2016. A survey on cloud computing security: issues, threats, and solutions. J. Netw. Comput. Appl. 75, 200–222.
Song, W., Wang, B., Wang, Q., et al., 2017. Tell me the truth: practically public authentication for outsourced databases with multi-user modification. Inf. Sci. 387, 221–237.
Sookhak, M., Talebian, H., Ahmed, E., et al., 2014. A review on remote data auditing in single cloud server: taxonomy and open issues. J. Netw. Comput. Appl. 43, 121–141.
Sookhak, M., Yu, F.R., Zomaya, A.Y., 2018. Auditing big data storage in cloud computing using divide and conquer tables. IEEE Trans. Parallel Distr. Syst. 29 (5), 999–1012.
Subashini, S., Kavitha, V., 2011. A survey on security issues in service delivery models of cloud computing. J. Netw. Comput. Appl. 34 (1), 1–11.
The MD5 Message-Digest Algorithm (RFC1321), URL https://tools.ietf.org/html/rfc1321.
Tian, H., Chen, Y., Chang, C.C., et al., 2015. Dynamic-hash-table based public auditing for secure cloud storage. IEEE Trans. Serv. Comput. 10 (5), 701–714.
Wang, H., 2013. Proxy provable data possession in public clouds. IEEE Trans. Serv. Comput. 6 (4), 551–559.
Wang, H., 2015. Identity-based distributed provable data possession in multicloud storage. IEEE Trans. Serv. Comput. 8 (2), 328–340.
Wang, Q., Wang, C., Ren, K., et al., 2011. Enabling public auditability and data dynamics for storage security in cloud computing. IEEE Trans. Parallel Distr. Syst. 22 (5), 847–859.
Wang, B., Li, B., Li, H., 2012. Oruta: privacy-preserving public auditing for shared data in the cloud. In: Proceedings of the 5th IEEE International Conference on Cloud Computing. IEEE, pp. 295–302.
Wang, B., Li, B., Li, H., 2012. Knox: privacy-preserving auditing for shared data with large groups in the cloud. In: Applied Cryptography and Network Security. Springer, pp. 507–525.
Wang, B., Li, B., Li, H., 2013. Public auditing for shared data with efficient user revocation in the cloud. In: Proceedings of the 32nd IEEE INFOCOM. IEEE, pp. 2904–2912.
Wang, B., Li, B., Li, H., et al., 2013. Certificateless public auditing for data integrity in the cloud. In: 2013 IEEE Conference on CNS. IEEE, pp. 136–144.
Wang, H., Wu, Q., Qin, B., et al., 2014. FRR: fair remote retrieval of outsourced private medical records in electronic health networks. J. Biomed. Inf. 50, 226–233.
Wang, H., Wu, Q., Qin, B., et al., 2014. Identity-based remote data possession checking in public clouds. IET Inf. Secur. 8 (2), 114–121.
Wang, H., He, D., Yu, J., et al., 2016. Incentive and unconditionally anonymous identity-based public provable data possession. IEEE Trans. Serv. Comput. 1, 1–11.
Wang, Q., Peng, L., Xiong, H., et al., 2017a. Ciphertext-policy attribute-based encryption with delegated equality test in cloud computing. IEEE Access 6, 760–771.
Wang, H., Wang, Z., Domingo-Ferrer, J., 2017b. Anonymous and secure aggregation scheme in fog-based public cloud computing. Future Generat. Comput. Syst. https://doi.org/10.1016/j.future.2017.02.032.
Wu, G., Mu, Y., Susilo, W., et al., 2016. Privacy-preserving cloud auditing with multiple uploaders. In: International Conference on Information Security Practice and Experience. Springer International Publishing, pp. 224–237.
Yang, K., Jia, X., 2013. An efficient and secure dynamic auditing protocol for data storage in cloud computing. IEEE Trans. Parallel Distr. Syst. 24 (9), 1717–1726.
Yang, K., Jia, X., Ren, K., 2015. Secure and verifiable policy update outsourcing for big data access control in the cloud. IEEE Trans. Parallel Distr. Syst. 26 (12), 2751–2763.
Yang, G., Yu, J., Shen, W., et al., 2016. Enabling public auditing for shared data in cloud storage supporting identity privacy and traceability. J. Syst. Software 113, 130–139.
Yang, C., Chen, X., Xiang, Y., 2018. Blockchain-based publicly verifiable data deletion scheme for cloud storage. J. Netw. Comput. Appl. 103, 185–193.
Yu, S., 2016. Big privacy: challenges and opportunities of privacy study in the age of big data. IEEE Access 4, 2751–2763.
Yu, J., Wang, H., 2017. Strong key-exposure resilient auditing for secure cloud storage. IEEE Trans. Inf. Forensics Secur. 12 (8), 1931–1940.
Yu, J., Ren, K., Wang, C., 2015. Enabling cloud storage auditing with key-exposure resistance. IEEE Trans. Inf. Forensics Secur. 10 (6), 1167–1179.
Yu, J., Ren, K., Wang, C., 2016. Enabling cloud storage auditing with verifiable outsourcing of key updates. IEEE Trans. Inf. Forensics Secur. 11 (6), 1362–1375.
Yu, S., Liu, M., Dou, W., et al., 2017. Networking for big data: a survey. IEEE Commun. Surv. Tutorials 19 (1), 531–549.
Yu, Y., Au, M.H., Ateniese, G., et al., 2017. Identity-based remote data integrity checking with perfect data privacy preserving for cloud storage. IEEE Trans. Inf. Forensics Secur. 12 (4), 767–778.
Yuan, J., Yu, S., 2013. Proofs of retrievability with public verifiability and constant communication cost in cloud. In: Proceedings of the 2013 International Workshop on Security in Cloud Computing. ACM, pp. 19–26.
Yuan, J., Yu, S., 2014. Efficient public integrity checking for cloud data sharing with multi-user modification. In: Proceedings of INFOCOM. IEEE, pp. 2121–2129.
Zha, Y.X., Luo, S.S., Bian, J.C., et al., 2015. Multiuser and multiple-replica provable data possession scheme based on multi-branch authentication tree. J. Commun. 36 (11), 80–91.
Zhang, Y., Ni, J., Tao, X., et al., 2016. Provable multiple replication data possession with full dynamics for secure cloud storage. Concurrency Comput. Pract. Ex. 28 (4), 1161–1173.
Zhang, Y., Yu, J., Hao, R., et al., 2018. Enabling efficient user revocation in identity-based cloud storage auditing for shared big data. IEEE Trans. Dependable Secure Comput. https://doi.org/10.1109/TDSC.2018.2829880.
Zheng, B.K., Zhu, L.H., Shen, M., et al., 2018. Scalable and privacy-preserving data sharing based on blockchain. J. Comput. Sci. Technol. 33 (3), 557–567.
Zhu, Y., Wang, H., Hu, Z., et al., 2010. Efficient provable data possession for hybrid clouds. In: Proceedings of the 17th ACM Conference on Computer and Communications Security. ACM, pp. 756–758.
Zhu, Y., Wang, H., Hu, Z., et al., 2013. Dynamic audit services for outsourced storage in clouds. IEEE Trans. Serv. Comput. 6 (2), 227–238.
Zhu, Y., Fu, A., Yu, S., et al., 2018. New algorithm for secure outsourcing of modular exponentiation with optimal checkability based on single untrusted server. In: Communications (ICC), 2018 IEEE International Conference on. IEEE, pp. 1–6.
Zuo, C., Shao, J., Liu, J.K., et al., 2018. Fine-grained two-factor protection mechanism for data sharing in cloud storage. IEEE Trans. Inf. Forensics Secur. 13 (1), 186–196.

Lei Zhou is currently a Ph.D. student in School of Computer Science and Engineering, Nanjing University of Science and Technology, China. She received her B.S. degree in Network Engineering from Nanjing University of Science and Technology, China, in 2015. Her research interest includes cloud computing.

Anmin Fu received the Ph.D. degree in Information Security from Xidian University in 2011. From 2017 to 2018, he was a Visiting Research Fellow with the University of Wollongong, Australia. He is currently an associate professor and supervisor of Ph.D. students of Nanjing University of Science and Technology, China. Dr Fu's research interest includes IoT Security, Cloud Computing Security and Privacy Preserving. He has published more than 50 technical papers, including international journals and conferences, such as IEEE Transactions on Vehicular Technology, IEEE Transactions on Big Data, IEEE Communications Letters, Journal of Network and Computer Applications, Computers & Security, Cluster Computing, Security and Communication Networks, IEEE ICC, and IEEE GLOBECOM.

Shui Yu is currently a full Professor of School of Software, University of Technology Sydney, Australia. Dr Yu's research interest includes Security and Privacy, Networking, Big Data, and Mathematical Modelling. He has published two monographs and edited two books, more than 200 technical papers, including top journals and top conferences, such as IEEE TPDS, TC, TIFS, TMC, TKDE, TETC, ToN, and INFOCOM. Dr Yu initiated the research field of networking for big data in 2013. His h-index is 32. Dr Yu actively serves his research communities in various roles. He served IEEE Transactions on Parallel and Distributed Systems as an AE (2013–2015), and is currently serving the editorial boards of Journal of Network and Computer Applications, IEEE Communications Surveys and Tutorials (exemplary editor for 2014), IEEE Access, IEEE Internet of Thing Journal, IEEE Communications Letters (exemplary editor for 2016), and a number of other international journals. Moreover, he has organized several Special Issues either on big data or cybersecurity. He has served more than 70 international conferences as a member of organizing committee, such as publication chair for IEEE Globecom 2015 and IEEE INFOCOM 2016 and 2017, TPC co-chair for IEEE BigDataService 2015, IEEE ITNAC 2015, and General chair for ACSW 2017. He is a Senior Member of IEEE, a member of AAAS and ACM, the Vice Chair of Technical Committee on Big Data of IEEE Communication Society, and a Distinguished Lecturer of IEEE Communication Society.

Mang Su received her Ph.D. degree in Cryptography from Xidian University in 2014. Currently, she is working as a lecturer in School of Computer Science and Engineering at Nanjing University of Science and Technology. Her current research interests include access control and cloud computing.

Boyu Kuang is currently a M.S. student in School of Computer Science and Engineering, Nanjing University of Science and Technology, China. He received his B.S. degree in Computer Science and Technology from Nanjing University of Science and Technology, China, in 2016. His research interest includes cloud computing.