Secure Data Analytics for Cloud-Integrated Internet of Things Applications

Article in IEEE Cloud Computing · March 2016
DOI: 10.1109/MCC.2016.30



Cloud-Assisted Internet of Things

Secure Data Analytics for Cloud-Integrated Internet of Things Applications

Heshan Kumarage, Ibrahim Khalil, Abdulatif Alabdulatif, Zahir Tari, and Xun Yi, RMIT University

This invited article describes how fully homomorphic encryption with efficient data processing models can help achieve data security and privacy in cloud-based data analytics systems for the Internet of Things.

IEEE Cloud Computing, published by the IEEE Computer Society. 2325-6095/16/$33.00 © 2016 IEEE

The Internet of Things (IoT) represents a paradigm for a smart world in which ubiquitous and pervasive computing and communication occur over a global network of heterogeneous and interconnected entities. The key enabling technologies for IoT are wireless sensor networks (WSN), machine-to-machine interfaces (M2M), RFID, and embedded sensor and actuator designs that enable sensing, command/control interaction, data communication/transfer, and limited analytical services [1]. The main application areas that are being disrupted through the enabling of IoT include power systems and smart grid, e-health and assisted living systems, and large-scale industrial and environmental monitoring applications [1,2]. With the advent of constant Internet connectivity coupled with ubiquitous sensing, these applications are now producing vast amounts of data that have to be communicated, stored, processed, and analyzed in a secure and efficient manner to reach the targeted levels of smart functionality envisioned through the IoT.

Cloud computing technology has also been steadily growing, becoming a mature service platform with worldwide spending upward of US$170 billion in 2014 [3]. Cloud resources provide a pervasive, convenient, and reliable platform for high-performance computing and storage that's scalable and accessible on demand anywhere. The integration of cloud computing services over large-scale IoT implementations will help achieve the required computational and storage needs for effectively analyzing large amounts of generated data. This will enable smart functionality over a wide variety of applications that are set to be revolutionized by the adoption of this emerging paradigm of cloud-assisted IoT implementations.

The integration of cloud and IoT in providing effective service management and composition makes it possible to

• deliver real-life IoT-based services in a distributed, dynamic, and responsive manner;
• provide an effective middle layer between IoT devices/objects and applications by hiding or abstracting the complex functionality of service implementation;
• streamline, enhance, and innovate through communication, computation, and storage drivers for IoT in achieving scalability, flexibility, interoperability, reliability, and efficiency.

Therefore, the cloud facilitates the implementation of complex, data-driven analytic models at low cost in a dynamic and scalable manner by connecting IoT sensing and data collection with powerful communication, computation, and storage. Figure 1 shows the main application domains that can directly benefit from cloud-IoT integration.

[Figure 1 diagram: application domains (smart cities and logistics; industrial process/machine monitoring; healthcare monitoring/assisted living and e-health applications; power systems/smart grid; environmental and meteorological monitoring) connected to secure large-scale data analytics as a service (billing, diagnosis, classification, aggregation, statistics).]

Figure 1. Application domains of cloud-integrated Internet of Things. These applications generate large volumes of data through ubiquitous sensing. To effectively perform analytics and enable smart functionality, they require access to large storage and high computational power that's unavailable at the host application.


Challenges

However, the integration of IoT with cloud services introduces some significant challenges, with security and privacy being the foremost. Within the IoT paradigm, everything, including final dependability, relies on the integrity of the data, which drives the decision-making processes that enable smart functionality. The data should be secure and private as it's generated, communicated, stored, and analyzed within the complete integrated environment of public cloud facilities and IoT application domains. Most of the key IoT applications, which range from personal-level systems like e-health and assisted living to industrial-level systems like smart grid and machine/process monitoring, require the generated data to be both private and secure. Therefore, with cloud integration, major concerns arise owing to the lack of trust between clients and cloud service providers (CSPs) and the lack of detailed knowledge about service-level agreements (SLAs), as well as about the location of the data and therefore the rights (such as jurisdiction) applicable to it [4]. Data retention is also a concern for these applications in instances where the CSP might keep deleted data in backups or retain data for some other unpublished or proprietary system-level reason based on its internal service implementations. Multitenancy of different client services in the same datacenter can also lead to the compromise of security and privacy through information leakage among different entities. Further, because the cloud platform is an open and public infrastructure, it's susceptible to various malicious attacks (for example, SQL injection, session riding, side-channel, and denial-of-service or distributed DoS attacks) that directly affect the data and the subsequent analytic processes [2,4].

A promising approach to ensuring data security and privacy in the context of cloud-assisted (integrated) IoT is therefore to use cryptographic methods to encrypt the data prior to communication, storage, or analysis in public cloud facilities. With traditional cryptographic schemes, the generated data is encrypted at the point of origin for relevant IoT applications and only the encrypted data is sent to the cloud for enabling relevant services. However, for any meaningful analysis of the data to be performed, the users/clients must share their encryption/decryption keys with the CSPs, giving them unrestricted access to that data. Although this will guarantee the data's security and integrity in transit, any proprietary or private information (such as industrial monitoring details or sensitive patient health information) will be revealed to the CSP or any other third parties involved in providing the required services. This isn't desirable in many of the applications in the IoT context. Personalized IoT applications such as e-health and assisted living environments contain patient-specific private data that shouldn't be disclosed to any external parties. Data sensitivity is paramount and client privacy should be fully ensured.

Industrial-scale applications, such as machine/process monitoring and smart grid or power systems, also generate large volumes of data that's sensitive from an industrial espionage perspective or in ensuring end system security. Exposure of this data will reveal private customer information that could be used to infer statistics on company performance and details of proprietary systems and networks employed, which adversaries could later use maliciously [2]. This data therefore can't be exposed to third parties, and vulnerabilities to information leakage aren't tolerable. Further, insider attacks are always a possibility. An insider in this context can be a CSP employee who exploits system vulnerabilities to gain data access or launch attacks with malicious intent toward the client organization. Having multiple tenants on a particular datacenter makes the data more vulnerable to these kinds of threats and makes the threats even more difficult to detect or prevent. Therefore, traditional cryptographic methods that require sharing keys with the CSP are untenable for cloud-assisted IoT applications that mandate the data to be encrypted on the cloud. This opens up the possibility to use innovative homomorphic encryption schemes to securely facilitate the service provisioning of cloud-integrated IoT.

Cloud IoT Integration with Homomorphic Encryption

Homomorphic encryption enables operations to be performed on the ciphertext itself. That is, specific mathematical operations can be performed on the encrypted data, which in turn yields an encrypted result. Once decrypted, this result will be equivalent to that obtained if the operation is performed on the plaintext without any encryption. Therefore, encrypting the data using homomorphic encryption is an ideal solution to facilitate cloud-based service provisioning for IoT applications without compromising security or privacy. However, many of the proposed homomorphic encryption schemes support only a single mathematical operation at a time. For example, Paillier is additively homomorphic, whereas RSA and ElGamal are multiplicatively homomorphic [5]. In some other methods, once an operation is performed you can't perform further operations (the same or different) on the data without compromising accuracy. These homomorphic methods support several mathematical operations, such as addition and multiplication,
but are limited due to accuracy degradation over multiple iterations.

The earliest encryption method with homomorphic properties was introduced in 1978 by Ronald Rivest [6], and the first fully homomorphic encryption (FHE) scheme was proposed by Craig Gentry in 2009 [7]. FHE schemes support any number of mathematical operations to be performed on the data without any accuracy degradation in the decrypted result. Many other FHE schemes have been proposed, and they've become more streamlined and lightweight with regard to computational efficiency [5]. Most of these schemes support basic arithmetic operations to be performed on numerical data [8,9].

FHE can be briefly outlined mathematically as follows. Let $(P, C, K, A_E, A_D)$ be an encryption scheme where $A_E$ and $A_D$ are the encryption and decryption algorithms, and $P$, $C$, and $K$ are the plaintext space, ciphertext space, and key space, respectively. Assuming the plaintext and ciphertext each form a ring, that is, $(P, \oplus_p, \otimes_p)$ and $(C, \oplus_c, \otimes_c)$, the encryption algorithm $A_E$ is a map from the ring $P$ to the ring $C$ in the form of $A_{E_k} : P \to C$, where $k$ is either a secret key or a public key. Then, $\forall a, b \in P$ and $\forall k \in K$, if the following statements hold true, the encryption scheme is said to be fully homomorphic:

• $A_{E_k}(a) \oplus_c A_{E_k}(b) = A_{E_k}(a \oplus_p b)$
• $A_{E_k}(a) \otimes_c A_{E_k}(b) = A_{E_k}(a \otimes_p b)$

A major disadvantage of most of the proposed FHE schemes is high computational complexity. For example, Gentry commented that if Google searched the Web using the FHE scheme he proposed, it would multiply the computational cost by a factor of a trillion. Although later FHE versions reduced this complexity and the accompanying overheads [5,9], it still remains considerable, posing a significant barrier for practical adoption.

However, in the context of cloud-integrated IoT, there exists the unique possibility of adapting FHE in a distributed data processing framework for numerical sensor data analytics. A key advantage in this context is the availability of large computational resources in the cloud for analysis of primarily numerical data that's generated in a distributed manner (for example, WSNs for industrial monitoring in the IoT or patient health monitoring devices in the IoT). Because the data is generated in this way, the analysis can also be performed in a distributed manner by focusing on subsets of the data (for example, data from a single patient or a single WSN node) at a time. These subsets will therefore be natural partitions as determined by their original application environment. Hierarchies of virtual machines in a cloud datacenter can therefore be used to perform tasks, such as data clustering for detecting sensor data anomalies or patient-based diagnosis, with each virtual machine focusing on a small subset of data. This will cut down on the computational complexity of the complete process, because the analytical tasks over the encrypted data are distributed over many nodes through a pool of shared cloud resources. Although distributing the analytic tasks will increase communication overhead, this overhead will be relatively small compared to the available memory and bandwidth resources of modern cloud datacenters and manageable within the application context.

Another area where homomorphic encryption can be used in IoT is within relatively less complex analytical tasks that don't require near-real-time analysis. A key example is the enabling of cloud-based billing services for smart grid environments. The required services in this case only involve simple operations, such as aggregation (addition) and multiplication, that can be performed without much overhead by using a less complex FHE scheme, such as that proposed by Josep Domingo-Ferrer [8]. Therefore, cloud-based analytical services for IoT applications in the domains of e-health, industrial monitoring, and smart grids naturally support the use of FHE schemes because they

• primarily generate numerical data for analysis,
• generate data in a distributed manner and the data can easily be adapted in a distributed process for analysis,
• communicate data as it's sensed for faster analytic times, and
• require analytical tasks, such as billing, that are less complex to implement and less time-critical.
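
The two conditions that define a fully homomorphic scheme can be made concrete with a scheme that satisfies only one of them. The sketch below uses unpadded textbook RSA, one of the multiplicatively homomorphic schemes mentioned earlier, with deliberately tiny and insecure parameters; the primes 61 and 53, the exponent 17, and the function names are illustrative assumptions. It checks that multiplying two ciphertexts yields a ciphertext of the product, which corresponds to the ⊗ condition only. No analogous ciphertext operation gives the sum, which is why RSA on its own is not fully homomorphic.

```python
# Toy, unpadded ("textbook") RSA. Insecure by design: tiny primes, no padding.
# All parameter values below are illustrative assumptions.
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime to phi
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < n."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Decrypt a ciphertext back to an integer modulo n."""
    return pow(c, d, n)


a, b = 7, 12

# Multiply the two ciphertexts without ever decrypting them.
c_product = (encrypt(a) * encrypt(b)) % n

# Decrypting the combined ciphertext gives the product of the plaintexts (mod n).
product_plain = decrypt(c_product)
assert product_plain == (a * b) % n
```

The result is only correct while the product stays below the modulus, which is one reason practical schemes use far larger parameters than this sketch.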


[Figure 2 diagram: consumers with smart meters, the energy supplier (power supply), the grid operator, and company services hosted in the cloud. Billing: aggregate data, generate bills. Data analytics: usage average, statistical consumption information. Both services use addition, subtraction, multiplication, and division.]

Figure 2. Secure cloud data analytics for smart grid. Billing services, on-demand usage monitoring, and consumption information statistics for IoT smart grid applications involve relatively simple mathematical operations and can be implemented securely using lightweight homomorphic encryption such as an additive and multiplicative privacy mechanism [8].
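
As a rough illustration of the billing flow in Figure 2, the sketch below aggregates encrypted meter readings and applies a tariff without decrypting anything on the cloud side. It substitutes a toy Paillier cryptosystem for the scheme cited in the caption, simply because Paillier's additive homomorphism is compact to write down. The helper names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`, `scale_encrypted`), the primes, the readings, and the integer tariff are all hypothetical. Because Paillier is only additively homomorphic, the tariff is applied as a plaintext scalar rather than as an encrypted value.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two primes (far too small for real use)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)   # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt an integer 0 <= m < n under public modulus n with fresh randomness."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1        # random value in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, priv, c: int) -> int:
    """Recover the plaintext using the private values (lam, mu)."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Ciphertext of (m1 + m2): multiply the ciphertexts modulo n^2."""
    return (c1 * c2) % (n * n)


def scale_encrypted(n: int, c: int, k: int) -> int:
    """Ciphertext of (k * m) for a plaintext scalar k: exponentiate modulo n^2."""
    return pow(c, k, n * n)


n, priv = keygen(1009, 1013)                      # hypothetical toy primes
readings = [120, 340, 95]                         # hypothetical usage per interval
ciphertexts = [encrypt(n, r) for r in readings]   # done on the smart meter

# Cloud side: aggregate and price the usage without seeing any plaintext.
total_c = ciphertexts[0]
for c in ciphertexts[1:]:
    total_c = add_encrypted(n, total_c, c)
bill_c = scale_encrypted(n, total_c, 3)           # hypothetical tariff of 3 per unit

# Key holder side: decrypt the aggregate and the bill.
total_usage = decrypt(n, priv, total_c)
bill = decrypt(n, priv, bill_c)
```

A scheme that also supports multiplication between ciphertexts would allow the tariff itself to stay encrypted, at the cost of a more involved implementation.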

Here, we introduce three analytical services that can be implemented securely using FHE in the cloud for different IoT applications and propose rudimentary frameworks and technological solutions for performing analytical tasks securely and dependably in the cloud.

Secure Cloud-Based Billing and Consumption Monitoring for Smart Grid

Service provisioning in smart grid systems is a viable application for secure cloud integration using an FHE-based security model. Smart grid is an area under IoT where the stated goals of smart functionality through ubiquitous sensing and communication are set to revolutionize the power industry. It's also one of the earliest domains where that functionality is being implemented with the widespread adoption of smart meters. The global smart grid data volume is set to increase exponentially from 10,780 terabytes in 2010 to 75,200 terabytes in 2015 [10]. Within this context, cloud computing can serve as a global infrastructure to implement client services such as consumption monitoring and billing. However, as discussed previously, the data's security and integrity should be preserved when performing data analytic tasks on the public cloud infrastructure. Compromise in the form of security attacks and unauthorized data exposure can lead to energy disruptions and/or energy theft. In addition, specific information about customer behavior can be inferred through close analysis of electricity usage monitoring and can reveal industrial clients' proprietary information or the personal behaviors and habits of individual clients.

Therefore, the implementation of electronic billing and consumption monitoring services on the cloud must be accompanied by strong encryption models that don't reveal the data to third parties, while still allowing the necessary operations to be performed. Specific actions required to implement these services include simple operations such as aggregation, addition, subtraction, and multiplication. Therefore, a lightweight FHE scheme [8] can be used as opposed to a more computationally expensive model [7].

Figure 2 gives a general overview of the entities involved, communication pathways, and tasks to be performed in the cloud with the necessary mathematical operations performed on encrypted data.

We propose using Domingo-Ferrer's FHE scheme for this application [8]. This symmetric-key encryption scheme performs the basic arithmetic operations of addition, subtraction, and multiplication in a secure and efficient manner, and is proven to be secure against different ciphertext-based threats. Application of the scheme produces a tuple of encrypted values for each plaintext value that can be used to perform multiple operations with other tuples of encrypted data. The mathematical operations can be carried out componentwise between terms with the same degree. The probability of any two keys decrypting a random ciphertext into the original plaintext is given by $O((\log m)/m)$, where $m$ is a public parameter that should be as large as possible. Therefore, the probability of unauthorized decryption is kept low, while the probability of any two keys yielding different plaintext for the same ciphertext is kept high by the use of subset key pairs that are kept at a high value.

The proposed framework and the tasks required to achieve secure smart grid billing and consumption monitoring in the cloud can be summarized as follows:

• Smart meters encrypt the usage data using Domingo-Ferrer's FHE scheme [8] and send it to the cloud in fixed time intervals.
• Consumer identity is used as a unique identifier to aggregate the consumption information of a single client.
• Consumers and energy suppliers communicate with cloud-based services through secure channels (such as virtual private networks).
• Public cloud facilities aggregate the encrypted data and perform operations as required to provide services.
• Arithmetic operations are performed on the encrypted data for specific services over specified time periods (for example, addition and multiplication for billing).
• Consumption averages can be calculated by multiplying aggregated total usage with predefined values for different time periods.
• Usage comparisons can also be performed with secure subtraction operations through FHE.
• Access control mechanisms can be used to access records for different clients (such as consumers or energy suppliers) based on predefined privileges with a policy for least information exposure.

Therefore, an FHE-based secure cloud analytics service can be achieved through the adoption of a lightweight FHE scheme for smart grid data. The lightweight tasks that are involved in this context make the application an ideal candidate for IoT integration with cloud services in a secure manner.

Secure Cloud-Based Sensor Data Anomaly Detection

Sensor data anomaly detection is another application that's compatible with secure cloud integration using FHE models. Although many solutions to secure WSNs exist [11,12], we focus exclusively on securing sensed data through FHE with regard to facilitating service provisioning over the cloud. IoT applications in a variety of domains gather large volumes of industrial- or environmental/meteorological-monitoring data through ubiquitous sensing. Most of these applications use large-scale WSNs consisting of small nodes collaborating to sense and communicate data. A key analytics application common to these implementations is effective anomaly detection. Data anomalies should be determined dynamically through an efficient and accurate process to identify events of interest, detect attacks, and ensure the integrity of the sensed data for decision support. This mandates adapting different unsupervised learning models, which are typically of high computational complexity. However, given the nature of a WSN deployment, the data is generated in a distributed manner, supporting the viability of having a distributed anomaly-detection process over a distributed data-processing framework. Such a distributed anomaly-detection model will lessen the amount of data processed at a given time, reducing computational complexity and thereby enabling the use of an FHE scheme. Therefore, fully homomorphic encryption schemes with higher capabilities and support for most mathematical operations can be adapted over a distributed set of cloud resources that cater to the provisioning of the required anomaly-detection service.

We propose a distributed anomaly-detection model based on fuzzy data modeling [13] as uniquely suitable for implementing secure cloud-based analytics as a service complemented with FHE. Elsewhere, fuzzy c-means clustering is adapted for iterative anomaly detection over a hierarchical data-processing framework that distributes the main data analytic tasks to composition nodes reflecting the WSN environment [13]. The key factors contributing to the viability of an approach based on this earlier work [13] for FHE-based cloud adaptability are:

• It consists of a hierarchical topology of distributed data-processing nodes that can be replicated in virtual machines on a cloud datacenter for efficient analytics for encrypted data.
• It provides an unsupervised means of anomaly detection without requiring prior knowledge while being adaptable and scalable in a dynamic manner.
• The granular analysis of a distributed process is naturally suitable for the application domain of discrete sensor nodes that collect data in WSN-based IoT applications.
• The operations in the clustering process can be broken down into simple mathematical operations that are straightforward to implement on encrypted data.
• It supports efficient application of FHE operations owing to the distributed processing of subsets of data over a hierarchy.
• Each processing node in the model deals with a small subset of data, reducing computational costs.
• The iterative anomaly-detection process further reduces communication and computational overhead.

Figure 3 gives a graphical overview of the proposed model for sensor data anomaly detection in the cloud using the distributed data clustering approach. Figure 4 gives a more detailed view, including the different tasks performed on different granular levels in a hierarchical topology. Anomalies are evaluated at different levels in providing more fine-grained detection capability to the application concerned. Therefore, we can adapt the earlier work [13] to accommodate homomorphic encryption over a distributed data-processing model in the cloud without changing the underlying data analytic model.

[Figure 3 diagram: a network of sensors in the Internet of Things (industrial, environmental, and healthcare monitoring); a hierarchical network topology for distributed anomaly detection; and the same hierarchical topology replicated in virtual machines (VMs) for distributed data processing.]

Figure 3. Anomaly-detection services for wireless sensor network-based IoT monitoring. Large-scale WSNs can be arranged on a hierarchical topology and detection of anomalies performed over different network tiers. Unsupervised anomaly-detection methods can then be employed in a distributed manner over subsets of data to reduce complexity of FHE-based approaches by replicating the hierarchical topology on cloud virtual machines.

Having a distributed data clustering approach based on fuzzy c-means [13] for anomaly detection requires a more powerful FHE scheme than the scheme proposed for smart grid applications. A scheme such as Domingo-Ferrer's, though lightweight, isn't applicable because it has limited capabilities and offers only limited support for different mathematical operations. Therefore, we need a more complex but capable scheme that leverages a distributed data-processing approach [13] and the significant computational power available in cloud facilities. We propose using the functions available in the HElib library (https://github.com/shaih/HElib) in the implementation of our distributed anomaly-detection model for analyzing encrypted data as a service in the cloud.

HElib is a functional software implementation of homomorphic encryption; however, it only supports operations on encrypted integers. To overcome this limitation, we use the IEEE Standard for Floating-Point Arithmetic (IEEE 754) [14] to perform floating-point computation domain operations in an integer computation domain, making the required data analysis possible with HElib. Therefore, floating-point arithmetic can be performed in a homomorphic
manner by converting the representation of floating-point numbers to their integer representation based on the IEEE 754 standard. We performed experiments to validate this approach; Table 1 gives some preliminary results. The table shows the accuracy over different floating-point precision levels with the associated execution time in seconds for a multiplication operation using HElib functions based on the described approach. We used a multiplication approach as most data analytic operations can be reconstituted as a combination of addition and multiplication operations, with multiplication being the more complex to implement and execute in the current context.

Thus, the HElib library is a feasible measure to encrypt the data and perform cloud-based anomaly detection for IoT sensor network applications. The high overheads can be managed by employing cloud resources in a distributed manner over different subsets of the data.

[Figure 4 diagram: a three-level hierarchy of virtual machines (VMs) in a cloud datacenter. Local level analysis (four VMs): local data clustering, local anomaly detection. Secondary level analysis (two VMs): secondary data clustering and refining, secondary anomaly detection. Final level analysis (one VM): final data clustering, final anomaly detection.]

Figure 4. Hierarchical framework for distributed anomaly detection. Different analytic tasks such as clustering and detection with discrete operations are performed in a distributed manner at different levels over subsets of data.

Table 1. Accuracy and execution time variation for a floating-point multiplication operation.

Value representation   Maximum number of operations supported          Execution time (s)
X.X                    Up to 10 multiplications with accuracy X.X      16.778125
X.XX                   Up to 10 multiplications with accuracy X.XX     16.650898
X.XXX                  Up to 8 multiplications with accuracy X.XXX     13.719173
X.XXXX                 Up to 8 multiplications with accuracy X.XXXX    13.009210
X.XXXXX                Up to 6 multiplications with accuracy X.XXXXX   9.361198

Secure Patient Classification and Diagnosis for E-Health Systems

Cloud-assisted patient classification and abnormality diagnosis in electronic health records (EHRs) is our third key IoT application domain of focus. The availability of an effective system to detect abnormalities in patient health monitoring records as well as to classify patients based on the shared similarity of their health records is extremely attractive for many applications. Such a system will help healthcare professionals make better decisions regarding patient health and will improve patient well-being through quick detection of abnormalities that can be referred to doctors and other medical professionals. The IoT vision for e-health and assisted living environments includes modern body sensor networks that constantly monitor different physiological parameters of people with chronic illnesses [1,15]. This data is typically stored in a private database that medical professionals can refer to. Secure cloud integration for these applications will significantly enhance the provision of a more efficient service as well as more accurate diagnosis and abnormality detection through an automated process implemented over cloud resources.

Given that the EHRs contain sensitive information, exposure of this data to any third party other
than the patient and relevant medical professional must be severely restricted. Therefore, the data must be encrypted prior to analysis by a third party such as a CSP.

Clustering algorithms are useful in classifying patients into groups based on EHR similarity. We propose using the fuzzy c-means algorithm described earlier [13]. Several features make this algorithm desirable in the current context:

• It provides fully unsupervised classification.
• It provides a soft partitioning rather than a hard (fixed) partitioning of the data, which can then be reviewed by a medical professional. (Importantly, no final decision is made and a matrix of scores is given stating the membership of a particular patient to a given cluster.)
• It’s scalable and adaptive, so it can be implemented for a large number of patients using distributed cloud resources.
• It includes an iterative process with simple basic mathematical operations that can be implemented in a homomorphic manner.

The fuzzy clustering of a multidimensional data space such as is available in EHR datasets can be explained as follows [13]. For a set of observations $X = [X_1, X_2, \ldots, X_n]$, where each data point $X_i$ is an m-dimensional observation, and $X_i = (x_{i1}, x_{i2}, \ldots, x_{im})$, a group of fuzzy clusters $F_1, F_2, \ldots, F_k$ is a subset of all possible fuzzy subsets of X where

• The weights for a particular data point sum to 1; that is,

$$\sum_{j=1}^{k} w_{ij} = 1.$$

• Each cluster has at least one data point with nonzero weight, and no cluster holds all data points with a weight of 1. Therefore,

$$0 < \sum_{i=1}^{n} w_{ij} < n.$$

In this context, the fuzzy c-means algorithm minimizes an objective function (denoted $O_m$), which is the weighted sum of squared errors,

$$O_m(U, F; X) = \sum_{j=1}^{k} \sum_{i=1}^{n} w_{ij}^{m} \, \lVert X_i - c_j \rVert_A^2, \quad 1 < m < \infty,$$

where $C = (c_1, c_2, \ldots, c_k)$ is a vector of unknown cluster prototypes and each $c_j$ lies in the same m-dimensional space as the observations. (The exponent m in $O_m$ is the fuzziness parameter, not the observation dimension.) Here $w_{ij}$ is the membership degree for data point $X_i$ in the jth cluster. The norm $\lVert \cdot \rVert_A$ defines the similarity measure between a particular data point and the cluster center. The Euclidean distance can be chosen as the similarity measure because it provides an effective similarity score with comparatively low computational complexity. Therefore, a fuzzy partitioning of X is derived by the representing matrix $U = [w_{ij}]$. More details are available elsewhere [13].

We can use the fuzzy clustering-based anomaly-detection procedure in the application of medical diagnosis through the detection of abnormalities in observation data, but without the distributed aspects. However, even without the process being distributed, the involved datasets will be small enough to keep computational complexity low because abnormality detection is performed on a per-person basis. That is, at any given instance only the data of a particular person will be subject to the clustering and anomaly-detection procedure. Anomalies can be evaluated by performing the same set of relevant homomorphic operations used for sensor data anomaly detection on the encrypted EHRs that will be sent to the CSP. Given the similarity of the process, we suggest using the HElib FHE system to encrypt the data and perform the analytic tasks in cloud datacenters. Figure 5 gives an overview of the process with the relevant tasks and accompanying operations to be performed in a homomorphic manner.

[Figure 5 diagram. Labels: Community health monitoring; Network-based community electronic health records; Services: Retrieve and manage electronic health records; Patient classification (identification, classification; addition, subtraction, multiplication, division); Diagnosis (abnormality detection; addition, subtraction, multiplication, division); Diagnosis service for healthcare professionals.]

Figure 5. Abnormality detection and patient classification in electronic health records. Body sensor networks and other health monitoring IoT applications such as assisted living environments generate large volumes of sensitive, private health records that should be processed without any third-party exposure. Tasks such as patient classification and abnormality detection can be performed with fully homomorphic encryption and data clustering models which involve basic mathematical operations.

Secure and efficient data-processing frameworks are vital for cloud-assisted IoT applications. The large volumes of generated data in such applications mandate the use of large computational and storage resources for effective analytical tasks in the provisioning of services. In this context, the application of fully homomorphic encryption schemes capable of performing analytic tasks on ciphertext is of vital importance.

The frameworks we’ve proposed can set the path and open avenues for further research on the nature of the data-processing models, the composition of analytics tasks, and the criteria for encryption schemes that will be vital to achieve effective service provisioning of cloud-assisted IoT applications.

References
1. L. Atzori, A. Iera, and G. Morabito, “The Internet of Things: A Survey,” Computer Networks, vol. 54, no. 15, 2010, pp. 2787–2805.
2. R. Roman, J. Zhou, and J. Lopez, “On the Features and Challenges of Security and Privacy in Distributed Internet of Things,” Computer Networks, vol. 57, no. 10, 2013, pp. 2266–2279.
3. IHS Technology, “Cloud-Related Spending by Businesses to Triple from 2011 to 2017,” IHS, news release, 14 Feb. 2014; http://press.ihs.com/press-release/design-supply-chain/cloud-related-spending-businesses-triple-2011-2017.
4. A. Botta et al., “Integration of Cloud Computing and Internet of Things: A Survey,” Future Generation Computer Systems, vol. 56, Mar. 2016, pp. 684–700.
5. C. Fontaine and F. Galand, “A Survey of Homomorphic Encryption for Nonspecialists,” EURASIP J. Information Security, Jan. 2007, article 15; http://dx.doi.org/10.1155/2007/13801.
6. R.L. Rivest, A. Shamir, and L. Adleman, “A Method for Obtaining Digital Signatures and Public-Key Cryptosystems,” Comm. ACM, vol. 21, no. 2, 1978, pp. 120–126.
7. C. Gentry, “Fully Homomorphic Encryption Using Ideal Lattices,” Proc. 41st Ann. ACM Symp. Theory of Computing, 2009, pp. 169–178.
8. J. Domingo-Ferrer, “A Provably Secure Additive and Multiplicative Privacy Homomorphism*,” Information Security, LNCS 2433, Springer, 2002, pp. 471–483.
9. C. Gentry, “Computing Arbitrary Functions of Encrypted Data,” Comm. ACM, vol. 53, no. 3, 2010, pp. 97–105.
10. S. Sudip, “Smart Grid Data Analytics Market to Triple by 2022,” Transparency Market Research, press release, 30 Nov. 2015; www.transparencymarketresearch.com/pressrelease/smart-grid-data-analytics.htm.
11. B. Tian et al., “Self-Healing Key Distribution Schemes for Wireless Networks: A Survey,” Computer J., vol. 54, no. 4, 2011, pp. 549–569.
12. B. Tian et al., “A Mutual-Healing Key Distribution Scheme in Wireless Sensor Networks,” J. Network and Computer Applications, vol. 34, no. 1, 2011, pp. 80–88.
13. H. Kumarage et al., “Distributed Anomaly Detection for Industrial Wireless Sensor Networks Based on Fuzzy Data Modelling,” J. Parallel and Distributed Computing, vol. 73, no. 6, 2013, pp. 790–806.
14. IEEE Std. 754-2008, Floating-Point Arithmetic, IEEE, 2008.
15. A.R.M. Forkan et al., “A Context-Aware Approach for Long-Term Behavioural Change Detection and Abnormality Prediction in Ambient Assisted Living,” Pattern Recognition, vol. 48, no. 3, 2015, pp. 628–641.
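As an illustration of the fuzzy c-means formulation in the e-health section, the following plaintext sketch alternates the two standard update steps using only addition, subtraction, multiplication, and division. It is not the authors' implementation and performs no encryption: the function names, the sample readings, the choice of fuzziness parameter m = 2, and the use of the squared Euclidean distance are all assumptions made for this example. Under an FHE scheme each arithmetic step would have to be replaced by its ciphertext counterpart, which the sketch does not attempt.

```python
import random


def sq_dist(p, c):
    """Squared Euclidean distance between a data point and a cluster center."""
    return sum((a - b) * (a - b) for a, b in zip(p, c))


def update_centers(points, U, k):
    """c_j = sum_i w_ij^2 * X_i / sum_i w_ij^2 (assumes fuzziness m = 2)."""
    dim = len(points[0])
    centers = []
    for j in range(k):
        weights = [row[j] * row[j] for row in U]
        total = sum(weights)
        centers.append([
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ])
    return centers


def update_memberships(points, centers, eps=1e-12):
    """w_ij = (1 / d_ij^2) / sum_l (1 / d_il^2), so every row sums to 1."""
    U = []
    for p in points:
        # eps avoids division by zero when a point coincides with a center.
        inv = [1.0 / (sq_dist(p, c) + eps) for c in centers]
        total = sum(inv)
        U.append([v / total for v in inv])
    return U


def objective(points, U, centers):
    """Weighted sum of squared errors O_m for m = 2."""
    return sum(
        U[i][j] * U[i][j] * sq_dist(p, c)
        for i, p in enumerate(points)
        for j, c in enumerate(centers)
    )


def fuzzy_c_means(points, k, iters=50, seed=0):
    """Return (U, centers), where U[i][j] is the membership of point i in cluster j."""
    rng = random.Random(seed)
    # Random initial memberships, normalised so that every row sums to 1.
    U = []
    for _ in points:
        row = [rng.random() + 1e-3 for _ in range(k)]
        s = sum(row)
        U.append([v / s for v in row])
    for _ in range(iters):
        centers = update_centers(points, U, k)
        U = update_memberships(points, centers)
    return U, centers


# Hypothetical two-feature readings forming two loose groups.
readings = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
memberships, prototypes = fuzzy_c_means(readings, k=2)
for point, row in zip(readings, memberships):
    print(point, [round(w, 3) for w in row])
print("objective:", objective(readings, memberships, prototypes))
```

Each printed row is one line of the membership matrix U, which is the soft partitioning a medical professional would review. Fixing m = 2 keeps the membership update free of fractional powers, which is one plausible way to stay within the four basic operations the article relies on.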

Heshan Kumarage is a research associate in the School of Computer Science at RMIT University, Melbourne, Australia. His research interests include distributed systems, data mining for network security, information theory, and data science. Kumarage has a PhD in computer science from RMIT University. Contact him at [email protected].

Ibrahim Khalil is an associate professor in the School of Computer Science and Information Technology at RMIT University, Melbourne, Australia. His research interests include anonymous networks, quality of service, wireless sensor networks, and remote healthcare. Khalil has a PhD from the University of Berne, Switzerland. Contact him at ibrahim.khalil@rmit.edu.au.

Abdulatif Alabdulatif is a PhD student in the School of Computer Science and Information Technology at RMIT University, Melbourne, Australia. His research interests include cryptography techniques, distributed systems and networks, data mining, and cloud computing. Alabdulatif has a master’s degree in computer science from RMIT University. Contact him at [email protected].

Zahir Tari is a professor in the School of Computer Science at RMIT University, Melbourne, Australia. His research interests include core aspects of large-scale distributed systems, such as performance, security and reliability. Tari has a PhD in artificial intelligence from the University of Grenoble, France. Contact him at [email protected].

Xun Yi is a professor in the School of Computer Science at RMIT University, Australia, where he’s a member of the Cyberspace and Security Group. His research interests include privacy protection, cloud security, privacy preserving data mining, and applied cryptography. Yi has a PhD in electronic engineering from Xidian University. Contact him at [email protected].

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.
