ICM Module 2
Data, social, and mobile technologies. This module also focuses on cloud computing and its
essential characteristics, cloud service models, and cloud deployment models. It then covers
Big Data analytics, social networking, and mobile computing. Lastly, it covers the key
characteristics of third platform infrastructure and the key imperatives for transforming to the
third platform.
Copyright 2015 EMC Corporation. All rights reserved. Module 2: Third Platform Technologies 1
This lesson covers the definition of cloud computing and the essential characteristics of cloud
computing. This lesson also covers cloud service models and cloud deployment models.
The National Institute of Standards and Technology (NIST), a part of the U.S. Department of
Commerce, in its Special Publication 800-145 defines cloud computing as “a model for enabling
ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources
(e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and
released with minimal management effort or service provider interaction.”
The term “cloud” originates from the cloud-like bubble that is commonly used in technical
architecture diagrams to represent a system, such as the Internet, a network, or a compute
cluster. In cloud computing, a cloud is a collection of IT resources, including hardware and
software resources, that is deployed either in a single data center or across multiple
geographically dispersed data centers that are connected over a network. A cloud infrastructure is
built, operated, and managed by a cloud service provider. The cloud computing model enables
consumers to hire IT resources as a service from a provider. A cloud service is a combination of
hardware and software resources that are offered for consumption by a provider. The cloud
infrastructure contains IT resource pools, from which resources are provisioned to consumers as
services over a network, such as the Internet or an intranet. Resources return to the pool when
released by consumers.
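The provision-and-release cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any provider's implementation: the class name ResourcePool, its methods, the tenant names, and the capacity figures are all hypothetical.

```python
class ResourcePool:
    """Toy model of a cloud resource pool (hypothetical, for illustration only)."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb   # total storage held in the pool
        self.allocations = {}            # consumer name -> GB currently provisioned

    def available_gb(self):
        """Capacity that is not currently provisioned to any consumer."""
        return self.capacity_gb - sum(self.allocations.values())

    def provision(self, consumer, size_gb):
        """Hand out capacity from the pool if enough is free."""
        if size_gb > self.available_gb():
            raise ValueError("insufficient capacity in pool")
        self.allocations[consumer] = self.allocations.get(consumer, 0) + size_gb

    def release(self, consumer):
        """Return everything a consumer holds to the pool."""
        return self.allocations.pop(consumer, 0)


pool = ResourcePool(capacity_gb=1000)
pool.provision("tenant-a", 300)
pool.provision("tenant-b", 500)
free_while_in_use = pool.available_gb()    # capacity left while both tenants hold resources
pool.release("tenant-a")
free_after_release = pool.available_gb()   # tenant-a's capacity is back in the pool
```

The point of the sketch is that released capacity becomes available to other consumers immediately, which is what allows a provider to serve many consumers from one shared pool.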
The cloud model is similar to utility services such as electricity, water, and telephone. When
consumers use these utilities, they are typically unaware of how the utilities are generated or
distributed. The consumers periodically pay for the utilities based on usage. Similarly, in cloud
computing, the cloud is an abstraction of an IT infrastructure. Consumers simply hire IT resources
as services from the cloud without the risks and costs associated with owning the resources.
Cloud services are accessed from different types of client devices over wired and wireless network
connections. Consumers pay only for the services that they use, either based on a subscription or
based on resource consumption.
When organizations use cloud services, their IT infrastructure management tasks are reduced to
managing only those resources that are required to access the cloud services. The cloud
infrastructure is managed by the provider, and tasks such as software updates and renewals are
also handled by the provider.
The figure on the slide illustrates a generic cloud computing environment. The cloud provides
various types of hardware and software services that are accessed by consumers from different
types of client devices over wired and wireless network connections. The figure includes some
virtual components for relevance and accuracy. Virtualization will be introduced in Module 3, ‘Data
Center Environment’ and covered in detail in relevant sections of the later modules.
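In SP 800-145, NIST specifies that a cloud infrastructure should have the five essential
characteristics described below:
• On-demand self-service: “A consumer can unilaterally provision computing capabilities, such as
server time and network storage, as needed automatically without requiring human interaction
with each service provider.” – NIST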
• Broad network access: “Capabilities are available over the network and accessed through
standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g.,
mobile phones, tablets, laptops, and workstations).” – NIST
• Resource pooling: “The provider’s computing resources are pooled to serve multiple
consumers using a multi-tenant model, with different physical and virtual resources
dynamically assigned and reassigned according to consumer demand. There is a sense of
location independence in that the customer generally has no control or knowledge over the
exact location of the provided resources but may be able to specify location at a higher level of
abstraction (e.g., country, state, or datacenter). Examples of resources include storage,
processing, memory, and network bandwidth.” – NIST
• Rapid elasticity: “Capabilities can be rapidly and elastically provisioned, in some cases
automatically, to scale rapidly outward and inward commensurate with demand. To the
consumer, the capabilities available for provisioning often appear to be unlimited and can be
appropriated in any quantity at any time.” – NIST
• Measured service: “Cloud systems automatically control and optimize resource use by
leveraging a metering capability at some level of abstraction appropriate to the type of service
(e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be
monitored, controlled, and reported, providing transparency for both the provider and
consumer of the utilized service.” – NIST
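The measured service characteristic above can be illustrated with a short Python sketch that meters usage per consumer and converts it into a charge. The resource names, unit prices, and tenant names are assumptions made up for the example; real providers define their own metering units and rates.

```python
from collections import defaultdict

# Hypothetical unit prices; these are illustrative values, not real rates.
RATES = {"cpu_hours": 0.05, "egress_gb": 0.08}


def meter(usage_records):
    """Sum metered usage per consumer and per resource type."""
    totals = defaultdict(lambda: defaultdict(float))
    for consumer, resource, quantity in usage_records:
        totals[consumer][resource] += quantity
    return totals


def bill(totals):
    """Convert metered totals into a charge per consumer."""
    return {
        consumer: round(sum(RATES[res] * qty for res, qty in usage.items()), 2)
        for consumer, usage in totals.items()
    }


records = [
    ("tenant-a", "cpu_hours", 100),
    ("tenant-a", "egress_gb", 50),
    ("tenant-b", "cpu_hours", 10),
    ("tenant-a", "cpu_hours", 20),
]
charges = bill(meter(records))
```

Because the same metered totals drive both the provider's bill and the consumer's usage report, both parties see the same figures, which is the transparency the NIST text refers to.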
A cloud service model specifies the services and the capabilities that are provided to consumers.
In SP 800-145, NIST classifies cloud service offerings into the three primary models listed below:
• Infrastructure as a Service (IaaS)
• Platform as a Service (PaaS)
• Software as a Service (SaaS)
Cloud administrators or architects assess and identify potential cloud service offerings. The
assessment includes evaluating the services to be created and upgraded, the necessary feature
set for each service, and the service level objectives (SLOs) of each service aligned to consumer
needs and market conditions. SLOs are specific measurable characteristics such as availability,
throughput, frequency, and response time. They provide a measurement of performance of the
service provider. SLOs are key elements of a service level agreement (SLA), which is a legal
document that describes items such as what service level will be provided, how it will be
supported, service location, and the responsibilities of the consumer and the provider.
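As a worked example of an SLO, the Python sketch below checks a measured availability figure against a target. The 99.9% target, the 30-day measurement window, and the 50 minutes of downtime are assumed values chosen only for illustration; an actual SLA defines its own targets and measurement rules.

```python
def availability_pct(total_minutes, downtime_minutes):
    """Availability as the percentage of the period during which the service was up."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def meets_slo(measured_pct, target_pct):
    """True when the measured value satisfies the service level objective."""
    return measured_pct >= target_pct


minutes_in_30_days = 30 * 24 * 60   # 43,200 minutes in the measurement window
measured = availability_pct(minutes_in_30_days, downtime_minutes=50)
slo_met = meets_slo(measured, target_pct=99.9)

# Downtime budget implied by the target over the same window.
allowed_downtime = minutes_in_30_days * (100.0 - 99.9) / 100.0
```

With these numbers the downtime budget is about 43 minutes, so 50 minutes of downtime misses the objective even though the measured availability is still above 99.8%.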
Note: Many alternate cloud service models based on IaaS, PaaS, and SaaS are defined in various
publications and by different industry groups. These service models are specific to the cloud
services and capabilities that are provided. Examples of such service models are Backup as a
Service (BaaS), Desktop as a Service (DaaS), Test Environment as a Service (TEaaS), and
Disaster Recovery as a Service (DRaaS). However, each of these models ultimately falls under one of the
three primary cloud service models.
Infrastructure as a Service: “The capability provided to the consumer is to provision
processing, storage, networks, and other fundamental computing resources where the
consumer is able to deploy and run arbitrary software, which can include operating
systems and applications. The consumer does not manage or control the underlying
cloud infrastructure but has control over operating systems, storage, and deployed
applications; and possibly limited control of select networking components (for example,
host firewalls).” – NIST
IaaS pricing may be subscription-based or based on resource usage. The provider pools the
underlying IT resources and they are typically shared by multiple consumers through a multi-
tenant model. IaaS can even be implemented internally by an organization, with internal IT
managing the resources and services.
Platform as a Service: “The capability provided to the consumer is to deploy onto the
cloud infrastructure consumer-created or acquired applications created using
programming languages, libraries, services, and tools supported by the provider. The
consumer does not manage or control the underlying cloud infrastructure including
network, servers, operating systems, or storage, but has control over the deployed
applications and possibly configuration settings for the application-hosting environment.”
– NIST
In the PaaS model, a cloud service includes compute, storage, and network resources along with
platform software. Platform software includes software such as OS, database, programming
frameworks, middleware, and tools to develop, test, deploy, and manage applications.
Most PaaS offerings support multiple operating systems and programming frameworks for
application development and deployment. PaaS usage fees are typically calculated based on
factors such as the number of consumers, the types of consumers (developer, tester, and so on),
the time for which the platform is in use, and the compute, storage, or network resources
consumed by the platform.
Software as a Service: “The capability provided to the consumer is to use the provider’s
applications running on a cloud infrastructure. The applications are accessible from various client
devices through either a thin client interface, such as a web browser (for example, web-based
email), or a program interface. The consumer does not manage or control the underlying cloud
infrastructure including network, servers, operating systems, storage, or even individual
application capabilities, with the possible exception of limited user-specific application
configuration settings.” – NIST
A cloud deployment model provides a basis for how cloud infrastructure is built,
managed, and accessed. In SP 800-145, NIST specifies the four primary cloud
deployment models listed below:
• Public cloud
• Private cloud
• Hybrid cloud
• Community cloud
Each cloud deployment model may be used for any of the cloud service models: IaaS, PaaS, and
SaaS. The different deployment models present a number of tradeoffs in terms of control,
scale, cost, and availability of resources.
Public cloud: “The cloud infrastructure is provisioned for open use by the general public. It may
be owned, managed, and operated by a business, academic, or government organization, or some
combination of them. It exists on the premises of the cloud provider.” – NIST
Private cloud: “The cloud infrastructure is provisioned for exclusive use by a single
organization comprising multiple consumers (for example, business units). It may be
owned, managed, and operated by the organization, a third party, or some combination
of them, and it may exist on or off premises.” – NIST
Many organizations may not wish to adopt public clouds due to concerns related to privacy,
external threats, and lack of control over the IT resources and data. When compared to a public
cloud, a private cloud offers organizations a greater degree of privacy and control over the cloud
infrastructure, applications, and data.
There are two variants of private cloud: on-premise and externally-hosted, as shown in figure 1
and figure 2 respectively on the slide. The on-premise private cloud is deployed by an
organization in its data center within its own premises. In the externally-hosted private
cloud (or off-premise private cloud) model, an organization outsources the implementation of the
private cloud to an external cloud service provider. The cloud infrastructure is hosted on the
premises of the provider and may be shared by multiple tenants. However, the organization’s
private cloud resources are securely separated from other cloud tenants by access policies
implemented by the provider.
Community cloud: “The cloud infrastructure is provisioned for exclusive use by a specific
community of consumers from organizations that have shared concerns (for example, mission,
security requirements, policy, and compliance considerations). It may be owned, managed, and
operated by one or more of the organizations in the community, a third party, or some
combination of them, and it may exist on or off premises.” – NIST
The organizations participating in the community cloud typically share the cost of
deploying the cloud and offering cloud services. This enables them to lower their
individual investments. Since the costs are shared by fewer consumers than in a public
cloud, this option may be more expensive. However, a community cloud may offer a
higher level of control and protection than a public cloud. As with the private cloud, there
are two variants of a community cloud: on-premise and externally-hosted.
In an on-premise community cloud, one or more organizations provide cloud services that are
consumed by the community. The cloud infrastructure is deployed on the premises of the
organizations providing the cloud services. The organizations consuming the cloud services
connect to the community cloud over a secure network. The figure on the slide illustrates an
example of an on-premise community cloud.
In the externally-hosted community cloud model, the organizations of the community outsource
the implementation of the community cloud to an external cloud service provider. The cloud
infrastructure is hosted on the premises of the provider and not within the premises of any of the
participant organizations. The provider manages the cloud infrastructure and facilitates an
exclusive community cloud environment for the organizations.
Hybrid cloud: “The cloud infrastructure is a composition of two or more distinct cloud
infrastructures (private, community, or public) that remain unique entities, but are
bound by standardized or proprietary technology that enables data and application
portability (for example, cloud bursting for load balancing between clouds).” – NIST
The figure on the slide illustrates a hybrid cloud that is composed of an on-premise private cloud
deployed by enterprise P, and a public cloud serving enterprise and individual consumers in
addition to enterprise P.
The hybrid cloud has become the model of choice for many organizations. Some use cases of the
hybrid cloud model are described below.
Cloud bursting: Cloud bursting is a common usage scenario of a hybrid cloud. In cloud bursting,
an organization uses a private cloud for normal workloads, but optionally accesses a public cloud
to meet transient higher workload requirements. For example, an application can get additional
resources from a public cloud for a limited time period to handle a transient surge in workload.
Web application hosting: An organization may use the hybrid cloud model for web application
hosting. The organization may host mission-critical applications on a private cloud, while less
critical applications are hosted on a public cloud. By deploying less critical applications in the
public cloud, an organization can leverage the scalability and cost benefits of the public cloud. For
example, e-commerce applications use public-facing web assets outside the firewall and can be
hosted in the public cloud.
Packaged applications: An organization may also migrate standard packaged applications, such
as email and collaboration software, out of the private cloud to a public cloud. This frees up
internal IT resources for higher value projects and applications.
Application development and testing: An organization may also use the hybrid cloud model for
application development and testing. An application can be tested for scalability and under heavy
workload using public cloud resources, before incurring the capital expense associated with
deploying it in a production environment. Once the organization establishes a steady-state
workload pattern and the longevity of the application, it may choose to bring the application into
the private cloud environment.
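The cloud bursting use case described above can be sketched as a simple placement rule: fill the private cloud first and send only the overflow to the public cloud. The function name place_workload and the capacity units are hypothetical; a real hybrid cloud would also need to account for data placement, network latency, and cost.

```python
def place_workload(demand_units, private_capacity_units):
    """Split demand between the private cloud and public cloud overflow."""
    on_private = min(demand_units, private_capacity_units)
    on_public = demand_units - on_private   # nonzero only during a surge
    return {"private": on_private, "public": on_public}


normal = place_workload(demand_units=80, private_capacity_units=100)
surge = place_workload(demand_units=140, private_capacity_units=100)
```

Under normal load nothing runs in the public cloud; during the surge only the 40 units that exceed private capacity are placed there, and they can be released once the surge passes.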
This lesson covered the definition of cloud computing and the essential characteristics of cloud
computing. This lesson also covered cloud service models and cloud deployment models.
This lesson covers the definition of Big Data and its key characteristics. This lesson also covers
the components of a Big Data analytics solution. Further, this lesson covers some use cases of Big
Data analytics.
Big Data represents the information assets whose high volume, high velocity, and high variety
require the use of new technical architectures and analytical methods to gain insights and for
deriving business value.
The definition of Big Data has three principal aspects: characteristics of data, data processing
needs, and business value.
Characteristics of data: Big Data includes data sets of considerable sizes containing both
structured and non-structured digital data. Apart from its size, the data gets generated and
changes rapidly, and also comes from diverse sources. These and other characteristics are
covered next.
Data processing needs: Big Data also exceeds the storage and processing capability of
conventional IT infrastructure and software systems. It not only needs a highly-scalable
architecture for efficient storage, but also requires new and innovative technologies and methods
for processing. These technologies typically make use of platforms such as distributed processing,
massively-parallel processing, and machine learning. The emerging discipline of Data Science
represents the synthesis of several existing disciplines, such as statistics, mathematics, data
visualization, and computer science for Big Data analytics.
Business value: Big Data analytics has tremendous business importance to organizations.
Searching, aggregating, and cross-referencing large data sets in real-time or near-real time
enables gaining valuable insights from the data. This enables better data-driven decision making.
Apart from the characteristics of volume, velocity, and variety, popularly known as “the 3Vs”,
the three other characteristics of Big Data are variability, veracity, and value.
Volume: The word “Big” in Big Data refers to the massive volumes of data. Organizations are
witnessing an ever-increasing growth in data of all types, such as transaction-based data
stored over the years, sensor data, and unstructured data streaming in from social
media. This growth in data is reaching petabyte and even exabyte scales. The excessive
volume not only requires substantial cost-effective storage, but also gives rise to challenges in
data analysis.
Velocity: Velocity refers to the rate at which data is produced and changes, and also how fast
the data must be processed to meet business requirements. Today, data is generated at an
exceptional speed, and real-time or near-real time analysis of the data is a challenge for
many organizations. It is essential for the data to be processed and analyzed, and the
results to be delivered in a timely manner. An example of such a requirement is real-time
face recognition for screening passengers at airports.
Variety: Variety refers to the diversity in the formats and types of data. Data is generated by
numerous sources in various structured and non-structured forms. Organizations face the
challenge of managing, merging, and analyzing the different varieties of data in a cost-effective
manner. The combination of data from a variety of data sources and in a variety of formats is a
key requirement in Big Data analytics. An example of such a requirement is combining a large
number of changing records of a particular patient with various published medical research to find
the best treatment.
Variability: Variability refers to the constantly changing meaning of data. For example, analysis
of natural language search and social media posts requires interpretation of complex and highly-
variable grammar. The inconsistency in the meaning of data gives rise to challenges related to
gathering the data and in interpreting its context.
Veracity: Veracity refers to the varying quality and reliability of data. The quality of the data being
gathered can differ greatly, and the accuracy of analysis depends on the veracity of the source
data. Establishing trust in Big Data presents a major challenge because as the variety and
number of sources grow, the likelihood of noise and errors in the data increases. Therefore, a
significant effort may go into cleaning data to remove noise and errors, and to produce accurate
data sets before analysis can begin. For example, a retail organization may have gathered
customer behavior data from across systems to analyze product purchase patterns and
to predict purchase intent. The organization would have to clean and transform the data
to make it consistent and reliable.
Value: Value refers to the cost-effectiveness of the Big Data analytics technology used and the
business value derived from it. Many large, enterprise-scale organizations have maintained large
data repositories, such as data warehouses, managed non-structured data, and carried out real-
time data analytics for many years. With hardware and software becoming more affordable and
the emergence of more providers, Big Data analytics technologies are now available to a much
broader market. Organizations are also gaining the benefits of business process enhancements,
increased revenues, and better decision making.
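The cleaning step mentioned under veracity above can be sketched as follows. This is only an illustration of the idea: the field names (customer_id, order_id, amount) and the cleaning rules are assumptions, and real pipelines apply rules specific to their own data sources.

```python
def clean(records):
    """Normalize customer records, dropping unusable entries and duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        customer = (rec.get("customer_id") or "").strip().upper()
        amount = rec.get("amount")
        # Drop records that are noise or errors: no customer, or an invalid amount.
        if not customer or not isinstance(amount, (int, float)) or amount < 0:
            continue
        key = (customer, rec.get("order_id"))
        if key in seen:
            continue   # the same order gathered from more than one system
        seen.add(key)
        cleaned.append(
            {"customer_id": customer, "order_id": rec.get("order_id"), "amount": float(amount)}
        )
    return cleaned


raw = [
    {"customer_id": " c1 ", "order_id": 1, "amount": 20},
    {"customer_id": "C1", "order_id": 1, "amount": 20},      # duplicate of the first record
    {"customer_id": "", "order_id": 2, "amount": 5},         # missing customer
    {"customer_id": "c2", "order_id": 3, "amount": "n/a"},   # amount is not a number
    {"customer_id": "c3", "order_id": 4, "amount": 12.5},
]
consistent = clean(raw)
```

Only two of the five raw records survive, which shows why cleaning effort grows with the number and variety of sources.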
Data for analytics typically comes from repositories such as enterprise data warehouses and data
lakes.
A data warehouse is a central repository of integrated data gathered from multiple different
sources. It stores current and historical data in a structured format. It is designed for query and
analysis to support an organization’s decision making process. For example, a data warehouse
may contain current and historical sales data that is used for generating trend reports for sales
comparisons.
A data lake is a collection of structured and non-structured data assets that are stored as exact or
near-exact copies of the source formats. The data lake architecture is a “store-everything”
approach to Big Data. Unlike conventional data warehouses, data is not classified when it is stored
in the repository, as the value of the data may not be clear at the outset. The data is also not
arranged as per a specific schema and is stored using an object-based storage architecture. As a
result, data preparation is eliminated and a data lake is less structured compared to a data
warehouse. Data is classified, organized, or analyzed only when it is accessed. When a business
need arises, the data lake is queried, and the resultant subset of data is then analyzed to provide
a solution. The purpose of a data lake is to present an unrefined view of data to highly-skilled
analysts, and to enable them to implement their own data refinement and analysis techniques.
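The “store everything, interpret on access” behavior of a data lake can be sketched in Python as shown below. The in-memory list stands in for the object store, and the function names ingest and query_json are hypothetical; the sketch only illustrates that structure is applied when the data is read, not when it is stored.

```python
import json

# Raw records are kept as-is, as near-exact copies of the source format.
lake = []


def ingest(raw_bytes, source):
    """Store the data without classifying it or applying a schema."""
    lake.append({"source": source, "raw": raw_bytes})


def query_json(field):
    """Interpret records only at read time; skip anything that is not a JSON object with the field."""
    results = []
    for item in lake:
        try:
            doc = json.loads(item["raw"])
        except ValueError:
            continue   # not JSON, for example a plain-text log line
        if isinstance(doc, dict) and field in doc:
            results.append(doc[field])
    return results


ingest(b'{"customer": "c1", "amount": 40}', "orders")
ingest(b"2015-03-01 12:00:01 GET /index.html", "weblog")
ingest(b'{"customer": "c2", "amount": 15}', "orders")
amounts = query_json("amount")
```

The web log line is stored alongside the order records even though this particular query cannot use it; a different analysis could later parse it with its own rules.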
The technology layers in a Big Data analytics solution include storage, MapReduce technologies,
and query technologies. These components are collectively called the “SMAQ stack”.
Storage is the foundational layer of the stack, and is characterized by a distributed architecture
with primarily non-structured content in non-relational form.
The intermediate layer consists of MapReduce technologies that enable the distribution of
computation across multiple compute systems for parallel processing. It also supports a batch-
oriented processing model of data retrieval and computation as opposed to the record-set
orientation of most SQL-based databases.
The query layer typically implements a NoSQL database for storing, retrieving, and processing
data. It also provides a user-friendly platform for analytics and reporting.
MapReduce is the driving force behind most Big Data processing solutions. It is a parallel
programming framework for processing large data sets on a compute cluster. The key innovation
of MapReduce is the ability to take a query over a data set, divide it, and run it in parallel over
multiple compute systems or nodes. This distribution solves the issue of processing data that is
too large to be processed by a single machine.
MapReduce works in two phases—“Map” and “Reduce”—as suggested by its name. An input data
set is split into independent chunks which are distributed to multiple compute systems. The Map
function processes the chunks in a completely parallel manner, and transforms them into multiple
smaller intermediate data sets. The Reduce function condenses the intermediate results and
reduces them to a summarized data set, which is the desired end result. Typically both the input
and the output data sets are stored on a file-system. The MapReduce framework is highly scalable
and supports the addition of processing nodes to process chunks. Apache’s Hadoop MapReduce is
the predominant open source Java-based implementation of MapReduce.
The figure on the slide depicts a generic representation of how MapReduce works and can be used
to illustrate various examples. A classic example of MapReduce is the task of counting the number
of occurrences of each unique word in a very large body of data including millions of documents. In the Map phase,
each word is identified and given the count of 1. In the Reduce phase, the counts are added
together for each word. Another example is the task of grouping customer records within a data
set into multiple age groups, such as 20-30, 30-40, 40-50, and so on. In the Map phase, the
records are split and processed in parallel to generate intermediate groups of records. In the
Reduce phase, the intermediate data sets are summarized to obtain the distinct groups of
customer records (depicted by the colored groups).
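The word-count example above can be sketched in plain Python. This is a single-process illustration of the Map and Reduce phases, not Hadoop code: in a real framework the map tasks run in parallel on separate nodes, and the grouping step (called shuffle here, a name not used in the text above) is performed by the framework between the two phases.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the intermediate pairs by key so each word's counts are together."""
    groups = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return groups


def reduce_phase(groups):
    """Add the counts together for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


# Each string stands in for one independent chunk of the input data set.
chunks = ["the cloud stores data", "the data lake stores raw data"]
intermediate = chain.from_iterable(map_phase(chunk) for chunk in chunks)
word_counts = reduce_phase(shuffle(intermediate))
```

Because map_phase looks at one chunk at a time and keeps no shared state, the chunks could be processed on different machines without changing the result.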
MapReduce fetches data sets and stores the results of the computation in storage. The data
must be available in a distributed fashion, to serve each processing node. The design and
features of the storage layer are important not just because of the interface with MapReduce, but
also because they affect the ease with which data can be loaded and the results of computation
extracted and searched.
Distributed file systems such as HDFS typically provide only an interface similar to that of
regular file systems. Unlike a database, they can store and retrieve data but cannot
index it, and indexing is essential for fast data retrieval. To mitigate this and gain the
advantages of a database system, SMAQ solutions may implement a NoSQL database on
top of the distributed file system. NoSQL databases may have built-in MapReduce
features that allow processing to be parallelized over their data stores. In many
applications, the primary source of data is in a relational database. Therefore, SMAQ
solutions may also support the interfacing of MapReduce with relational database
systems.
It is unintuitive and inconvenient to specify MapReduce jobs in terms of distinct Map and
Reduce functions in a programming language. To mitigate this, SMAQ systems
incorporate a higher-level query layer to simplify both the specification of the MapReduce
operations, and the analysis of the results. The query layer implements high-level
languages that enable users to describe, run, and monitor MapReduce jobs. The
languages are designed to handle not only the processing, but also the loading and
saving of data from and to the MapReduce cluster. The languages typically support
integration with NoSQL databases implemented on the MapReduce cluster.
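The idea of a query layer can be sketched as a helper that turns a high-level request into Map and Reduce functions, so the user never writes them by hand. The names run_mapreduce and count_by, and the customer records, are hypothetical and exist only for this illustration; real query languages for MapReduce clusters are far richer.

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn):
    """Minimal single-process stand-in for a MapReduce engine."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {key: reduce_fn(values) for key, values in groups.items()}


def count_by(field):
    """High-level operator: translate 'count records by field' into map and reduce functions."""
    def map_fn(record):
        return [(record[field], 1)]
    return map_fn, sum


customers = [
    {"name": "A", "age_group": "20-30"},
    {"name": "B", "age_group": "30-40"},
    {"name": "C", "age_group": "20-30"},
]
map_fn, reduce_fn = count_by("age_group")
group_sizes = run_mapreduce(customers, map_fn, reduce_fn)
```

The caller states what is wanted (a count per age group) and the helper supplies the Map and Reduce functions, which is the simplification a query layer provides.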
Big Data analytics solutions have created a world of new opportunities for organizations in sectors
such as healthcare, finance, retail, and government.
Finance: In finance, organizations use Big Data analytics for activities such as correlating
purchase history, profiling customers, and analyzing behavior on social networks. This also
helps them control customer acquisition costs and target sales promotions more effectively. Big
Data analytics is also used extensively to detect credit card fraud.
Retail: In retail, organizations use Big Data analytics to gain valuable insights for competitive
pricing, anticipating future demand, effective marketing campaigns, optimized inventory
assortment, and improved distribution. This enables them to provide optimal prices and services
to customers, and also improve operations and revenue.
Government: In government organizations, Big Data analytics enables improved efficiency and
effectiveness across a variety of domains such as social services, education, defense, national
security, crime prevention, transportation, tax compliance, and revenue management.
This lesson covered the definition of Big Data and its key characteristics. This lesson also covered
the components of a Big Data analytics solution, namely storage, MapReduce, and query. Further,
this lesson covered some of the use cases of Big Data analytics.
This lesson covers social networking, social network analysis, and social network use cases. This
lesson also covers mobile computing and its use cases.
Social networking is the practice of individuals establishing connections with other individuals for
expanding social and/or business contacts. It results in the formation of a structure of many-to-
many human connections called a social network, which represents the relationships and flows
between individuals and groups. A social network enables the sharing of information with the
entire network or subsets of it.
A variety of online services provide a global web-based platform to build social networks among
individuals (and organizations) who share interests, activities, and real-life connections. Online
social networking has grown immensely over the past decade with the proliferation of the Internet
and mobile devices. These social networking services enable the creation, discovery, sharing,
promotion, distribution, and consumption of a variety of digital content for community and social
activities across geographic locations. Most social networking services enable individuals and
organizations to create their personal profiles and connect to each other. They also typically
enable the sharing of opinions, activities, blogs, events, messages, pictures, videos, and other
media. Some provide a specialized set of features, such as enabling connections with co-workers
within an organization, professionals of different fields, or with potential future employers. Some
of the most popular online social networking services are Facebook, Twitter, LinkedIn, Pinterest,
Instagram, and Google+.
The increasing use of online social networking services has led to a massive growth of data in the
digital universe. These immense volumes of data hold tremendous value for organizations.
Through Big Data analytics, organizations can gain valuable insights from the data generated
through social networking. Social network analysis (SNA) is the process of analyzing patterns of
relationships in social networks. SNA involves collecting data from multiple sources (such as social
media posts, surveys, e-mails, blogs, and other electronic artifacts), using analytics on the data
to identify relationships, and mining it for new information. It is useful for examining the social
structure, information flow, and interdependencies (or work patterns) of individuals or
organizations. SNA tools scan social media to determine the quality or effectiveness of a
relationship and to identify influential people, associations, and trends.
SNA enables the identification and discovery of complex dynamics, growth, and evolution patterns
in social networks using machine learning and data mining approaches. SNA uses a
multidisciplinary approach involving the use of a wide range of techniques from social sciences,
mathematics, statistics, physics, network science, and computer science. SNA enables the
discovery and analysis of communities, personalization for solitary activities (for example, search)
and social activities (for example, discovery of potential friends), the analysis of user behavior in
open forums (for example, conventional sites, blogs, and communities) and in commercial
platforms (for example, e-commerce). SNA has a wide range of applications, including
engineering, science, economics, national security, criminology, fraud detection, and
e-commerce.
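As a rough illustration of how an SNA tool might identify influential people, the sketch below computes degree centrality, one common network-science measure, over a small set of connections. The names and connections are hypothetical, and real SNA tools typically combine several measures rather than relying on this one alone.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {person: fraction of the other people they are directly connected to}."""
    neighbors = defaultdict(set)
    for a, b in edges:
        # Connections are treated as undirected (mutual) relationships.
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {person: 0.0 for person in neighbors}
    return {person: len(ns) / (n - 1) for person, ns in neighbors.items()}


# Hypothetical connections collected from a social network.
edges = [("ana", "bo"), ("ana", "cy"), ("ana", "di"), ("bo", "cy"), ("di", "ed")]
scores = degree_centrality(edges)
most_influential = max(scores, key=scores.get)
```

In this made-up network, the person with the most direct connections receives the highest score and would be a candidate "influential" individual.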
Apart from providing a platform for connecting people and organizations, online social networking
also has applications in many areas, such as education, science, problem-solving, sales, and
marketing. Some use cases of social networking are described below:
Brand networking: It is the use of social networking to provide consumers with a platform of
relevant content associated with a particular brand. Organizations use brand networking by
creating social network pages and communities that showcase products, provide information on
promotional offers and events, and enable customer interaction. Brand networking provides a
higher level of customer interaction and participation, gives global visibility to brands, and
enables reaching a broader customer base. By actively engaging in social networking,
organizations also seek to improve their visibility on search engines. Through analytics tools,
organizations can also gain insights into their customer base that help in creating more effective
sales campaigns.
Marketing: The use of social networks is becoming a standard approach for marketing. Social
media marketing has the potential to help increase sales and revenue. Organizations advertise
their products and services on the pages of individuals, with the advertisements linking back
either to the organization’s social media page or to its sales website. Organizations may also mine
social content for identifying potential customers. This helps them in finding new target audiences
more effectively for marketing.
Customer support: Organizations are also increasingly using social networks to engage
customers for enhanced and faster support. By monitoring customer comments on social media,
organizations proactively identify and resolve customer issues.
A mobile device is a handheld compute system that has a display with touch input and/or a
hardware keyboard. Mobile devices typically have features such as voice calling, Bluetooth and/or
NFC for file sharing, Wi-Fi and/or data services (for example, HSPA+ and LTE) for Internet
connectivity, GPS, and audio-video capabilities. Examples of mobile devices are laptops, tablets,
smartphones, and personal digital assistants (PDAs). Mobile computing is the use of mobile
devices to access applications and/or information “on the go” over a wireless network.
The convergence of wireless technologies, advanced electronics, and the Internet has led to the
emergence of pervasive computing (also called ubiquitous computing) and the Internet of Things
(IoT). Pervasive computing is the growing trend of embedding processors in devices such as
sensors and wearable gadgets and enabling them to communicate over the Internet. Pervasive
computing devices are continuously connected and available, and are contributing to the growth
of the mobile computing ecosystem.
The figure on the slide depicts an application server in an enterprise data center/cloud being
accessed by various mobile clients through wireless connections over the Internet or a private
network, such as a wireless LAN (WLAN).
Mobile computing has vast applications in numerous industries and domains. Some use cases of
mobile computing are described below.
Enterprise mobility: The rapid adoption of smart mobile devices is changing the way individuals
and organizations interact and collaborate. Organizations are increasingly providing their
workforce with ubiquitous access to information and business applications over mobile devices.
This enables the employees to stay informed and to carry out business operations irrespective of
their location. This increases collaboration and enhances workforce productivity.
Organizations are also increasingly exploring the option of Bring Your Own Device (BYOD),
whereby employees are allowed to use non-company devices, such as laptops and tablets, as
business machines. BYOD enables employees to have access to applications and information from
their personal devices while on the move. It also creates an opportunity to reduce acquisition and
operational costs.
Mobility-based products and services: Organizations and service providers offer customers a
wide range of mobility-based applications. This facilitates ubiquitous availability of software products
and services to customers, improves customer service, increases market penetration, and leads
to a potential increase in profitability. A wide variety of mobility-based solutions, such as social
networking services, mobile banking, mobile e-commerce, location-based services, cloud storage,
mobile ticketing, and mobile marketing are extensively available globally.
Mobile cloud computing: Mobile cloud computing is the convergence of cloud computing,
Internet, and wireless technologies. With the rapid growth in the use of mobile devices, cloud
service providers are increasingly enabling mobile access to cloud services. For example, today’s
SaaS cloud providers offer a variety of mobile applications for cloud storage, travel and expense
management, and customer relationship management. Mobile cloud computing is also prevalent
within organizations, with corporate IT making enterprise cloud services available to a mobile
workforce.
This lesson covered social networking, social network analysis, and social network use cases. This
lesson also covered mobile computing and its use cases.
This lesson covers the key drivers for transforming to the third platform, characteristics of third
platform infrastructure, and business and IT imperatives for third platform transformation.
For organizations worldwide, there are several drivers for transforming to the third platform.
Some key drivers are described below.
Agility and innovation: In today’s competitive world, organizations seek to have agile
operations and reduce the time-to-market for products and services. Third platform technologies
enable organizations to operate in a more agile manner and facilitate innovation. For example,
instead of following the traditional process of resource acquisition, an application development
team in an organization can provision computing resources from a cloud’s self-service portal, as
and when required. This agility enables rapid development, reduces the time-to-market, and
facilitates innovation and experimentation, which is essential for the development of new products
and services.
Intelligent operations: Organizations globally depend on the smart combination of people and
technology for efficient operations. Inefficient processes, poor quality data, and ineffective
communication and collaboration among asset teams can severely hinder operational efficiency.
New possibilities to increase operational effectiveness and efficiency are constantly emerging
through the use of third platform technologies. For example, analytics enables organizations to
develop efficient and cost-effective equipment maintenance and replacement strategies.
Equipment can be monitored and analytics tools can process the data in real time to spot or
predict device failures. This also reduces downtime due to equipment failures.
New products and services: Creating new products and services is an essential process in
organizations for sustaining business growth and profitability. The third platform enables
organizations to create new or additional products and services on top of traditional products. This
allows them to monetize the new offerings and thereby create new revenue streams. For
instance, manufacturers are using data obtained from sensors embedded in products to offer
innovative after-sales service to customers, such as proactive maintenance to avoid failures in the
products. Analytics also allows organizations to have a more precise segmentation of their
customers and offer tailored products or services.
Mobility: Today’s workforce and customers have ubiquitous access to information and business
applications over mobile devices. This increases workforce collaboration and productivity, and
potentially increases market penetration and profitability.
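The equipment-monitoring example under "Intelligent operations" can be sketched roughly as follows. This is a minimal illustration, not a production analytics method: it flags a sensor reading as a possible early sign of failure when it exceeds the average of the preceding readings by a chosen factor. The function name, window size, threshold, and vibration values are all assumptions made for the example.

```python
from collections import deque


def flag_anomalies(readings, window=3, threshold=1.5):
    """Return (index, value) pairs for readings that exceed `threshold` times
    the mean of the previous `window` readings."""
    recent = deque(maxlen=window)  # sliding window of the latest readings
    flagged = []
    for i, value in enumerate(readings):
        if len(recent) == window:
            baseline = sum(recent) / window
            if value > threshold * baseline:
                flagged.append((i, value))
        recent.append(value)
    return flagged


# Hypothetical vibration readings streamed from a monitored device.
vibration = [10, 11, 10, 12, 11, 25, 12, 11]
alerts = flag_anomalies(vibration)
```

Here only the spike at position 5 is flagged; a real deployment would likely use richer models and historical failure data to predict failures rather than a fixed threshold.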
Module 1, ‘Introduction to Information Storage’, described the key characteristics of a data center.
Although a third platform infrastructure has similar key characteristics, there are additional
requirements from the infrastructure to support capabilities such as mobility, social interaction,
analytics, and delivering IT resources as services in a cost-effective manner. Some key
characteristics of a third platform infrastructure are described below.
Security: With third platform technologies, there are several security challenges such as
unauthorized data access, data loss, hacking, malware, data ownership, and loss of governance
and compliance. Several security mechanisms such as authentication, access control, firewall, and
encryption are implemented to ensure security across multiple third platform technologies. These
and other techniques are covered later in the course. Security tools may also support threat
detection, security incident response, compliance reporting and incident investigation through the
real-time collection and historical analysis of security events from a wide variety of event and
contextual data sources.
Performance: A third platform infrastructure encounters mixed workloads that may have
varying combinations of sequential/random reads and/or writes for different operations such as
transaction processing, analytics, and backup and recovery. Some applications, such as Big Data
analytics solutions, use batch processing and require real-time or near-real-time processing
capabilities. The infrastructure should maintain optimal performance of applications, while
ensuring high throughput and low latency. Apart from installing high-performance components, a
number of techniques such as load balancing, caching, and storage tiering are used to ensure the
performance required to meet service levels. These and other techniques are covered later in the
course.
Ease of access: One of the key drivers for third platform adoption is the ability to access
applications and information from any location over mobile devices. Organizations require
infrastructure, software, and application development platforms to enable mobile access
to information and current and new applications.
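Two of the performance techniques named under "Performance", load balancing and caching, can be illustrated with a small sketch. It assumes a round-robin policy for the balancer and a least-recently-used (LRU) eviction policy for the cache; these are just two common choices, and the class names and server names are hypothetical.

```python
from collections import OrderedDict
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation so requests are spread evenly."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def next_server(self):
        return next(self._servers)


class LRUCache:
    """Keeps the most recently used items in memory, evicting the oldest."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None  # cache miss: caller falls back to slower storage
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


balancer = RoundRobinBalancer(["app1", "app2", "app3"])
picks = [balancer.next_server() for _ in range(4)]

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes the most recently used entry
cache.put("c", 3)  # capacity exceeded, so "b" is evicted
```

The fourth request wraps around to the first server, and the cache keeps the entries that were touched most recently.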
Today, CIOs and line-of-business executives in organizations find themselves in the midst of
unprecedented opportunity that comes with the emergence of the third platform. The third
platform is fueling enterprise innovation and growth, and major new sources of competitive
advantage are being built by creatively leveraging third platform technologies. For third platform
adoption, organizations need to transform the way in which they engage with their workforce and
customers, the speed at which they deliver their products and services, and the efficiency and
resiliency of their operations. Organizations need to provide support for the proliferation of new
devices coming into the workplace, meet the demands of a highly mobile workforce, manage
rapidly expanding data volumes, and ensure the value and security of information in multiple data
sources both within and outside the enterprise. All of these are transforming the traditional IT
environment and changing the way IT roles are performed.
Some key organizational imperatives for third platform transformation are described below.
Organizational transformation: Organizational transformation focuses on how the IT
organization and the roles within it change when transforming to the third platform. New
roles and responsibilities emerge to establish and manage services instead of technology.
The new roles involve performing tasks related to service definition and creation, service
administration and management, service governance and policy formulation, and service
consumer management. For example, IT may move from a cost center to a strategic
business partner within the organization and IT managers could be called on to act as
cloud advisors or financial managers of the IT services business. Some other examples of
new roles in a third platform environment include service manager, account manager,
cloud architect, capacity planner, and service operation manager.
Skills transformation: The changing roles for IT staff entail the need for skills in new
technologies, and also for more business-facing skills focused on communications, marketing, and
service management. Many organizations were built around second platform technologies, and
implementing innovative third platform solutions requires newer skills and expertise. The skills in
the areas of cloud, Big Data, social, and mobile technologies are predicted by IDC to become the
new core IT competencies over the next two decades. Apart from technical skills, having strong
soft skills like communication, collaboration, networking, creativity, relationship building, and
problem solving is considered equally important. Organizations may also take the hybrid approach
of adding specific skills to their in-house teams to focus on their core competencies, while
sourcing non-core activities from partners, suppliers, and third party service providers.
This lesson covered the key drivers for transforming to the third platform, characteristics of third
platform infrastructure, and the imperatives for third platform transformation.
The Concepts in Practice section covers VMware vCloud Air, Pivotal Cloud Foundry, EMC
Syncplicity, Pivotal GemFire, and Pivotal Greenplum Database.
VMware vCloud Air is a secure public cloud owned and operated by VMware that offers
Infrastructure as a Service for enterprise use cases, such as extending existing data center
workloads into the public cloud, migrating applications from on-premise clouds to the public
cloud, new application development, and disaster recovery. It is built on the foundation of
vSphere and is compatible with existing VMware on-premise clouds. It enables organizations to
adopt the hybrid cloud model by seamlessly extending their on-premise clouds into the public
cloud. vCloud Air allows existing applications to run in the public cloud without the need to rewrite
or re-architect them. Organizations can use the same networking, security, and management
tools, skills, and policies that are used in their on-site environments. A consolidated view of
allocated resources is provided to enable administrators to manage resource utilization. vCloud Air
has three primary service offerings (with more expected in the future): Dedicated Cloud (single-
tenant, physically isolated cloud service), Virtual Private Cloud (logically isolated, multi-tenant
cloud service), and Disaster Recovery (cloud-based disaster recovery service). vCloud Air offers
both term-based subscription and pay-as-you-go options.
Pivotal Cloud Foundry (CF) is an enterprise Platform as a Service, built on the foundation of the
Cloud Foundry open-source PaaS project. The Cloud Foundry open-source project is sustained by
the Cloud Foundry Foundation, which has many leading global enterprises as members. Pivotal
CF, powered by Cloud Foundry, enables streamlined application development, deployment, and
operations in both private and public clouds. It supports multiple programming languages and
frameworks including Java, Ruby, Node.js, PHP, and Python. It supports agile application
development and enables developers to continuously deliver updates to and horizontally scale
web and third platform applications with no downtime. Developers can rapidly develop and deploy
applications without being concerned about configuring and managing the underlying cloud
infrastructure. Pivotal CF also supports multiple leading services such as Jenkins, MongoDB,
MySQL, Redis, and Hadoop. The use of open standards enables migration of applications between
compatible public and private clouds. Pivotal CF provides a unified management console for the
entire platform that enables in-depth application and infrastructure monitoring.
EMC Syncplicity is an enterprise-grade online file sharing, collaboration, and data protection
SaaS solution. It enables a business user to securely share files and folders, and collaborate with
other users. It supports both mobile and web access to files from any device, and the files are
also available offline. It synchronizes file changes across all devices in real time, so
documents are always protected and available on any device. If a device fails, access to
files would still be available from other devices. It enables a bring-your-own-device (BYOD)
workforce, while providing access controls, single sign-on (SSO), data encryption, and other
enterprise-grade features. Syncplicity currently has four offerings: Personal Edition (for
individuals), Business Edition (for small and medium businesses), Department Edition
(for enterprise departments), and Enterprise Edition. The Enterprise Edition has support
for public, on-premise, and hybrid deployment options.
Pivotal GemFire is an in-memory distributed database for high-scale custom NoSQL applications.
GemFire stores all operational data in RAM across distributed nodes to provide fast access to
data while minimizing the performance penalty of reading from storage drives. This provides
low-latency data access to applications at massive scale with many concurrent transactions
involving terabytes of operational data. Designed for maintaining consistency of concurrent
operations across its distributed data nodes, GemFire supports ACID (Atomicity, Consistency,
Isolation, Durability) transactions for massively-scaled applications, such as stock trading,
financial payments, and ticket sales having millions of transactions a day. GemFire provides linear
scalability that allows capacity and data storage to be increased predictably by adding
nodes to a cluster. Data distribution and system resource usage are automatically
adjusted as nodes are added or removed, making it easy to scale up or down to quickly meet
expected or unexpected spikes in demand. GemFire offers built-in failover and resilient
self-healing clusters to allow developers to meet the most stringent service level requirements for
data accessibility. It provides native support for Java, C++, and C# programming languages,
while applications written in other programming languages are supported via a REST API.
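The idea of keeping data in memory across distributed nodes and redistributing it as nodes are added can be sketched with a toy example. This is not GemFire's API or its actual partitioning scheme; the class, the hash-modulo placement rule, and the node names are assumptions chosen only to illustrate the concept.

```python
import zlib


class PartitionedStore:
    """Toy in-memory key-value store that spreads entries across named nodes."""

    def __init__(self, node_names):
        # Each node is modelled as a plain dict held in memory.
        self.nodes = {name: {} for name in node_names}
        self._names = sorted(node_names)

    def _owner(self, key):
        # A stable hash picks the node that owns a given key.
        return self._names[zlib.crc32(key.encode("utf-8")) % len(self._names)]

    def put(self, key, value):
        self.nodes[self._owner(key)][key] = value

    def get(self, key):
        return self.nodes[self._owner(key)].get(key)

    def add_node(self, name):
        # Collect all entries, add the node, then redistribute every entry.
        items = [(k, v) for data in self.nodes.values() for k, v in data.items()]
        self.nodes[name] = {}
        self._names = sorted(self.nodes)
        for data in self.nodes.values():
            data.clear()
        for k, v in items:
            self.put(k, v)


store = PartitionedStore(["node1", "node2"])
for i in range(100):
    store.put(f"order-{i}", i)
store.add_node("node3")  # capacity grows and existing data is rebalanced
total_entries = sum(len(data) for data in store.nodes.values())
```

After the third node joins, every entry is still reachable by key. A production system would move only a fraction of the data and keep redundant copies for failover, which this sketch omits.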
Pivotal Greenplum Database is a complete SMAQ solution, designed for business intelligence
and Big Data analytics. It has a linearly scalable, massively parallel processing (MPP) architecture
that stores and analyzes terabytes to petabytes of data. In this architecture, each server node
acts as a self-contained database management system that owns and manages a distinct portion
of the overall data. It provides automatic parallelization with no need for manual partitioning or
tuning. The system automatically distributes data and parallelizes query workloads across all
available hardware. In-database analytics is enabled via the support of high-performance and
flexible data exchange between Hadoop and Greenplum Database. It has embedded support for
SQL, MapReduce, and programmable analytics. It also provides tools for database management,
backup, and disaster recovery.
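The MPP idea described above, where each node owns a distinct portion of the data and a query is parallelized across nodes, can be sketched as a simple scatter-gather aggregation. This is an illustrative model only, not Greenplum's implementation: the distribution rule (row id modulo segment count), the function names, and the sample rows are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(rows, segment_count):
    """Assign each row to exactly one segment based on its id."""
    segments = [[] for _ in range(segment_count)]
    for row in rows:
        segments[row["id"] % segment_count].append(row)
    return segments


def segment_sum(segment):
    """Work done locally on one segment: a partial aggregate."""
    return sum(row["amount"] for row in segment)


def parallel_total(segments):
    """Run the partial aggregates concurrently, then combine the results."""
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        return sum(pool.map(segment_sum, segments))


# Hypothetical sales rows spread over four segments.
rows = [{"id": i, "amount": i * 10} for i in range(1, 9)]
segments = distribute(rows, 4)
total = parallel_total(segments)
```

Each segment computes its partial sum independently, and the final step only combines four small results, which is the property that lets this style of architecture scale by adding nodes.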
This module covered cloud computing and its essential characteristics. This module also covered
cloud service models and cloud deployment models. Additionally, this module covered Big Data
analytics. Further, this module covered social networking and mobile computing. Lastly, this
module covered the key characteristics of third platform infrastructure and the key imperatives
for transforming to the third platform.