
IIoT Notes


JIT1038 INDUSTRIAL IoT 4.0
L T P C
3 0 0 3

COURSE OBJECTIVES:
• To develop knowledge of Industrial Internet of Things (IIoT) fundamentals.
• To understand the architecture of IoT and its protocols.
• To understand the various data analytics techniques.
• To understand CPS for Industry 4.0.
• To provide students with in-depth knowledge of designing Industrial IoT systems for various applications.
UNIT-I : Industrial IOT Introduction 9
Introduction to IOT, IOT Vs. IIOT, History of IIOT, Components of IIOT - Sensors and
Actuators for Industrial Processes, Role of IIOT in Manufacturing Processes. Challenges & Benefits in
implementing IIOT.
UNIT-II : IIoT Architecture 9
Industrial IoT: Business Model and Reference Architecture: IIoT-Business Models, Industrial IoT-
Layers: IIoT Sensing, IIoT Processing, IIoT Communication, IIoT Networking
UNIT-III : IIOT ANALYTICS 9
Big Data Analytics and Software Defined Networks, Machine Learning and Data Science, Julia
Programming, Data Management with Hadoop.
UNIT-IV : Industrial IoT: CYBER PHYSICAL SYSTEM 9
Introduction to Cyber Physical Systems (CPS), Architecture of CPS- Components, Data
science and technology for CPS, Emerging applications in CPS in different fields. Case study:
Application of CPS in health care domain.
UNIT-V : Industrial IoT- Application Domains 9
Industrial IoT- Application Domains: Healthcare, Power Plants, Inventory Management &
Quality Control, Plant Safety and Security (Including AR and VR safety applications), Facility
Management.
TOTAL: 45 HOURS

COURSE OUTCOMES:
At end of the course students will be able to:
CO1: Understand the basics of industrial IoT (IIoT).
CO2: Develop various applications using IIoT architectures.
CO3: Recognize the uses of cloud computing and data analytics.
CO4: Analyze privacy and security measures for industry-standard solutions.
CO5: Design and implement IoT applications that manage various technologies.

TEXT BOOKS:
1. Giacomo Veneri and Antonio Capasso, Hands-On Industrial Internet of Things: Create a Powerful Industrial IoT Infrastructure Using Industry 4.0, 1st Ed., Packt Publishing Ltd, 2018.
2. Alasdair Gilchrist, Industry 4.0: The Industrial Internet of Things, 1st Ed., Apress, 2017.

REFERENCES:
1. Alasdair Gilchrist, Industry 4.0: The Industrial Internet of Things, 1st Ed., Apress, 2017.
2. Aboul Ella Hassanien, Nilanjan Dey and Surekha Borra, Medical Big Data and Internet of Medical Things: Advances, Challenges and Applications, 1st Ed., CRC Press, 2019.

WEBSITE REFERENCES :
1. https://onlinecourses.nptel.ac.in/noc22_cs52/preview
2. https://www.coursera.org/specializations/developing-industrial-iot#courses
3. https://www.coursera.org/learn/industrial-internet-of-things
4. https://www.coursera.org/learn/internet-of-things-sensing-actuation

CO-PO AND CO-PSO MAPPING:


CO/PO,PSO PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2
CO1 2
CO2 3
CO3 3 3
CO4 3
CO5 3
AVG 3 3 3
UNIT-I Industrial IOT Introduction
Introduction to IOT, IOT Vs. IIOT, History of IIOT, Components of IIOT - Sensors and
Actuators for Industrial Processes, Role of IIOT in Manufacturing Processes. Challenges &
Benefits in implementing IIOT.
1. Introduction to IOT

The Internet of Things (IoT) is the inter-networking of physical devices (also referred
to as “connected devices” and “smart devices”), vehicles, buildings, and other items
embedded with electronics, software, sensors, actuators, and network connectivity,
which enable these objects to collect and exchange data.

1.1 Characteristics:

Things-related services: The IoT is capable of providing thing-related services within
the constraints of things, such as privacy protection and semantic consistency
between physical things and their associated virtual things.

Connectivity: Things in the IoT must be connected to the infrastructure; without a
connection, nothing else works.

Intelligence: Extracting knowledge from the generated data is important. Sensors
generate data, and this data must be interpreted properly.

Scalability: The number of things connected to the IoT infrastructure increases day
by day. Hence, an IoT setup must be able to handle massive expansion.

Unique Identity: Each IoT device has an IP address. This identity is helpful for
tracking the equipment and, at times, for querying its status.

Dynamic and Self-Adapting: An IoT device must dynamically adapt itself to the
changing context. A surveillance camera, for example, may have to work in
different conditions and under different lighting (morning, afternoon, night).

Heterogeneity: The devices in the IoT are heterogeneous because they are based on
different hardware platforms and networks. They can interact with other devices
across different networks.

Safety: Connecting all of these things to the Internet poses a major threat, because
our personal data is also there and can be tampered with if proper safety
measures are not taken.

1.2 Application areas of IoT:

Smart Home: The smart home is one of the most popular applications of IoT. The
cost of owning a house is the biggest expense in a homeowner’s life. Smart homes
promise to save time, money, and energy.

Smart cities: The smart city is another powerful application of IoT. It includes
smart surveillance, environment monitoring, automated transportation, urban
security, smart traffic management, water distribution, smart healthcare, etc.

Wearables: Wearables are devices with sensors and software installed that
collect data about the user, which can later be used to gain insights about
the user. They must be energy efficient and small.

Connected cars: A connected car is able to optimize its own operation and
maintenance, as well as the passengers’ comfort, using sensors and internet
connectivity.

Smart retail: Retailers can enhance the in-store experience of their customers
using IoT. The shopkeeper can also learn which items are frequently bought
together using IoT devices.

Smart healthcare: People can wear IoT devices that collect data about their
health. This helps users to monitor themselves and follow tailor-made
techniques to combat illness. The doctor also does not have to visit patients in
person in order to treat them.

1.3 IoT Categories

IoT can be classified into two categories:

1. Consumer IoT (CIoT): The Consumer IoT refers to the billions of physical
personal devices, such as smartphones, wearables, fashion items and the growing
number of smart home appliances, that are now connected to the internet,
collecting and sharing data.
A Consumer IoT network typically entails a few consumer devices, each of which
has a limited lifetime of several years.

The connectivity commonly used in this kind of solution is Bluetooth, WiFi, and
ZigBee. These technologies offer short-range communication, suitable for
applications deployed in limited spaces such as houses or small offices.

2. Industrial Internet of Things (IIoT): It refers to interconnected sensors,
instruments, and other devices networked together with computers' industrial
applications, including manufacturing and energy management. This connectivity
allows for data collection, exchange, and analysis, potentially facilitating
improvements in productivity and efficiency as well as other economic benefits.
2. IOT Vs. IIOT

The major differences between IIoT and IoT are as follows –

IIoT | IoT

Uses the Internet of Things in industrial applications and sectors. | Refers to physical devices such as mobiles, PCs, home appliances, and many other electronic devices that are embedded with sensors, software, and other technologies to transmit data and to communicate with each other through the Internet.

Examples: Amazon warehouses, smart robotics, Airbus, etc. | Examples: air conditioners, sensors, smart watches, mobile phones, etc.

Deals with large-scale networks | Deals with small-scale networks

Offers remote on-site programming | Offers easy off-site programming

Requires robust security to protect the data | Requires identity and privacy

Long life cycle | Short product life cycle

Highly reliable | Less reliable

3. History of IIOT:

• Industry 1.0 (1784): The invention of the steam engine kick-started Industry
1.0. However, manufacturing was purely labor-oriented and tiresome.

• Industry 2.0 (1870): The first assembly-line production was introduced. This
invention was a big relief for workers, as their labor was minimized as far as
possible. Henry Ford, regarded as the father of mass production and the assembly
line, introduced the process in a Ford car manufacturing plant to improve
productivity using a conveyor-belt mechanism.

• Industry 3.0 (1969): Involved the advancement of electronic technology and
industrial robotics. Miniaturized circuit boards, programmable logic controllers,
and industrial robots helped to simplify, automate, and increase production.
However, operations still remained isolated from the rest of the enterprise.
• Industry 4.0 (2010): The vision of a connected enterprise, achieved by
interconnecting industrial assets through the internet, was fulfilled with the
introduction of Industry 4.0. Smart devices communicate with each other
and create valuable insights. IIoT brought with it the advantages of asset
optimization, production integration, smart monitoring, remote diagnosis,
intelligent decision making, and, most importantly, predictive maintenance.

4. Components of IIOT

SENSOR

A sensor is a device that converts physical events or characteristics into
electrical signals. It is a hardware device that takes input from the
environment and passes it to the system in converted form.

For example, a thermometer takes temperature as the physical characteristic
and converts it into an electrical signal for the system.

Characteristics of Sensors

1. Range: The minimum and maximum values of the physical variable that
the sensor can sense or measure. For example, a Resistance Temperature
Detector (RTD) for the measurement of temperature has a range of -200 to
800 °C.
2. Span: The difference between the maximum and minimum values of the
input. In the above example, the span of the RTD is 800 - (-200) = 1000 °C.
3. Accuracy: The error in measurement is specified in terms of accuracy. It is
defined as the difference between the measured value and the true value, and
is expressed as a percentage of full scale or a percentage of reading.
4. Precision: The closeness among a set of measured values. It is different
from accuracy.
5. Linearity: The maximum deviation of the measured values of a sensor from
the ideal curve.
6. Hysteresis: The difference in output when the input is varied in
two ways: increasing and decreasing.
7. Resolution: The minimum change in input that can be sensed by the
sensor.
8. Reproducibility: The ability of a sensor to produce the same output when
the same input is applied, even when measurement conditions such as the
operator, instrument, or time change.
9. Repeatability: The ability of a sensor to produce the same output every
time the same input is applied while all physical and measurement conditions
are kept the same, including the operator, instrument, ambient conditions, etc.
10. Response Time: Generally expressed as the time at which the
output reaches a certain percentage (for instance, 95%) of its final value
in response to a step change of the input.
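
The range, span, and accuracy definitions above can be checked with a few lines of Python. This is only a sketch: the RTD range comes from the example in these notes, while the measured and true values (102 °C and 100 °C) and the function names are made up for illustration.

```python
def span(range_min, range_max):
    """Span is the difference between the maximum and minimum input values."""
    return range_max - range_min


def error_percent_full_scale(measured, true_value, range_min, range_max):
    """Measurement error expressed as a percentage of the full-scale span."""
    return 100.0 * (measured - true_value) / span(range_min, range_max)


def error_percent_reading(measured, true_value):
    """Measurement error expressed as a percentage of the true reading."""
    return 100.0 * (measured - true_value) / true_value


# RTD range from the notes: -200 to 800 degrees Celsius.
RTD_MIN_C, RTD_MAX_C = -200.0, 800.0

print(span(RTD_MIN_C, RTD_MAX_C))  # 1000.0
# Hypothetical reading: the sensor shows 102 when the true value is 100.
print(error_percent_full_scale(102.0, 100.0, RTD_MIN_C, RTD_MAX_C))
print(error_percent_reading(102.0, 100.0))
```

The same 2 °C error looks small as a percentage of full scale and much larger as a percentage of reading, which is why a datasheet figure should always be read together with its basis.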

Classification of sensors:

Based on power requirement, sensors are classified into two types:
active sensors and passive sensors.

Active Sensors: Do not need any external energy source; they directly
generate an electrical signal in response to the external stimulus.

Example: Thermocouple, photodiode, piezoelectric sensor.

Passive Sensors: Require external power, called an excitation signal.
The sensor modifies the excitation signal to provide the output.

Example: Strain gauge.

Based on output, sensors are classified into two types: analog
sensors and digital sensors.

Analog Sensors

• Analog sensors produce a continuous output signal or voltage which is
generally proportional to the quantity being measured.
• Physical quantities such as temperature, speed, pressure,
displacement, strain, etc. are all analog quantities, as they tend to be
continuous in nature.
• For example, the temperature of a liquid can be measured using a
thermometer or thermocouple (e.g. in geysers), which continuously
responds to temperature changes as the liquid is heated up or cooled
down.
Digital Sensors

• Digital sensors produce discrete output voltages that are a digital
representation of the quantity being measured.
• Digital sensors produce a binary output signal in the form of a logic
"1" or a logic "0" ("ON" or "OFF").
• A digital signal only produces discrete (non-continuous) values,
which may be output as a single "bit" (serial transmission) or by
combining the bits to produce a single "byte" output (parallel
transmission).
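
To make the analog/digital distinction concrete, the sketch below turns a continuous voltage into a discrete code, roughly as an analog-to-digital converter would. The 5 V reference, the 10-bit width, and the function names are assumptions chosen for illustration, not values from these notes.

```python
def quantize(voltage, v_ref=5.0, bits=10):
    """Map an analog voltage in [0, v_ref] to the nearest integer code."""
    levels = 2 ** bits
    clamped = min(max(voltage, 0.0), v_ref)  # keep the input inside the range
    return int(clamped / v_ref * (levels - 1) + 0.5)  # round to nearest code


def to_bits(code, bits=10):
    """Show the code as a string of logic 1s and 0s."""
    return format(code, f"0{bits}b")


def threshold_output(voltage, threshold):
    """Simplest digital sensor: a single ON/OFF bit."""
    return 1 if voltage >= threshold else 0


code = quantize(2.5)
print(code)           # 512
print(to_bits(code))  # 1000000000
print(threshold_output(2.5, threshold=3.0))  # 0
```

The analog value 2.5 V can take any nearby value, whereas the digital output can only be one of 1024 codes here, or a single bit in the threshold case.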

Based on the type of data measured, sensors are classified into two types:
scalar sensors and vector sensors.
Scalar Sensors

• Scalar sensors produce an output signal or voltage which is generally
proportional to the magnitude of the quantity being measured.
• Physical quantities such as temperature, color, pressure, strain, etc.
are all scalar quantities, as their magnitude alone is sufficient to convey
the information.
• For example, the temperature of a room can be measured using a
thermometer or thermocouple, which responds to temperature
changes irrespective of the orientation of the sensor or its direction.
Vector Sensors

• Vector sensors produce an output signal or voltage which is generally
proportional to the magnitude, direction, and orientation of the quantity
being measured.
• Physical quantities such as sound, image, velocity, acceleration,
orientation, etc. are all vector quantities, as their magnitude alone is not
sufficient to convey the complete information.
• For example, the acceleration of a body can be measured using an
accelerometer, which gives the components of the acceleration of the
body with respect to the x, y, z coordinate axes.
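
As a small illustration of why a vector sensor needs more than one number, the sketch below takes the three components an accelerometer might report and recovers both the magnitude and the direction. The sample values and function names are hypothetical.

```python
import math


def magnitude(ax, ay, az):
    """Length of the acceleration vector from its x, y, z components."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def direction(ax, ay, az):
    """Unit vector pointing along the acceleration."""
    m = magnitude(ax, ay, az)
    if m == 0:
        raise ValueError("a zero vector has no direction")
    return (ax / m, ay / m, az / m)


# Hypothetical accelerometer sample in m/s^2.
print(magnitude(3.0, 4.0, 0.0))  # 5.0
print(direction(3.0, 4.0, 0.0))
```

Two samples such as (3, 4, 0) and (0, 0, 5) have the same magnitude but describe different motions, which a scalar reading alone could not distinguish.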
ACTUATOR

An actuator is a device that converts electrical signals into physical
events or characteristics. It takes input from the system and gives
output to the environment. For example, motors and heaters are some of
the commonly used actuators.

Types of Actuators

1. Hydraulic Actuators: Hydraulic actuators operate by the use of a
fluid-filled cylinder with a piston suspended at the centre. Commonly, hydraulic
actuators produce linear movements, and a spring is attached to one end as
part of the return motion. These actuators are widely seen in exercise
equipment such as steppers, and in car transport carriers.
2. Pneumatic Actuators: Pneumatic actuators are one of the most reliable
options for machine motion. They use pressurized gases to create
mechanical movement. Many companies prefer pneumatic-powered
actuators because they can make very precise motions, especially when
starting and stopping a machine. Examples of equipment that uses
pneumatic actuators include bus brakes, exercise machines, vane motors,
and pressure sensors.
3. Electric Actuators: Electric actuators require electricity to work.
Well-known examples include electric cars, manufacturing machinery, and
robotics equipment. Similar to pneumatic actuators, they also create
precise motion, as the flow of electrical power is constant.
4. Thermal and Magnetic Actuators: Thermal and magnetic actuators
usually consist of shape memory alloys that can be heated to produce
movement. The motion of thermal or magnetic actuators often comes from
the Joule effect, but it can also occur when a coil is placed in a static
magnetic field. In that case, the magnetic field exerts the Laplace-Lorentz
force on the coil, which produces constant motion. Most thermal and magnetic
actuators can produce a wide and powerful range of motion while remaining
lightweight.
5. Mechanical Actuators: Some actuators are mostly mechanical, such as
pulleys or rack and pinion systems. A mechanical force, such as pulling or
pushing, is applied, and the actuator leverages that single movement to
produce the desired result. For instance, turning a single gear on a rack
and pinion can move an object from point A to point B. The tugging movement
applied to a pulley can bring the other side upwards or towards the desired
location.
6. Soft Actuators: Soft actuators (e.g. polymer-based) are designed to handle
fragile objects, for example in fruit harvesting in agriculture or in
manipulating internal organs in biomedicine.
They typically address challenging tasks in robotics. Soft actuators produce
flexible motion through the integration of microscopic changes at the molecular
level into a macroscopic deformation of the actuator materials.

IOT COMPONENTS

An IoT system has four fundamental components, which together explain how IoT works.

i. Sensors/Devices

First, sensors or devices collect very minute data from the
surrounding environment. The collected data can have various degrees
of complexity, ranging from a simple temperature reading to a
complex full video feed.
A device can have multiple sensors bundled together to do more
than just sense things. For example, a phone is a device that has
multiple sensors, such as GPS, an accelerometer, and a camera, but a phone
does not simply sense things.

ii. Connectivity

Next, the collected data is sent to a cloud infrastructure, but it needs a medium
for transport.

The sensors can be connected to the cloud through various media of
communication and transport, such as cellular networks, satellite networks,
Wi-Fi, Bluetooth, wide-area networks (WAN), low-power wide-area networks,
and many more.

iii. Data Processing

Once the data is collected and reaches the cloud, the software performs
processing on the acquired data.

This can range from something very simple, such as checking that the
temperature reading on devices such as ACs or heaters is within an
acceptable range, to something very complex, such as identifying
objects (such as intruders in your house) using computer vision on video.

iv. User Interface

Next, the information is made available to the end-user in some way. This
can be achieved by triggering alarms on their phones or notifying them through
texts or emails.

A user might also have an interface through which they can
actively check in on their IoT system. For example, a user who has a camera
installed in their house might want to check the video recordings and all
the feeds through a web server.
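
The four components can be sketched end to end in a few lines. Everything here is a stand-in chosen for illustration: the device name, the fixed reading, the 15 to 70 °C acceptable range, and the function names are assumptions, and the "network" is just a JSON string passed between functions rather than a real transport.

```python
import json


def read_sensor():
    """Sensors/Devices: stand-in for a driver; returns a made-up reading."""
    return {"device_id": "heater-01", "temperature_c": 78.5}


def send(reading):
    """Connectivity: serialise the reading as it might travel to the cloud."""
    return json.dumps(reading)


def process(message, low=15.0, high=70.0):
    """Data processing: check that the temperature is within range."""
    reading = json.loads(message)
    reading["in_range"] = low <= reading["temperature_c"] <= high
    return reading


def notify(result):
    """User interface: produce an alert text only when something is wrong."""
    if result["in_range"]:
        return None
    return f"ALERT: {result['device_id']} reported {result['temperature_c']} C"


alert = notify(process(send(read_sensor())))
print(alert)
```

In a real deployment each function would live on a different machine (device, network, cloud, phone), but the order of the hand-offs is the same.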
Role of IIOT in Manufacturing Processes

1. Asset Management
IoT technology enables asset management, which simply means monitoring pieces
of equipment for better production, quality control, and worker safety.

As per a report by IBM, industries can achieve a 20% higher product count on
average from their existing line by optimizing their manufacturing process, which
is a huge number. A faster and more efficient manufacturing plant also reduces
product cycle time. One of the best examples to quote is the motorbike manufacturing
company Harley-Davidson: by leveraging the power of IIoT, the company is able to
produce a motorbike in just 6 hours, which earlier used to take around 21 days.
IIoT also helps to ensure a safer workplace, especially in hazardous workplaces or
chemical/oil manufacturing firms. For example, with the help of sensors and IoT,
gas leakages in the pipe network can be detected easily, eliminating that
manual effort. In addition, IIoT technology along with wearable devices can help in
monitoring the health status of workers.

2. Real-time inventory tracking

Manufacturing doesn’t stop at manufacturing. It extends well beyond that, until
the product reaches the customer and satisfies them. In this customer
satisfaction journey, warehouse and inventory management plays an
important role.

If we integrate IoT solutions with the transport management system, we get
better status and visibility of the moving vehicles. This helps in the
on-time maintenance of vehicles and swift action in case of road accidents, all of
which ultimately results in fast, efficient, and safe transportation.

3. Predictive Maintenance
Maintenance is a tough task, but it becomes much easier with the IoT-enabled
approach known as Predictive Maintenance.

Predictive Maintenance (PdM) is the technology where rotating machines are
monitored with the help of sensors and the health status of the asset is presented
over a cloud platform. All this becomes possible due to the boom in IoT.

Industries are struggling with the burden of maintenance. In numerical terms,
it is costing them around 50 billion dollars per year. With PdM as a solution at
hand, they can reduce this avoidable cost to a very large extent. An example of
this is Rio Tinto, a mining company that is reported to save 2 million USD daily
using IoT-enabled PdM.
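
One simple way to picture predictive maintenance is to compare new sensor readings against a baseline learned while the machine was healthy. The sketch below does this with a mean-plus-k-standard-deviations limit; this is only one basic technique chosen for illustration, and the vibration values, the baseline size, and the function name are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(readings, baseline_size=5, k=3.0):
    """Return indices of readings that exceed the healthy-baseline limit.

    The first `baseline_size` readings are assumed to come from a healthy
    machine; later readings above mean + k * stdev are flagged.
    """
    baseline = readings[:baseline_size]
    limit = mean(baseline) + k * stdev(baseline)
    return [
        i
        for i, value in enumerate(readings)
        if i >= baseline_size and value > limit
    ]


# Hypothetical vibration velocity samples (mm/s) from a rotating machine.
vibration_mm_s = [2.0, 2.1, 1.9, 2.0, 2.0, 2.1, 2.0, 3.5, 4.2]
print(find_anomalies(vibration_mm_s))  # [7, 8]
```

The last two samples are flagged well before an outright failure, which is the point at which a maintenance task could be scheduled.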

4. Smart Pumping
With the help of an IoT-based system comprising sensors and switches, you can
not only monitor but also regulate the flow, pressure, and temperature of the
fluid and the pumping systems of your production facility. Such an efficient
pumping system helps in saving water, energy costs, and manual labor expenses.

5. Energy Management & Monitoring other environmental parameters

Another important contribution of IoT technology is in the area of environmental
sustainability. Whether it’s fuel or electricity to use, water to drink, or fresh air to
breathe, all three are getting scarce day by day. IoT enables us to manage these
environmental parameters with ease and efficiency, allowing us to conserve them for
present and future usage.
With an IoT-enabled energy management system, you can save around 29% on the
electricity consumption costs of a building. This also helps in reducing the
greenhouse gas (GHG) emissions that are responsible for global warming and its
induced climate change.

Similarly, groundwater pollution is a critical issue all over the world. Using an
IoT system comprising piezometers and sensors, groundwater can be monitored and
managed efficiently, allowing us to take necessary actions whenever required. In
fact, in India, the regulatory body known as CGWA has made it mandatory for
manufacturing industries to install a groundwater monitoring telemetry system and
send the report to the regulatory body.

Lastly, with the help of an IoT-based stack monitoring system, the CO2 emissions
released from various industries can be regulated, which is also a mandatory
guideline issued by CPCB that has to be followed by manufacturing plants.

Various IoT applications in manufacturing plants:

1. Digital/connected factory: IoT-enabled machinery can transmit
operational information to partners such as original equipment
manufacturers and to field engineers.

2. Facility management: The use of IoT sensors in manufacturing equipment
enables condition-based maintenance alerts.

3. Production flow monitoring: IoT in manufacturing can enable the
monitoring of production lines, starting from the refining process down to
the packaging of final products.

4. Inventory management: IoT applications permit the monitoring of events
across a supply chain.

5. Plant Safety and Security: IoT combined with big data analysis can improve
overall worker safety and security in the plant.

6. Quality control: IoT sensors collect aggregate product data and other
third-party syndicated data from various stages of a product cycle.

7. Packaging Optimization: By using IoT sensors in products and/or
packaging, manufacturers can gain insights into the usage patterns and
handling of the product by multiple customers.

8. Logistics and Supply Chain Optimization: The Industrial IoT (IIoT) can
provide access to real-time supply chain information by tracking
materials, equipment, and products as they move through the supply
chain.
Challenges & Benefits in implementing IIOT.

Challenges for IoT

1. Security: Security is the most significant challenge for the IoT. Increasing
the number of connected devices increases the opportunity to exploit security
vulnerabilities, as do poorly designed devices, which can expose user data to
theft by leaving data streams inadequately protected. In some cases,
people’s health and safety can be put at risk.
2. Privacy: The IoT creates unique challenges to privacy, many of which go
beyond the data privacy issues that currently exist. Much of this stems from
integrating devices into our environments without us consciously using
them. This is becoming more prevalent in consumer devices, such as
tracking devices for phones and cars, as well as smart televisions.
3. Scalability: Billions of internet-enabled devices are connected in a huge
network, and large volumes of data need to be processed. The system that
stores and analyses the data from these IoT devices needs to be scalable.
4. Interoperability: Technological standards in most areas are still
fragmented. These technologies need to converge, which would help
in establishing a common framework and standard for IoT devices.
As the standardization process is still lacking, interoperability of IoT with
legacy devices should be considered critical. This lack of interoperability is
preventing us from moving towards the vision of truly connected, everyday,
interoperable smart objects.
5. Bandwidth: Connectivity is a bigger challenge to the IoT than you might
expect. As the size of the IoT market grows exponentially, some experts are
concerned that bandwidth-intensive IoT applications such as video
streaming will soon struggle for space on the IoT’s current server-client
model.
6. Standards: The lack of standards and documented best practices has a
greater impact than just limiting the potential of IoT devices. Without
standards to guide manufacturers, developers sometimes design products
that operate in disruptive ways on the Internet without much regard for their
impact. If poorly designed and configured, such devices can have negative
consequences for the networking resources they connect to and for the broader
Internet.

7. Regulation: The lack of strong IoT regulations is a big part of why the IoT
remains a severe security risk, and the problem is likely to get worse as the
potential attack surface expands to include ever more crucial devices. When
medical devices, cars, and children’s toys are all connected to the Internet,
it’s not hard to imagine many potential disaster scenarios unfolding in the
absence of sufficient regulation.

Benefits in implementing IIOT

IoT offers a number of benefits to organizations, enabling them to:

1. Monitor their overall business processes;


2. Improve the customer experience;
3. Save time and money;
4. Enhance employee productivity;
5. Integrate and adapt business models;
6. Make better business decisions; and
7. Generate more revenue.
UNIT-II
IIoT Architecture
Industrial IoT: Business Model and Reference Architecture: IIoT-Business
Models, Industrial IoT- Layers: IIoT Sensing, IIoT Processing, IIoT Communication,
IIoT Networking

Business Model and Reference Architecture

1. A business model captures aspects such as the rationale behind how the
organization is created and how it creates, delivers, and captures value for
its customers.
2. A business model is the organizational and financial architecture of a
business.
There are different types of IIoT business models:
1. Cloud-based business model
2. Service-oriented business model
3. Process-oriented business model
1. Cloud-based business model:
1. It is based on cloud services, which means offering cloud-based
processing capabilities, data storage capabilities, and virtualization of
the operating system.
2. The cloud-based business model has different aspects, such as the
infrastructure-as-a-service (IaaS) model.
3. Platform-as-a-service (PaaS) model: integration of different applications,
which have been developed on different platforms, under a common platform.
4. Software-as-a-service (SaaS) model: offering online capabilities and
customized applications to different customers.
5. These are the three primary types of cloud-based service models.
2. Service-oriented business model:
1. It is all about services.
2. Service offerings include primary utilization, collection of data,
analysis of the data, and aggregation of the data.

3. Process-oriented business model:

1. Reduced downtime and increased machinery availability are important
considerations in the process-oriented business model.
2. The process is optimized, which means that the availability of the machinery
to different customers is increased.

Challenges:
1. Security and data privacy
2. Lack of interoperability
3. Increased complexity
4. Increased cost
IIoT reference architecture:
1. The IIoT reference architecture is governed by the Industrial Internet
Reference Architecture (IIRA).
2. The IIRA is the architectural standard used for most IIoT applications in
industry, so it is a standards-based architecture.
3. Safety is the major concern in the IIRA infrastructure, followed by
security.
IIRA-Architecture Patterns:
Different IIoT architecture implementation patterns are as follows:
1. Three-tier architecture pattern:
The three layers are
1. the edge layer,
2. the platform layer, and
3. the enterprise layer.
1. Edge layer: The edge layer gathers data from the edge nodes. Its
architectural characteristics include
• breadth of distribution
• governance
• location
2. Platform layer: Receives, processes, and forwards control commands from
the enterprise layer to the edge layer.
3. Enterprise layer: Receives data flows from the edge layer and the
platform layer. The enterprise layer
• implements domain-specific applications,
• implements decision support systems, and
• provides interfaces to end-users.
Gateway-mediated edge connectivity and management architecture pattern:

The gateway-mediated edge architecture consists of

• a local area network for the IIoT edge system, and
• the gateway connecting it to the wide area network.

The local area network may use
• hub-and-spoke topology
• mesh topology

The gateway devices act as

• a management point for the edge devices locally
• a point for data transfer, processing, and analytics
• a provider of local connectivity among the devices
• a host for application logic which runs within the local scope.
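
The gateway role described above can be sketched as a small class that collects readings from local edge devices, applies local logic (here, a summary), and forwards only the result over the wide area network. The class name, method names, and device identifiers are hypothetical, and the "forwarding" simply returns the summary instead of using a real network.

```python
class Gateway:
    """Toy gateway: buffers edge readings locally and forwards summaries."""

    def __init__(self):
        self.readings = {}  # device_id -> list of values received locally

    def receive(self, device_id, value):
        """Local connectivity: accept one reading from an edge device."""
        self.readings.setdefault(device_id, []).append(value)

    def summarise(self):
        """Local processing: reduce raw readings to count, mean, and max."""
        return {
            device: {
                "count": len(values),
                "mean": sum(values) / len(values),
                "max": max(values),
            }
            for device, values in self.readings.items()
        }

    def forward(self):
        """Send the summary upstream (returned here) and clear the buffer."""
        summary = self.summarise()
        self.readings.clear()
        return summary


gateway = Gateway()
gateway.receive("pump-1", 1.0)
gateway.receive("pump-1", 3.0)
gateway.receive("valve-7", 10.0)
print(gateway.forward())
```

Forwarding summaries rather than every raw reading is one reason a gateway reduces the load on the wide area link.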

IIRA: Layered Databus Pattern

Smart machines are present at the lowest level for
• local control, and
• automation.

A system of systems at the higher levels allows
• complex systems,
• monitoring, and
• analytic applications.

The layered databus pattern is applicable in the fields of
• control,
• local monitoring, and
• analytics.

The databus carries the communication between applications and devices and
allows interoperable communication between endpoints. For communication
between machines, another databus is used.

The layered databus pattern allows
• fast device-to-device integration with minimum response time,
• automatic data and application delivery,
• scalable integration of devices,
• high availability of the system, and
• hierarchical subsystem isolation.
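
The layering idea can be illustrated with a minimal sketch. It assumes a toy
in-process publish/subscribe class; the names Databus, machine_bus, site_bus,
the topic strings, and the 80.0 limit are hypothetical and do not come from
the IIRA or from any real databus product. One databus serves the machine
level, and a small bridge forwards only selected samples to the databus of the
level above, which is how the subsystems stay isolated.

```python
from collections import defaultdict


class Databus:
    """Toy topic-based databus: publishers and subscribers share topics."""

    def __init__(self, name):
        self.name = name
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, sample):
        # Deliver the sample to every subscriber of this topic.
        for callback in self._subscribers[topic]:
            callback(sample)


machine_bus = Databus("machine")   # lowest level: local control
site_bus = Databus("site")         # higher level: monitoring and analytics

local_log, site_log = [], []
machine_bus.subscribe("motor/temp", local_log.append)
site_bus.subscribe("alerts/motor", site_log.append)


def bridge(sample, limit=80.0):
    # Only readings above the (hypothetical) limit leave the machine level.
    if sample > limit:
        site_bus.publish("alerts/motor", sample)


machine_bus.subscribe("motor/temp", bridge)

for reading in (72.5, 85.0, 79.9):
    machine_bus.publish("motor/temp", reading)
```

Every reading stays visible on the machine-level databus, while the site-level
databus only sees the one reading that crossed the limit.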

IIoT sensing

IIoT sensors are industrial sensors with integrated sensing and computing
functions that are connected to larger systems via wireless communication
technology. They are a key part of the industrial internet of things (IIoT), the
industrial extension of the internet of things (IoT). In this emerging
paradigm, the connected nature of the internet extends to the physical world,
where individual objects receive their own IP address, technology, and wireless
connectivity. The increasing availability of compact, high-quality, affordable
sensors is a major driver for IIoT. This synergy between the digital and physical
worlds is particularly important for industrial applications, where sensors have
traditionally operated in isolation and required local monitoring.
Temperature Sensor Interfacing Circuit

• Monitors the temperature of devices used in industrial applications
• The LM35 temperature sensor generates an analog voltage
• The output voltage of the LM35 is linearly proportional to the Celsius
temperature
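
Because the LM35 output is linear (its scale factor is 10 mV per degree
Celsius), converting a raw analog-to-digital converter (ADC) reading into a
temperature is a single scaling step. The sketch below assumes a 10-bit ADC
with a 5 V reference; both values and the function name lm35_celsius are
illustrative assumptions, not part of the interfacing circuit in these notes.

```python
def lm35_celsius(adc_count, vref=5.0, adc_bits=10):
    """Convert a raw ADC reading of the LM35 output to degrees Celsius."""
    full_scale = 2 ** adc_bits - 1
    if not 0 <= adc_count <= full_scale:
        raise ValueError("adc_count is outside the ADC range")
    volts = adc_count * vref / full_scale
    # LM35 scale factor: 10 mV per degree Celsius, i.e. 100 degrees per volt.
    return volts * 100.0


room = lm35_celsius(51)   # about 0.249 V at the ADC input
```

With these assumed settings a count of 51 corresponds to roughly 25 degrees
Celsius.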

Accelerometer Sensor Interfacing Circuit

voltage

Gas Sensor Interfacing Circuit

concentration of different gases

MQ-2 provides the concentration of LPG, propane, and hydrogen as an analog
voltage
Sensors in IIoT Applications

Temperature sensor

Monitors the temperature of devices used in industrial applications such as
petrochemical, defense, aerospace, consumer electronics, and automotive.

Used in special applications where a specific temperature must be maintained,
such as fabricating medical drugs and heating liquids.

Magnetostrictive sensor

-varying stresses or strains in ferromagnetic

materials

detection of vehicle safety

Torque sensor

l and hydraulic systems

Vacuum sensor

detection, cathode ray tubes, gas turbine, and helium leak


Acceleration sensor

Speed sensor

given time

-powered generator, anti-lock brake,

printer, memory, engine-powered compressor

PIR sensor

Detects infrared radiation coming from the human body in its surrounding area

staircase, and shopping mall

Image sensor

structured lighting, and motion capture

space, security, automotive, biometrics, medical, and machine vision

Ultrasonic sensor

nd dynamic body

detection

Applications: liquid level monitoring of tanks, trash level monitoring,
manufacturing processes, automobiles, and people detection for counting

 Optical sensor
 Radiation sensor
 Level sensor
 Flow sensor
 Touch sensor
 Gas sensor
Industrial Communication

Typical industrial communication requirements

 Real-time
 Very low duty-cycle
 Very low latency
 Very low jitter

Industrial communication relies mainly on the following technologies:

Industrial Ethernet

1. Ethernet protocols adapted for real-time control and automation.
2. Used in manufacturing processes dealing with clock synchronization and
performance.

Fieldbus

1. A communication standard for a Local Area Network (LAN) of field devices
for industrial automation.
2. Used in manufacturing processes dealing with periodic I/O data transfer.

Industrial Ethernet protocols

1. ModBus-TCP
2. EtherCAT
3. EtherNet/IP
4. Profinet
5. TSN

ModBus-TCP

A standard communication protocol used in industry, developed by
Modicon Inc. (now Schneider Electric). It uses TCP/IP and Ethernet for data
transmission between two compatible devices. The communicating system
includes several devices:

Client-Server devices linked to a TCP/IP network

– bridge or router or gateway

-network to grant links between client-serve

Features of ModBus-TCP

The protocol defines 2 units in the data frame: PDU (Protocol Data Unit) and
ADU (Application Data Unit)
The ADU is identified by a header called MBAP
Features of ModBus-TCP (contd.)

Connection-oriented protocol following the Client-Server architecture.

Masters are the clients, whereas slaves are denoted as servers.

EtherCAT

EtherCAT (Ethernet for Control Automation Technology) was developed by
Beckhoff Automation and is maintained by the ETG (EtherCAT Technology Group).

Based on IEC 61158 & IEC 61784 (international standards).

Master-slave architecture utilizing the standard IEEE 802.3.

-sensitive scenario (due to high-speed of the system)

Master and slave exchange data as PDO (process data objects)/telegram.

ay for the telegrams.

Data exchange provides a low cycle time and low jitter for better
synchronization.

en the individual participants.
(Using optical waveguides: up to 20 km).
– tree, star, line, ring, or hybrid.

EtherNet/IP

It is based on the standard Internet Protocol suite and IEEE 802.3.

-based, object oriented procedure


intended for automation applications.

which is offered by Ethernet.

Communication Type

Explicit

-purpose transmission path between devices.

-critical information.

-purpose transmission paths between a master

and several clients.

-time I/O data

Based on active star topology.

-up, operation, maintenance, and expansion.

per packet.

(Programmable Logic Controllers).


Fieldbus

-RTU

-Link

Profibus (Process field bus)

It is based on the standard IEC 61158. It originated in Germany in the late
1980s and was then adopted by Siemens. It is a fieldbus technology that supports
several protocols. It supports cyclic as well as acyclic data transmission,
isochronous messaging, and alarm handling.

Variants of Profibus

iants:

It supports 32 devices at a time (up to 1900 m, or up to 10 km with 4
repeaters).

environment).

It defines 2 layers:

ta link - accomplished over a FDL (Field bus Data Link).

the system.
and can
support branches.

P supports data as well as power transmission.


Interbus

spatially arranged I/O modules which connects to several sensors & actuators.

Application areas: sensing-actuating applications, machine and system
production, and process engineering.

Features of Interbus

rs, and the


last subscriber closes the ring.)

communication.

ted shift register ring with master


the starting-ending point, while slave as a part of it.

IIoT Networking

IIoT Network Protocols -

Message Queuing Telemetry Transport (MQTT) - It was introduced by IBM and
submitted to OASIS for standardization in 2013. MQTT is based on the
publish/subscribe concept, which suits IoT well because IoT devices typically
sense data and publish it. The published data is buffered at an agent (the MQTT
server, or broker), and the subscribing clients receive it from there. This
architecture is reliable, lightweight, and cost-effective. Quality of service
(QoS) is very important in MQTT, and it covers two transactions. The first
transaction is between the publishing client and the MQTT server. The second
transaction is between the MQTT server and the subscribing client. MQTT QoS
levels are as follows:

QoS 0 - Also known as "at most once" delivery. Best-effort, unacknowledged
data service. The publisher transmits the message once to the server, and the
server transmits it once to the subscriber. There is no retry.

QoS 1 - Also known as "at least once" delivery. The sender retries until it
receives an acknowledgment of the message, so duplicates are possible.

QoS 2 - Also known as "exactly once" delivery. The exchange is repeated as
needed, but the message is delivered exactly once, without loss or duplication.
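
The difference between the three QoS levels shows up when the link loses
packets. The following is a simplified simulation, not the MQTT wire protocol:
the function deliver and the scripted list of outcomes are hypothetical, and
real QoS 2 uses a four-step handshake that is reduced here to duplicate
filtering by packet identifier.

```python
def deliver(qos, outcomes):
    """Return the copies of one message handed to the subscriber.

    outcomes is a scripted list of (message_arrived, ack_arrived) pairs,
    one pair per transmission attempt by the sender.
    """
    received = []
    seen_ids = set()
    packet_id = 1
    for message_arrived, ack_arrived in outcomes:
        if message_arrived:
            if qos == 2:
                # The receiver remembers the packet id and drops duplicates.
                if packet_id not in seen_ids:
                    seen_ids.add(packet_id)
                    received.append("reading")
            else:
                received.append("reading")
        if qos == 0:
            break          # fire and forget: never retry
        if message_arrived and ack_arrived:
            break          # sender saw the acknowledgment and stops retrying
    return received


# Attempt 1 is lost, attempt 2 arrives but its ack is lost, attempt 3 succeeds.
script = [(False, False), (True, False), (True, True)]
copies = {qos: len(deliver(qos, script)) for qos in (0, 1, 2)}
```

Under this script QoS 0 delivers nothing, QoS 1 delivers the message twice,
and QoS 2 delivers it once.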

Constrained Application Protocol (CoAP) - It is an application-layer protocol
that acts as a session protocol. It helps different APIs and different
applications run in IoT. CoAP defines four types of messages:

CON: Confirmable message - The recipient must explicitly either
acknowledge or reject the message.

NON: Non-confirmable message - The recipient does not acknowledge the message;
it sends a reset message if it cannot process the message.

ACK: Acknowledgment - Confirms that a confirmable message was received.

RST: Reset - Indicates that a message was received but could not be processed.
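
The message type travels in the first byte of the fixed 4-byte CoAP header.
The sketch below packs and reads that header following the layout in RFC 7252
(2-bit version, 2-bit type, 4-bit token length, 8-bit code, 16-bit message
ID); the helper names coap_header and message_type are illustrative.

```python
import struct

CON, NON, ACK, RST = 0, 1, 2, 3   # type codes from RFC 7252


def coap_header(msg_type, code, message_id, token=b""):
    """Build the fixed CoAP header followed by the optional token."""
    if len(token) > 8:
        raise ValueError("a CoAP token is at most 8 bytes long")
    first_byte = (1 << 6) | (msg_type << 4) | len(token)   # version is 1
    return struct.pack("!BBH", first_byte, code, message_id) + token


def message_type(packet):
    """Extract the 2-bit type field from a CoAP packet."""
    return (packet[0] >> 4) & 0x03


request = coap_header(CON, 0x01, 0x1234)   # confirmable GET (code 0.01)
reply = coap_header(ACK, 0x45, 0x1234)     # acknowledgment, 2.05 Content
```

The acknowledgment reuses the message ID of the confirmable request, which is
how the sender matches the two.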

Extensible Messaging and Presence Protocol (XMPP) - Like MQTT, it supports the
publish/subscribe model. XMPP is based on XML and uses TLS for transport layer
security. The model is decentralized; that means there is no requirement for
a single centralized server. It has several advantages: it supports
interoperability between heterogeneous networks, heterogeneous devices, and
heterogeneous agents, and it supports extensibility, for example privacy
lists, multi-user chat, publish/subscribe chat, and status notifications.

Advanced Message Queuing Protocol (AMQP) - Like MQTT and XMPP, it is based on
the publish/subscribe model. It supports two types of communication,
point-to-point and multipoint, and is typically used in financial applications
such as digital finance. It uses a token-based mechanism for flow control,
which ensures that there is no buffer overflow at the receiving end.
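
Token-based flow control can be sketched as follows. This is a toy model under
stated assumptions, not the AMQP implementation: the classes Receiver and
Sender and their methods are hypothetical. The receiver grants one token per
free buffer slot, and the sender may transmit only while it holds tokens.

```python
from collections import deque


class Receiver:
    def __init__(self, buffer_size):
        self.buffer = deque()
        self.buffer_size = buffer_size

    def grant_tokens(self):
        # One token for every free slot in the receive buffer.
        return self.buffer_size - len(self.buffer)

    def accept(self, message):
        assert len(self.buffer) < self.buffer_size, "buffer overflow"
        self.buffer.append(message)

    def consume(self, count):
        for _ in range(min(count, len(self.buffer))):
            self.buffer.popleft()


class Sender:
    def __init__(self, messages):
        self.pending = deque(messages)
        self.tokens = 0

    def send_to(self, receiver):
        sent = 0
        while self.tokens > 0 and self.pending:
            receiver.accept(self.pending.popleft())
            self.tokens -= 1   # each message spends one token
            sent += 1
        return sent


receiver = Receiver(buffer_size=3)
sender = Sender(range(10))
sender.tokens = receiver.grant_tokens()
first_burst = sender.send_to(receiver)    # limited by the 3 tokens
stalled = sender.send_to(receiver)        # no tokens left, nothing is sent
receiver.consume(2)                       # the application frees two slots
sender.tokens = receiver.grant_tokens()
second_burst = sender.send_to(receiver)
```

The sender never outruns the receiver's buffer, because it cannot send more
messages than the tokens it was granted.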

IEEE 1888 - This is an energy-efficient network control protocol, which
defines a generalized data exchange protocol between the network components
over IPv4 or IPv6. It uses universal resource identifiers (URIs) and supports
different applications for environmental monitoring, energy saving, central
management systems, and so on.

DDS RTPS - DDS stands for Data Distribution Service and RTPS for Real-Time
Publish-Subscribe. It is attractive for use in IoT networks because it
supports the publish/subscribe framework on top of the UDP transport layer
protocol. It is a data-centric binary protocol, and the data in this context are
organized into "topics": users subscribe to a particular topic of interest, and
the listeners listen to it. A single topic may have multiple speakers with
different priorities. The protocol supports a list of QoS policies for data
distribution, such as data persistence, delivery deadline, reliability, and
freshness of data. Applications such as military, industrial, and healthcare
monitoring find this protocol useful.
UNIT-III
IIOT ANALYTICS
Big Data Analytics and Software Defined Networks, Machine Learning and Data Science, Julia
Programming, Data Management with Hadoop.

Big Data Analytics

When an IoT network is small, its data is just a curiosity, and it is even
useful if handled correctly. However, given time, as more and more devices are
added to IoT networks, the data generated by these systems becomes overwhelming.
The real value of IoT is not just in connecting things but rather in the data
produced by those things, the new services you can enable via those connected
things, and the business insights that the data can reveal.

However, to be useful, the data needs to be handled in a way that is organized
and controlled. Thus, a new approach to data analytics is needed for the
Internet of Things.
Introduction to Data Analytics for IOT

In the world of IoT, the creation of massive amounts of data from sensors is
common and one of the biggest challenges—not only from a transport
perspective but also from a data management standpoint.

Analysing large amount of data in the most efficient manner possible falls
under the umbrella of data analytics.

Data analytics must be able to offer actionable insights and knowledge from
data, no matter the amount or style, in a timely manner, or the full benefits of
IoT cannot be realized.

Example:

Modern jet engines are fitted with thousands of sensors that generate a
whopping 10 GB of data per second; a single engine may be equipped with around
5000 sensors. Therefore, a twin-engine commercial aircraft with these engines
operating on average 8 hours a day will generate over 500 TB of data daily, and
this is just the data from the engines! Aircraft today have thousands of other
sensors connected to the airframe and other systems.

In fact, a single wing of a modern jumbo jet is equipped with 10,000 sensors.

The potential for a petabyte (PB) of data per day per commercial airplane is
not farfetched, and this is just for one airplane. Across the world, there are
approximately 100,000 commercial flights per day. The amount of IoT data
coming just from the commercial airline business is overwhelming.
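
The "over 500 TB" figure for the engines can be checked with a few lines of
arithmetic. The sketch uses only the numbers quoted in the example and assumes
decimal units (1 TB = 1000 GB); the variable names are illustrative.

```python
GB_PER_SECOND_PER_ENGINE = 10
ENGINES = 2
HOURS_PER_DAY = 8

seconds_per_day = HOURS_PER_DAY * 3600            # 28,800 s of engine operation
daily_gb = GB_PER_SECOND_PER_ENGINE * ENGINES * seconds_per_day
daily_tb = daily_gb / 1000                        # decimal terabytes

print(daily_tb)
```

Under these assumptions the two engines alone produce 576 TB per day, which
agrees with the figure quoted in the example.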
IIoT Analytics: Data Science

 Big Data Analytics


 Volume, velocity, variability, veracity, variety
 Industrial automation, system health monitoring, predictive
maintenance, remote monitoring
 Artificial Intelligence
 Deep Learning (DL)
 Machine Learning (ML)

Instead of physics-based models, ML and DL enable a data-driven system
modelling approach.

Key concepts related to data

➢ Not all data is the same; it can be categorized and thus analysed in
different ways.

➢ Depending on how data is categorized, various data analytics tools and
processing methods can be applied.

➢ Two important categorizations from an IoT perspective are whether the
data is structured or unstructured and whether it is in motion or at rest.

Structured versus Unstructured Data

Structured data and unstructured data are important classifications as they
typically require different toolsets from a data analytics perspective.

Structured data means that the data follows a model or schema that defines
how the data is represented or organized, meaning it fits well with a traditional
relational database management system (RDBMS).

Simply put, structured data typically refers to highly organized, stored
information that is efficiently and easily searchable.

IoT sensor data often uses structured values, such as temperature, pressure,
humidity, and so on, which are all sent in a known format. Structured data is
easily formatted, stored, queried, and processed; for these reasons, it has been
the core type of data used for making business decisions.

Because of the highly organized format of structured data, a wide array of
data analytics tools are readily available for processing this type of data.

From custom scripts to commercial software like Microsoft Excel and Tableau,
most people are familiar and comfortable with working with structured data.

Unstructured data lacks a logical schema for understanding and decoding the
data through traditional programming means.

Examples of this data type include text, speech, images, and video. As a general
rule, any data that does not fit neatly into a predefined data model is classified
as unstructured data.

According to some estimates, around 80% of a business's data is
unstructured. Because of this fact, data analytics methods that can be applied
to unstructured data, such as cognitive computing and machine learning, are
deservedly garnering a lot of attention.

With machine learning applications, such as natural language processing (NLP),
you can decode speech. With image/facial recognition applications, you can
extract critical information from still images and video. The handling of
unstructured IoT data employing machine learning techniques is covered in
more depth later in this chapter.

Semi-structured data is sometimes included along with structured and
unstructured data. As you can probably guess, semi-structured data is a hybrid
of structured and unstructured data and shares characteristics of both. While
not relational, semi-structured data contains a certain schema and consistency.
Email is a good example of semi-structured data as the fields are well defined
but the content contained in the body field and attachments is unstructured.

Smart objects in IoT networks generate both structured and unstructured data.
Structured data is more easily managed and processed due to its well-defined
organization.

On the other hand, unstructured data can be harder to deal with and typically
requires very different analytics tools for processing the data.

Data in Motion versus Data at Rest

As in most networks, data in IoT networks is either in transit (“data in motion”)
or being held or stored (“data at rest”).

Examples of data in motion include traditional client/server exchanges, such as
web browsing, file transfers, and email.

Data saved to a hard drive, storage array, or USB drive is data at rest.

➢ From an IoT perspective, the data from smart objects is considered data
in motion as it passes through the network en route to its final destination.

➢ This is often processed at the edge, using fog computing. When data is
processed at the edge, it may be filtered and deleted or forwarded on for further
processing and possible storage at a fog node or in the data center.

➢ Data does not come to rest at the edge.

➢ When data arrives at the data center, it is possible to process it in
real-time, just like at the edge, while it is still in motion.

➢ Tools with this sort of capability, such as Spark, Storm, and Flink, are
relatively nascent compared to the tools for analysing stored data.

Data at rest in IoT networks can be typically found in IoT brokers or in some
sort of storage array at the data center. Myriad tools, especially tools for
structured data in relational databases, are available from a data analytics
perspective.

The best known of these tools is Hadoop. Hadoop not only helps with data
processing but also data storage.

IoT Data Analytics Overview

The true importance of IoT data from smart objects is realized only when the
analysis of the data leads to actionable business intelligence and insights.

Data analysis is typically broken down by the types of results that are
produced.

Descriptive: Descriptive data analysis tells you what is happening, either now or
in the past.

Diagnostic: When you are interested in the “why,” diagnostic data analysis can
provide the answer.

Predictive: Predictive analysis aims to foretell problems or issues before they
occur.

Prescriptive: Prescriptive analysis goes a step beyond predictive and
recommends solutions for upcoming problems.

Both predictive and prescriptive analyses are more resource intensive and
increase complexity, but the value they provide is much greater than the value
from descriptive and diagnostic analysis.

Figure 7-4 illustrates the four data analysis types and how they rank as
complexity and value increase. You can see that descriptive analysis is the least
complex and at the same time offers the least value. On the other end,
prescriptive analysis provides the most value but is the most complex to
implement.

Most data analysis in the IoT space relies on descriptive and diagnostic
analysis, but a shift toward predictive and prescriptive analysis is
understandably occurring for most businesses and organizations.

IoT Data Analytics Challenges

IoT data places two specific challenges on a relational database:

Scaling problems: Due to the large number of smart objects in most IoT
networks that continually send data, relational databases can grow incredibly
large very quickly. This can result in performance issues that can be costly to
resolve, often requiring more hardware and architecture changes.

Volatility of data: With relational databases, it is critical that the schema be
designed correctly from the beginning. Changing it later can slow or stop the
database from operating. Due to the lack of flexibility, revisions to the schema
must be kept at a minimum. IoT data, however, is volatile in the sense that the
data model is likely to change and evolve over time.
Some other challenges:

• IoT also brings challenges with the live streaming nature of its data and
with managing data at the network level. Streaming data, which is generated as
smart objects transmit data, is challenging because it is usually of a very high
volume, and it is valuable only if it is possible to analyse and respond to it in
real-time.

• Real-time analysis of streaming data allows you to detect patterns or
anomalies that could indicate a problem or a situation that needs some kind of
immediate response. To have a chance of affecting the outcome of this problem,
you naturally must be able to filter and analyse the data while it is occurring,
as close to the edge as possible.
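
One simple way to analyse a stream while it is still in motion is to compare
each new reading with a sliding window of recent readings. The sketch below is
only an illustration of that idea: the function name detect_anomalies, the
window size, the three-standard-deviation threshold, and the sample readings
are all assumptions, and production systems would normally rely on a
streaming analytics platform.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(stream, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent average."""
    recent = deque(maxlen=window)
    for index, value in enumerate(stream):
        if len(recent) == window:
            mu, sigma = mean(recent), pstdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield index, value
                continue   # keep the outlier out of the baseline window
        recent.append(value)


readings = [20.0, 20.5, 19.8, 20.2, 20.1, 20.3, 35.0, 20.0, 19.9]
alerts = list(detect_anomalies(readings))
```

Only the window of recent readings is kept in memory, so the check can run on
an edge or fog node without storing the whole stream.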

• The market for analysing streaming data in real-time is growing fast.
Major cloud analytics providers, such as Google, Microsoft, and IBM, have
streaming analytics offerings, and various other applications can be used
in-house.

• Another challenge that IoT brings to analytics is in the area of network
data, which is referred to as network analytics. With the large numbers of smart
objects in IoT networks that are communicating and streaming data, it can be
challenging to ensure that these data flows are effectively managed, monitored,
and secure. Network analytics tools such as Flexible NetFlow and IPFIX provide
the capability to detect irregular patterns or other problems in the flow of IoT
data through a network.

Software Defined Networking in IoT

Software-defined networking (SDN) in the Internet of Things (IoT) is an
architecture that makes networks more adaptable and flexible. By abstracting
the network layers, SDN changes how the network is controlled and lets
enterprises and service providers adapt quickly to evolving business demands.
The approach aims to simplify network management and give organizations the
agility they need.

SDN's abstractions let network administrators control the whole network
through high-level policies, without having to deal with the details of
low-level configuration. SDN is therefore well suited to the heterogeneous
nature of IoT and to its application-specific demands.
Types of Software Defined Networking

Open SDN: Open protocols control the virtual and physical devices that direct
the flow of data packets.

API SDN: Programming interfaces, known as southbound APIs, regulate the
exchange of data between devices and so manage the data flow.

Overlay Model SDN: A virtual network layer is built above the existing
hardware infrastructure, with data tunnels and channels to data centers. The
model allocates bandwidth within each channel and assigns devices to their
designated channels.

Hybrid Model SDN: SDN and traditional networking are combined, so that the most
suitable protocol can be selected for each traffic type. Hybrid SDN is often
used as a phased implementation strategy for the transition to SDN.

Significance of Software Defined Networking in IoT

Software-Defined Networking (SDN) in the Internet of Things (IoT) is a
considerable improvement over traditional networking and delivers the following
benefits:

Enhanced control with speed and flexibility: SDN eliminates the need for
manual configuration of hardware devices from different vendors. Instead,
developers control network traffic by programming a software-based controller
that adheres to open standards. Network managers are free to select their
networking equipment and can communicate with multiple hardware devices using
a single protocol via a centralized controller.

Customizable network infrastructure: With SDN, administrators can centrally
design network services and quickly allocate virtual resources to modify the
network infrastructure. This allows them to prioritize applications that
demand increased availability and to optimize the flow of data across the
network according to specific requirements.

Robust security: SDN in IoT offers visibility across the entire network and
so gives a holistic view of potential security threats. As the number of
intelligent devices connecting to the Internet grows, this gives SDN a security
advantage over traditional networking. Operators can create distinct zones for
devices requiring different security levels or promptly isolate compromised
devices to prevent the spread of infections throughout the network.

With SDN in IoT, organizations gain greater control, customization, and
security within their networks, which leads to better performance and easier
management of IoT deployments.

Risks of Software Defined Networking in IoT

SDN improves agility and control and simplifies management and configuration,
which makes a strong case for adoption. However, it also carries risks. The
main concern is the centralized controller, which, if compromised, can act as
a single point of failure. This vulnerability can be mitigated by implementing
controller redundancy throughout the network, with automatic fail-over
capabilities. This adds cost, but it supports business continuity in the same
way as adding redundancy to other critical network components.

Distinguishing Software Defined Networking in IoT from Traditional Networking

The difference between Software-Defined Networking (SDN) and traditional
networking lies primarily in their underlying infrastructure. Traditional
networking relies on hardware components, whereas SDN operates on a software
basis. This makes SDN considerably more flexible than traditional networking.
Through a software-driven control plane, administrators can oversee the
network, modify configuration settings, allocate resources, and increase
network capacity from a centralized user interface, all without deploying
additional hardware. SDN and traditional networking also differ in terms of
security. SDN, being software defined, offers better security because of its
greater visibility and its ability to define secure pathways. However, the
centralized controller must be safeguarded, as it is a potential vulnerability
and single point of failure that could compromise the overall security of an
SDN network.

IIoT Analytics: Machine Learning

Machine Learning

Machine learning is a subset of Artificial Intelligence which enables machines
to make decisions based on their experience rather than being explicitly
programmed.

Machine learning Overview

Machine learning is part of a larger set of technologies commonly grouped
under the term artificial intelligence (AI).

AI includes any technology that allows a computing system to mimic human
intelligence using any technique, from very advanced logic to basic
“if-then-else” decision loops. Any computer that uses rules to make decisions
belongs to this realm.

A typical example is a dictation program that runs on a computer. The program
is configured to recognize the audio pattern of each word in a dictionary, but it
does not know your voice’s specifics: your accent, tone, speed, and so on.

You need to record a set of predetermined sentences to help the tool match
well-known words to the sounds you make when you say the words. This
process is called machine learning.

ML is concerned with any process where the computer needs to receive a set of
data that is processed to help perform a task with more efficiency. ML is a vast
field but can be broadly divided into two main categories, supervised and
unsupervised learning; reinforcement learning is often listed as a third.

Types of Machine Learning Algorithms

1. Unsupervised Learning

2. Supervised Learning

3. Reinforcement Learning
Unsupervised Learning

This machine learning technique is used to identify groups of similar data, a
task known as clustering. The data is segregated on an unlabeled dataset,
based on the inner structure of the data, without looking for a specific
outcome.
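
Clustering can be illustrated with k-means, one common clustering algorithm
(the notes do not name a specific one). The sketch below works on
one-dimensional readings in plain Python; the function name kmeans_1d, the
starting centroids, and the sample values are illustrative assumptions.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Group unlabeled numbers around the nearest of the given centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            # Assign each value to its closest centroid.
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Move each centroid to the mean of the values assigned to it.
        centroids = [sum(group) / len(group) if group else centroids[i]
                     for i, group in enumerate(clusters)]
    return centroids, clusters


vibration = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]   # unlabeled readings
centers, groups = kmeans_1d(vibration, centroids=[0.0, 5.0])
```

No labels are supplied; the two groups emerge only from how close the readings
are to each other.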

Supervised learning

In supervised learning, the machine is trained with input for which there is a
known correct answer. For example, suppose that you are training a system to
recognize when there is a human in a mine tunnel.

A sensor equipped with a basic camera can capture shapes and return them to
a computing system that is responsible for determining whether the shape is a
human or something else (such as a vehicle, a pile of ore, a rock, a piece of
wood, and so on.)
With supervised learning techniques, hundreds or thousands of images are fed
into the machine, and each image is labeled (human or nonhuman in this case).
This is called the training set. An algorithm is used to determine common
parameters and common differences between the images.

The comparison is usually done at the scale of the entire image, or pixel by
pixel. Images are resized to have the same characteristics (resolution, color
depth, position of the central figure, and so on), and each point is analyzed.
Human images have certain types of shapes and pixels in certain locations.

Each new image is compared to the set of known “good images,” and a deviation
is calculated to determine how different the new image is from the average
human image and, therefore, the probability that what is shown is a human
figure. This process is called classification.

After training, the machine should be able to recognize human shapes. Before
real field deployments, the machine is usually tested with unlabelled pictures—
this is called the validation or the test set, depending on the ML system used—
to verify that the recognition level is at acceptable thresholds. If the machine
does not reach the level of success expected, more training is needed.
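
The train-then-test workflow described above can be sketched with a
nearest-centroid classifier, a deliberately simple stand-in for the image
comparison in the mine tunnel example. The two features per shape (height to
width ratio and filled fraction), all the numbers, and the function names are
hypothetical.

```python
def train(samples):
    """Average the feature vectors of each label in the training set."""
    sums, counts = {}, {}
    for features, label in samples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            total[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(model, features):
    """Return the label whose average shape deviates least from the input."""
    def deviation(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(model, key=lambda label: deviation(model[label]))


training_set = [((2.8, 0.45), "human"), ((3.1, 0.50), "human"),
                ((2.9, 0.40), "human"), ((0.9, 0.80), "nonhuman"),
                ((1.2, 0.90), "nonhuman"), ((0.7, 0.85), "nonhuman")]
test_set = [((3.0, 0.48), "human"), ((1.0, 0.75), "nonhuman")]

model = train(training_set)
correct = sum(classify(model, f) == label for f, label in test_set)
accuracy = correct / len(test_set)
```

The test set is kept separate from the training set, so the accuracy measures
how well the trained model recognizes shapes it has not seen before.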

Reinforcement Learning Algorithm

It is a machine learning technique which enables a machine to improve its
performance by automatically learning the ideal behavior for a specific
environment.
Data science:

Julia programming

Julia provides an unobtrusive yet powerful and dynamic type system.

With the help of multiple dispatch, the user can define function behavior across
many combinations of arguments. It has a powerful shell that lets Julia manage
other processes easily. The user can call C functions without any wrappers or
special APIs. Julia provides efficient support for Unicode.

It also provides Lisp-like macros as well as other metaprogramming
facilities. It provides lightweight green threading, i.e., coroutines.

It is well-suited for parallelism and distributed computation.

Code written in Julia is fast because there is no need to vectorize code
for performance.

It can efficiently interface with other programming languages such as Python,
R, and Java. For example, it can interface with Python using PyCall, with R
using RCall, and with Java using JavaCall.

• Open source
• Distributed computation and parallelism possible
• Efficient support for Unicode
• Calls C functions directly

Basics of Julia programming

println() is used to print

Variables can be assigned without defining the type

Basic math

Assigning strings
Use of the $ sign for string interpolation

String concatenation

Data structures
1. Tuples
2. Dictionaries
3. Arrays
Data M anagement

... archived or disposed of in a safe and secure manner during and after the
conclusion of a research project

... handled electronically as well as through non-electronic means

... most industrial data ...

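Hadoop

Hadoop is a framework for the distributed storage and processing of large
datasets across large clusters of computers.

It is an open-source implementation of the Google File System (GFS) and
MapReduce.

Its MapReduce and HDFS components were originally derived from Google's
MapReduce and GFS, respectively.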

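Building Blocks of Hadoop

Hadoop Common: the module containing the utilities that support the other
Hadoop components

HDFS: the Hadoop Distributed File System, which stores data across the nodes
of the cluster

MapReduce: a framework for writing applications that process large datasets
in parallel

YARN: the next-generation MapReduce, which manages the resources of a
Hadoop cluster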

HDFS Architecture and Components

HDFS follows the master-slave architecture and it has the following elements.

Namenode
The namenode is software that runs on commodity hardware with the GNU/Linux
operating system. The system running the namenode acts as the master server
and performs the following tasks:

Manages the file system namespace.

Regulates clients' access to files.

Executes file system operations such as renaming, closing, and opening files
and directories.

Datanode
The datanode is software that runs on commodity hardware with the GNU/Linux
operating system. There is a datanode for every node (commodity
hardware/system) in a cluster. These nodes manage the data storage of their
system.

Datanodes perform read-write operations on the file system, as per client
requests.

They also perform operations such as block creation, deletion, and
replication according to the instructions of the namenode.

Block
User data is stored in the files of HDFS. A file is divided into one or more
segments, which are stored on individual datanodes. These file segments are
called blocks. In other words, a block is the minimum amount of data that
HDFS can read or write. The default block size is 64 MB in Hadoop 1.x (128 MB
from Hadoop 2.x onward), and it can be changed in the HDFS configuration.
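As a worked illustration of how a file maps onto blocks, the sketch below computes the block sizes for a file of a given length. It only does the arithmetic and does not talk to HDFS; the function name is hypothetical and the 200 MB file is a made-up example using the 64 MB block size quoted above.

```python
MB = 1024 * 1024


def split_into_blocks(file_size_bytes, block_size_bytes=64 * MB):
    """Return the sizes (in bytes) of the blocks a file would be divided into."""
    full_blocks, remainder = divmod(file_size_bytes, block_size_bytes)
    sizes = [block_size_bytes] * full_blocks
    if remainder:
        # The last block holds only the leftover bytes
        sizes.append(remainder)
    return sizes


blocks = split_into_blocks(200 * MB)
print(len(blocks), [size // MB for size in blocks])
```

A 200 MB file therefore occupies three full 64 MB blocks plus a final 8 MB block.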

Goals of HDFS
Fault detection and recovery − Since HDFS runs on a large number of
commodity hardware components, component failure is frequent. Therefore HDFS
should have mechanisms for quick and automatic fault detection and recovery.

Huge datasets − HDFS should have hundreds of nodes per cluster to manage
applications having huge datasets.

Hardware at data − A requested task can be done efficiently when the
computation takes place near the data. Especially where huge datasets are
involved, this reduces network traffic and increases throughput.

Inserting Data into HDFS


Assume we have data in a file called file.txt on the local system that should
be saved in the HDFS file system. Follow the steps given below to insert the
file into the Hadoop file system.

Step 1
You have to create an input directory.

$ $HADOOP_HOME/bin/hadoop fs -mkdir /user/input


Step 2
Transfer and store a data file from local systems to the Hadoop file
system using the put command.

$ $HADOOP_HOME/bin/hadoop fs -put /home/file.txt /user/input


Step 3
You can verify the file using ls command.

$ $HADOOP_HOME/bin/hadoop fs -ls /user/input
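The three steps above can also be scripted. The following sketch only builds and prints the same command lines; it does not execute them, so it runs without a Hadoop installation. The helper name `hadoop_fs_command` and the install path `/opt/hadoop` are assumptions made for the example.

```python
import os
import shlex


def hadoop_fs_command(operation, *paths, hadoop_home=None):
    """Build the argument list for '$HADOOP_HOME/bin/hadoop fs -<operation> <paths>'."""
    # Fall back to the environment variable, then to a hypothetical install path
    home = hadoop_home or os.environ.get("HADOOP_HOME", "/opt/hadoop")
    return [f"{home}/bin/hadoop", "fs", f"-{operation}", *paths]


steps = [
    hadoop_fs_command("mkdir", "/user/input", hadoop_home="/opt/hadoop"),
    hadoop_fs_command("put", "/home/file.txt", "/user/input", hadoop_home="/opt/hadoop"),
    hadoop_fs_command("ls", "/user/input", hadoop_home="/opt/hadoop"),
]

for argv in steps:
    print(shlex.join(argv))
```

On a machine where Hadoop is installed, each argument list could be passed to `subprocess.run(argv, check=True)` to execute the step.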


Retrieving Data from HDFS
Assume we have a file in HDFS called outfile.
Given below is a simple demonstration for retrieving the required file from
the Hadoop file system.

Step 1
Initially, view the data from HDFS using cat command.

$ $HADOOP_HOME/bin/hadoop fs -cat /user/output/outfile


Step 2
Get the file from HDFS to the local file system using get command.

$ $HADOOP_HOME/bin/hadoop fs -get /user/output/ /home/hadoop_tp/

Shutting Down the HDFS
You can shut down the HDFS by using the following command.

$ stop-dfs.sh

There are many more commands in "$HADOOP_HOME/bin/hadoop fs" than are
demonstrated here, although these basic operations will get you started.
Running $HADOOP_HOME/bin/hadoop fs with no additional arguments lists all
the commands that can be run with the FsShell system. Furthermore,
$HADOOP_HOME/bin/hadoop fs -help commandName displays a short usage summary
for the operation in question, if you are stuck.

In the table of operations, the following conventions are used for
parameters −

"<path>" means any file or directory name.
"<path>..." means one or more file or directory names.
"<file>" means any filename.
"<src>" and "<dest>" are path names in a directed operation.
"<localSrc>" and "<localDest>" are paths as above, but on the local file
system.
All other file and path names refer to objects inside HDFS.

MapReduce is a framework with which we can write applications that process
huge amounts of data, in parallel, on large clusters of commodity hardware in
a reliable manner.
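The MapReduce programming model can be illustrated with the classic word-count example. The sketch below runs the map, shuffle, and reduce phases in a single Python process; it is not the Hadoop API, and the function names and sample log lines are made up for the illustration. In a real cluster the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


counts = word_count(["pump on", "pump off", "valve on"])
print(counts)
```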

Hadoop Data Management

Hadoop is a powerful framework for data management, storage, and processing,
particularly suited for handling large-scale, distributed datasets. Data
management in Hadoop involves various aspects, including data ingestion,
storage, organization, processing, and retrieval. Here are key concepts and
components related to Hadoop data management:

Hadoop Distributed File System (HDFS):
HDFS is the primary storage system in Hadoop, designed to store vast amounts
of data across a cluster of commodity hardware. It divides large files into
blocks and replicates them across multiple nodes for fault tolerance.

Data Ingestion:
Data can be ingested into Hadoop using various methods, including batch
ingestion (e.g., using tools like Sqoop or Flume), real-time streaming (e.g.,
Kafka), and manual uploads.

Data Storage:
Hadoop stores data in a distributed, fault-tolerant manner across the HDFS
cluster. Data is divided into blocks, typically 128 MB or 256 MB in size, and
these blocks are replicated to ensure data durability.

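The effect of block replication on storage can be estimated with simple arithmetic. The sketch below assumes a replication factor of 3, which is the usual HDFS default but is not stated in these notes, and uses a hypothetical function name and a made-up 1 GB file.

```python
MB = 1024 * 1024
GB = 1024 * MB


def storage_footprint(file_size_bytes, block_size_bytes=128 * MB, replication=3):
    """Return (number of blocks, number of block replicas, raw bytes stored)."""
    # Ceiling division: a partly filled last block still counts as a block
    num_blocks = -(-file_size_bytes // block_size_bytes)
    num_replicas = num_blocks * replication
    raw_bytes = file_size_bytes * replication
    return num_blocks, num_replicas, raw_bytes


blocks, replicas, raw = storage_footprint(1 * GB)
print(blocks, replicas, raw // GB)
```

Under these assumptions a 1 GB file is stored as 8 blocks and 24 block replicas, consuming 3 GB of raw cluster capacity.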
Data Formats:
Hadoop supports various data formats, including text, Avro, Parquet, ORC, and
others. Choosing the right format can impact storage efficiency and query
performance.

Metadata Management:
Metadata about the stored data, such as file locations, block replication
levels, and file structure, is maintained by the NameNode in HDFS. It helps
track and manage data across the cluster.

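The kind of bookkeeping the NameNode performs can be pictured with a toy model. The class below is purely illustrative and is not how HDFS is implemented: the class name, the round-robin placement rule, and the datanode names are assumptions, and the replication value should not exceed the number of datanodes.

```python
class ToyNameNode:
    """Toy metadata store: which blocks make up a file, and where each block lives."""

    def __init__(self, datanodes, replication=2):
        self.datanodes = list(datanodes)
        self.replication = replication
        self.file_blocks = {}       # file path -> list of block ids
        self.block_locations = {}   # block id -> list of datanode names
        self._next_block_id = 0

    def add_file(self, path, num_blocks):
        block_ids = []
        for _ in range(num_blocks):
            block_id = self._next_block_id
            self._next_block_id += 1
            # Simplified round-robin placement of the replicas (an assumption)
            start = block_id % len(self.datanodes)
            replicas = [self.datanodes[(start + i) % len(self.datanodes)]
                        for i in range(self.replication)]
            self.block_locations[block_id] = replicas
            block_ids.append(block_id)
        self.file_blocks[path] = block_ids

    def locate(self, path):
        """Return (block id, datanodes holding it) for every block of a file."""
        return [(b, self.block_locations[b]) for b in self.file_blocks[path]]


namenode = ToyNameNode(["dn1", "dn2", "dn3"], replication=2)
namenode.add_file("/user/input/file.txt", num_blocks=2)
print(namenode.locate("/user/input/file.txt"))
```

A client that wants to read the file would first ask for this list and then fetch each block from one of the datanodes holding it.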
Data Organization:
Data can be organized into directories and subdirectories within HDFS. Proper
organization facilitates data discovery and management.

Data Processing:
Hadoop offers the MapReduce framework, which allows for distributed data
processing. Additionally, tools like Apache Spark, Hive, Pig, and Flink
provide higher-level abstractions for data processing and analytics.

Data Retrieval:
Users and applications can retrieve data from Hadoop using various query and
analysis tools. SQL-like languages (e.g., Hive’s HQL), scripting languages
(e.g., Pig Latin), and programming languages (e.g., Java, Python) can be used
for data retrieval.

Data Security:
Hadoop provides security features like authentication, authorization, and
encryption to protect data both in transit and at rest.

Data Lifecycle Management:
Managing the lifecycle of data includes data retention policies, archiving,
data purging, and data backup strategies.

Data Quality and Governance:
Ensuring data quality, integrity, and compliance with regulatory requirements
is essential. Data governance practices and tools help maintain data quality
and compliance.

Data Catalogs and Metadata Repositories:
Metadata about data assets, such as data lineage, data definitions, and data
ownership, can be stored in data catalogs and metadata repositories to aid in
data discovery and usage.

Data Compression and Optimization:
Data compression techniques are often employed to reduce storage requirements
and improve data processing performance. Tools like Apache ORC and Apache
Parquet use columnar storage and compression to optimize data storage and
querying.

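The storage saving from compression is easy to demonstrate on repetitive data such as sensor logs. This sketch uses Python's general-purpose zlib module on made-up readings; it is not the columnar encoding used by ORC or Parquet, only an illustration of the principle.

```python
import zlib

# Hypothetical sensor log: the same reading repeated many times
readings = ",".join(["temperature=21.5"] * 1000).encode("utf-8")

compressed = zlib.compress(readings)
restored = zlib.decompress(compressed)

ratio = len(readings) / len(compressed)
print(f"original: {len(readings)} bytes, compressed: {len(compressed)} bytes")
print(f"compression ratio: {ratio:.1f}x, lossless: {restored == readings}")
```

Highly repetitive industrial data compresses very well, and decompression restores the original bytes exactly.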
Data Backup and Disaster Recovery:
Implementing backup and disaster recovery strategies is critical to ensure
data availability and business continuity.

Data Retention Policies:
Defining and enforcing data retention policies helps manage data growth and
ensures that only relevant and necessary data is retained.

Data Privacy and Compliance:
Compliance with data privacy regulations, such as GDPR or HIPAA, is crucial
when managing sensitive or personal data within Hadoop clusters.
