BDCC Unit 1
UNIT-1
Big Data Science and Machine
Intelligence
School of Computer Science and Engineering
SethuMadhavi.R
[email protected]
AY: 2023-2024
INTRODUCTION TO BIG DATA
➢ What is Data?
The quantities, characters, or symbols on which operations are performed by a computer, which may be
stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical
recording media.
• “Extremely large data sets that may be analyzed computationally to reveal patterns, trends and
associations, especially relating to human behavior and interactions, are known as Big Data.”
EXAMPLES OF BIG DATA
1. Statistics show that 500+ terabytes of new data are ingested into the
databases of the social media site Facebook every day. This data is mainly
generated through photo and video uploads, message exchanges, posting of
comments, etc.
2. The social media site Twitter similarly generates hundreds of millions of
tweets every day.
TABULAR REPRESENTATION OF VARIOUS MEMORY SIZES
1 Kilobyte (KB) = 1024 Bytes
1 Megabyte (MB) = 1024 KB
1 Gigabyte (GB) = 1024 MB
1 Terabyte (TB) = 1024 GB
1 Petabyte (PB) = 1024 TB
1 Exabyte (EB) = 1024 PB
1 Zettabyte (ZB) = 1024 EB
1 Yottabyte (YB) = 1024 ZB
TYPES OF DIGITAL DATA
1.Structured
2.Unstructured
3.Semi-structured
STRUCTURED
1. Any data that can be stored, accessed and processed in a fixed format is
termed 'structured' data.
2. Over time, computer science has achieved great success in developing
techniques for working with such data (where the format is well known in
advance) and in deriving value from it.
3. However, nowadays we foresee issues as the size of such data grows to a
huge extent, with typical sizes reaching the range of multiple zettabytes.
EXAMPLES OF STRUCTURED DATA
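As a minimal illustration (a sketch assuming Python with the pandas library; the records themselves are hypothetical), structured data can be pictured as a fixed-format, relational-style table:

import pandas as pd

# Hypothetical employee records in a fixed, well-known format.
employees = pd.DataFrame({
    "emp_id": [101, 102, 103],
    "name": ["Seema R.", "Satish Mane", "Subrato Roy"],
    "dept": ["Sales", "Finance", "IT"],
    "salary": [52000, 61000, 58000],
})

# Fixed columns and known types make storage, access and querying easy.
print(employees[employees["salary"] > 55000])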
UNSTRUCTURED
▪ Any data with unknown form or structure is classified as unstructured data.
▪ In addition to its huge size, unstructured data poses multiple challenges in
terms of processing it to derive value.
▪ Nowadays organizations have a wealth of data available to them but, unfortunately,
they do not know how to derive value from it, since this data is in its raw,
unstructured form.
EXAMPLES OF UNSTRUCTURED DATA
Typical examples include text documents, emails, images, audio and video files,
and the heterogeneous results returned by a web search.
❑ SEMI-STRUCTURED
▪ Semi-structured data can contain both forms of data.
▪ Semi-structured data appears structured in form, but it is not actually
defined by, for example, a table definition in a relational DBMS. Personal
records stored in XML are a typical example:
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
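Because the tags travel with the data, such records can be processed without a predefined table schema. A minimal sketch, assuming Python's standard xml.etree.ElementTree module, parsing the records above:

import xml.etree.ElementTree as ET

# Wrap the records in a root element so the fragment parses as one document.
records = """<recs>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
</recs>"""

root = ET.fromstring(records)
for rec in root.findall("rec"):
    # Each tag carries its own schema information, unlike a fixed table.
    print(rec.findtext("name"), rec.findtext("sex"), rec.findtext("age"))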
BIG DATA ANALYTICS
▪ Big Data analytics is the process of collecting, organizing and analyzing
large sets of data (called Big Data) to discover patterns and other useful
information.
▪ Big Data analytics can help organizations to better understand the
information contained within the data and will also help identify the data
that is most important to the business and future business decisions.
Analysts working with Big Data typically want the knowledge that comes
from analyzing the data.
THE CHALLENGES:
▪ For most organizations, Big Data analysis is a challenge. Consider the sheer
volume of data and the different formats of the data (both structured and
unstructured) collected across the entire organization, and the many different
ways in which different types of data can be combined, contrasted and analyzed
to find patterns and other useful business information.
▪ The first challenge is in breaking down data silos to access all data an
organization stores in different places and often in different systems.
▪ A second challenge is in creating platforms that can pull in unstructured
data as easily as structured data.
▪ This massive volume of data is typically so large that it's difficult to process
using traditional database and software methods.
APPLICATIONS OF BIG DATA
HERE IS THE LIST OF TOP BIG DATA APPLICATIONS IN
TODAY’S WORLD:
• Big Data in Healthcare
• Big Data in E-commerce
▪ The fact that some of the world's biggest e-commerce companies, such as Amazon,
Flipkart and Alibaba, are now bound to Big Data is itself evidence of the level
of popularity Big Data and analytics have gained in recent times.
• Big Data in Media and Entertainment
▪ Viewers these days want content tailored to their own choices, content
relatively new compared with what they saw the previous time. Earlier,
companies broadcast advertisements randomly, without any kind of analysis.
▪ Customers are now the real heroes of the media and entertainment industry,
courtesy of Big Data and analytics.
6. Big Data in Finance
▪ The functioning of any financial organization depends heavily on its data, and
safeguarding that data is one of the toughest challenges any financial firm faces.
Data has been the second most important commodity for them after money.
▪ Digital banking and payments are two of the most trending buzzwords around, and
Big Data has been at the heart of both. Big Data now drives key areas of financial
firms such as fraud detection, risk analysis, algorithmic trading and customer contentment.
▪ This has brought much-needed fluency to their systems. Firms are now empowered to
focus more on providing better services to their customers rather than on security issues.
Big Data has enhanced the financial system with answers to its hardest challenges.
7. Big Data in Travel Industry
▪ While Big Data has spread like wildfire and various industries have been feeding
on it, the travel industry was a bit late to realize its worth. Better late than
never, though. A stress-free traveling experience is still a daydream for many.
▪ Big Data's arrival is a ray of hope that promises the departure of many
hindrances to a smooth traveling experience.
▪ From providing travelers with the best offers to making suggestions in real time,
Big Data is a perfect guide for any traveler, and it is gradually taking the
window seat in the travel industry.
8. Big Data in Telecom
▪ The telecom industry is the soul of every digital revolution that takes place
around the world. The ever-increasing popularity of smartphones has flooded the
telecom industry with massive amounts of data.
▪ This data is like a goldmine; telecom companies just need to know how to dig it
properly. Through Big Data and analytics, companies are able to provide customers
with smooth connectivity, eradicating the network barriers customers otherwise
have to deal with.
▪ With the help of Big Data and analytics, companies can now track the areas with
the lowest and highest network traffic and take the necessary steps to ensure
hassle-free connectivity.
▪ As in other industries, Big Data has helped the telecom industry understand its
customers well.
▪ Telecom companies now provide customers with offers that are as customized as possible.
▪ Big Data has been behind the data revolution we are currently experiencing.
Enabling Technologies for Big Data Computing
Data Science and Related Disciplines
THE EVOLUTION OF BIG DATA
DATA SCIENCE AND RELATED DISCIPLINES
3. Big Data refers to digital data whose volume, velocity and/or variety
require management that scales across horizontally coupled resources.
FUNCTIONAL COMPONENTS OF DATA SCIENCE SUPPORTED
BY SOME SOFTWARE LIBRARIES ON THE CLOUD IN 2016
(Figure: the functional components of data science include data visualization,
social network and graph analysis, programming skills, mathematics and
statistics, algorithms, and Hadoop-based distributed computing.)
• Expand immersive experiences
• Accelerate artificial intelligence (AI) automation
• Optimize technologist delivery
FROM HPC SYSTEMS AND CLUSTERS TO GRIDS, P2P
NETWORKS, CLOUDS, AND THE INTERNET OF THINGS
1. The general computing trend is to rely more and more on shared web
resources over the Internet.
2. The evolution is from two tracks of system development:
HPC versus HTC systems.
3. On the HPC side, supercomputers are gradually replaced by clusters of
cooperative computers out of a desire to share computing resources.
4. The cluster is often a collection of homogeneous computer nodes that are
physically connected in close range to each other.
HTC
1. On the HTC side, Peer-to-Peer (P2P) networks are formed for distributed
file sharing and content delivery applications.
2. P2P networks, cloud computing and web service platforms all place more
emphasis on HTC than on HPC applications.
3. In the big data era, we are facing a data deluge problem. Data comes from
IoT sensors, lab experiments, simulations, society archives and the web in
all scales and formats.
4. The Internet and WWW are used by billions of people every day. As a
result, large data centers or clouds must be designed to provide not only
big storage but also distributed computing power to satisfy the requests
of a large number of users simultaneously.
HPC: High-Performance Computing
HTC: High-Throughput Computing
P2P: Peer-to-Peer
MPP: Massively Parallel Processors
RFID: Radio Frequency Identification
CONVERGENCE OF TECHNOLOGIES
(Figure: cloud computing emerges from the convergence of hardware advances
(virtualization, multi-core chips), Internet technologies (SOA, Web 2.0, web
services), distributed computing (utility and grid computing) and systems
management (autonomic computing, datacenter automation).)
UTILITY COMPUTING
1. Utility computing is based on a business model, by which customers
receive computing resources from cloud or IoT service providers.
2. This raises technological challenges that span almost all aspects of
computer science and engineering.
3. For example, users may demand new network-efficient processors,
scalable memory and storage schemes, distributed OS, middleware for
machine virtualization, new programming models, effective resource
management and application program development.
4. These hardware and software advances are necessary to facilitate mobile
cloud computing in various IoT application domains.
CLOUD COMPUTING VERSUS ON-PREMISE
COMPUTING
1. On-premise computing differs from cloud computing mainly in resources
control and infrastructure management.
2. In the table below, we compare three cloud service models with the on-premise
computing paradigm.
3. We consider hardware and software resources in five types:
storage, servers, virtual machines, networking and application software.
4. In the case of on-premise computing at local hosts, all resources must be
acquired by the users except networking, which is shared between users and
the provider.
A typical division of resource control under each model (SaaS application
software examples: MS Office, Adobe):
Resource              On-Premise   IaaS       PaaS       SaaS
Storage               User         Provider   Provider   Provider
Servers               User         Provider   Provider   Provider
Virtual machines      User         User       Provider   Provider
Networking            Shared       Provider   Provider   Provider
Application software  User         User       User       Provider
TOWARDS A BIG DATA INDUSTRY
1. From 1960 to 1990, most data blocks were measured in MB, GB and TB.
2. Datacenters came into wide use from 1980 to 2010, with datasets easily
ranging from TB to PB or even EB.
3. After 2010, big data was introduced.
4. To process big data in the future, we expect to move from EB to ZB or YB.
The market size of the big data industry reached $34 billion in 2013.
INTERACTIVE SMACT TECHNOLOGIES
SMACT refers to five technologies used jointly: Social networks, Mobile
systems, big data Analytics, Cloud computing and the IoT (Internet of Things).
THE INTERNET OF THINGS
1. The term Internet of Things (IoT) is a physical concept. The size of the IoT
can be large or small, covering local regions or a wide range of physical
spaces.
2. IoTs are built in the physical world, even though they are logically
addressable in cyberspace.
3. The goal is to connect anything, at any time and any place, at low cost.
4. The dynamic connections will grow exponentially into a new universal
network of networks, called IoT. The IoT is strongly tied to specific
application domains.
INTERACTIONS AMONG SMACT SUBSYSTEMS
3. Machine Learning and Big Data Analytics: This is the foundation for using the
cloud's computing power to analyze large datasets scientifically or statistically.
Special computer programs are written to automatically learn to recognize complex
patterns and make intelligent decisions based on the data.
TECHNOLOGY FUSION TO MEET THE FUTURE
DEMAND
1. The joint use of clouds, IoT, mobile devices and social networks is crucial
to capture big data from all sources.
2. This integrated system enables fast, efficient and intelligent interactions
among humans, machines and any objects surrounding us.
3. The combined use of two or more technologies may demand additional effort
to integrate them for a common purpose.
4. All five SMACT technologies are deployed within the mobile Internet.
5. Social networks and big data analysis subsystems are built in the Internet
with fast database search and mobile access facilities.
6. High storage and processing power are provided by domain-specific
cloud services on dedicated platforms.
Social-Media, Mobile Networks and
Cloud Computing
Social Networks and Web Service
Sites
SOCIAL NETWORKS AND WEB SERVICE SITES
(Figure: monthly active user counts of popular social network and web service
sites, with figures such as 2.9 billion, 875 million, 574.4 million and 365
million monthly users.)
EXAMPLE- FACEBOOK PLATFORM ARCHITECTURE AND
SOCIAL SERVICES PROVIDED
Mobile Networks
(Figure: the evolution of mobile network generations, from TDMA-based voice and
SMS/multimedia services toward networks using massive multi-input multi-output
(MIMO) antennas that deliver higher reliability, lower latency and larger
application-domain versatility.)
Wireless Communications
1. RANs (Radio Access Networks) are used to access the mobile core
networks, which are connected to the Internet backbone and many
Intranets through mobile Internet edge networks.
2. Such an Internet access infrastructure is also known as the wireless
Internet or mobile Internet.
3. There are several classes of RANs known as WiFi, Bluetooth, WiMax and
Zigbee networks.
4. There are several short-range wireless networks, such as wireless local-
area network (WLAN), wireless home-area network (WHAN), personal
area network (PAN) and body-area network (BAN), etc.
THE INTERACTIONS OF VARIOUS RADIO-ACCESS NETWORKS
(RANS) WITH THE UNIFIED ALL-IP BASED MOBILE CORE
NETWORK, INTRANETS AND THE INTERNET.
BLUETOOTH DEVICES AND NETWORKS
1. Bluetooth is a short-range radio technology operating in the 2.45 GHz
industrial, scientific and medical (ISM) band.
2. It transmits omnidirectional (360°) signals carrying data or voice, with no
line-of-sight limitation.
3. It supports up to 8 devices (1 master and 7 slaves) in a PAN called a piconet.
4. Bluetooth devices have low cost and low power requirements.
5. The device offers a data rate of 1 Mbps in ad hoc networking, with a range of
10 cm to 10 meters.
6. It supports voice or data communication between phones, computers and
other wearable devices.
WIFI NETWORKS
1. The access point broadcasts its signal in a radius of less than 300 ft.
2. The closer a device is to the access point, the faster the data rate it
experiences.
3. The maximum speed is only possible within 50–175 ft. The peak data rates of
WiFi networks have improved from less than 11 Mbps to 300 Mbps.
4. The network uses OFDM (orthogonal frequency-division multiplexing) modulation
together with multiple-input multiple-output (MIMO) radios and antennas to
achieve its high speed.
5. WiFi enables the fastest WLANs in a mesh of access points or wireless routers.
Mobile Cloud Computing Infrastructure
MOBILE CLOUD COMPUTING INFRASTRUCTURE
1. Mobile cloud computing is a model for elastic augmentation of mobile
device capabilities via wireless access to cloud storage and computing
resources.
2. This is further enhanced by context-aware dynamic adaption to the
changes in the operating environment.
3. With the support of mobile cloud computing (MCC), a mobile user has a
new cloud option to execute its application.
4. The user attempts to offload the computation through WiFi, cellular
network or satellite to the distant clouds.
5. The cellphone itself may be unable to finish some compute-intensive tasks.
Instead, the data related to the computation task is offloaded to the remote
cloud, as sketched below.
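A minimal sketch of the offloading decision in Python (the cost model and all numbers are hypothetical, for illustration only):

# Decide whether a mobile device should run a task locally or offload it to a
# remote cloud: offload when transfer time plus remote execution time beats
# local execution time.
def should_offload(mega_instructions, device_mips, cloud_mips,
                   data_mb, bandwidth_mbps):
    local_time = mega_instructions / device_mips
    remote_time = (data_mb * 8) / bandwidth_mbps + mega_instructions / cloud_mips
    return remote_time < local_time

# e.g. a compute-intensive task: 5000 million instructions, 2 MB of input data.
print(should_offload(5000, device_mips=500, cloud_mips=20000,
                     data_mb=2, bandwidth_mbps=10))   # -> True: offload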
THE ARCHITECTURE OF A MOBILE CLOUD COMPUTING
ENVIRONMENT.
BIG DATA ACQUISITION AND ANALYTICS EVOLUTION
1. Data science, data mining, data analytics and knowledge discovery are
closely related terms.
2. These big data components form a big data value chain built from statistics,
machine learning, biology-inspired methods and kernel methods.
3. Statistics covers both linear and logistic regression.
4. Decision trees are typical machine learning tools.
5. The biology-inspired methods include artificial neural networks, genetic
algorithms and swarm intelligence. Finally, the kernel methods include the use
of support vector machines.
1. Compared with traditional datasets, big data generally includes masses of
unstructured data that need more real-time analysis.
2. In addition, big data also brings new opportunities for discovering new
values, helps us gain an in-depth understanding of hidden values, and incurs
new challenges, for example how to effectively organize and manage such data.
3. At present, big data has attracted considerable interest from industry,
academia and government agencies.
4. The rapid growth of big data mainly comes from people’s daily life,
especially related to the Internet, Web and cloud services
1. Big data will have a huge and increasing potential in creating values for
businesses and consumers.
2. The most critical aspect of big data analytics is big data value.
We divide the value chain of big data into four phases:
Data generation,
Data acquisition,
Data storage and
Data analysis.
1. If we take data as a raw material, then data generation and data acquisition
are exploitation processes, while data storage relies on clouds or data centers.
2. Data analysis is a production process that utilizes the raw material to
create new value.
3. The rapid growth of cloud computing and IoT also triggers the sharp
growth of data. Cloud computing provides safeguarding, access sites and
channels for data assets.
4. In the paradigm of IoT, sensors worldwide are collecting and transmitting
data to be stored and processed in the cloud.
BIG DATA GENERATION
1. The major data types include Internet data, sensory data, etc.
2. This is the first step of big data. Taking Internet data as an example, huge
amounts of data are generated in terms of search entries, Internet forum posts,
chat records and microblog messages.
3. Those data are closely related to people’s daily lives, and have similar
features of high value and low density. Such Internet data may be
valueless individually but, through the exploitation of accumulated big
data, useful information such as habits and hobbies of users can be
identified, and it is even possible to forecast users’ behaviors and
emotional moods
DATA QUALITY CONTROL, REPRESENTATION AND DATABASE
MODEL
The quality control of big data involves a circular cycle of four stages:
i) we must identify the important data quality attributes;
ii) assessing the data relies on the ability to measure the data quality level
(a sketch of this measurement step follows the list);
iii) then we must be able to analyze the data quality problems and their major
causes; and finally
iv) we need to improve the data quality by suggesting concrete actions to take.
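A minimal Python sketch of stage ii), measuring two commonly used quality attributes, completeness and validity, on hypothetical records (the attribute choice and the plausible age range are assumptions):

# Completeness: fraction of records with a value present.
# Validity: fraction of records whose value falls in a plausible range.
records = [
    {"name": "Seema R.", "age": 41},
    {"name": "Satish Mane", "age": None},   # missing value
    {"name": "Jeremiah J.", "age": 250},    # out-of-range value
]

complete = sum(r["age"] is not None for r in records) / len(records)
valid = sum(r["age"] is not None and 0 < r["age"] < 120
            for r in records) / len(records)
print(f"completeness={complete:.2f} validity={valid:.2f}")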
ATTRIBUTES FOR DATA QUALITY CONTROL, REPRESENTATION
AND DATABASE OPERATIONS.
BIG DATA ACQUISITION AND PRE-PROCESSING
Among the three ETL steps (extraction, transformation and loading), loading is
the most complex; it includes operations such as transformation, copying,
cleaning and standardization.
Data integration methods are often accompanied by stream-processing engines and
search engines. The knowledge-discovery process then proceeds in four steps
(a minimal sketch follows the list):
1) Data Selection: Select a target dataset or subset of data samples on which the
discovery is to be performed.
2) Data Transformation: Simplify the datasets by removing unwanted variables.
Then identify useful features that can represent the data, depending on the
goal or task.
3) Data Mining:
Searching for patterns of interest in a particular representational form or a
set of such representations as classification rules or trees, regression,
clustering, and so forth.
4) Evaluation and knowledge representation:
Evaluate the discovered knowledge patterns, and utilize visualization
techniques to present the knowledge vividly.
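A minimal end-to-end sketch of these four steps (assuming Python with NumPy and scikit-learn; the dataset is synthetic and the algorithm choices are illustrative, not prescribed by the text):

import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
raw = rng.normal(size=(500, 6))                   # synthetic raw dataset

subset = raw[:300]                                # 1) data selection
features = VarianceThreshold(0.5).fit_transform(subset)         # 2) transformation
labels = KMeans(n_clusters=3, n_init=10).fit_predict(features)  # 3) data mining
print("silhouette:", silhouette_score(features, labels))        # 4) evaluation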
BIG DATA ACQUISITION
1. As the second phase, data acquisition also includes data collection, data
transmission and data pre-processing.
2. During big data acquisition, once we collect the raw data, we utilize an
efficient transmission mechanism to send it to a proper storage
management system to support different analytical applications.
3. The collected datasets may sometimes include much redundant or
useless data, which unnecessarily increases storage space and affects
the subsequent data analysis.
SOME BIG DATA ACQUISITION SOURCES AND MAJOR
PREPROCESSING OPERATIONS
LOG FILES
1. Log files are record files automatically generated by the data source
system to record activities in designated file formats for subsequent analysis.
2. Log files are used in nearly all digital devices. For example, web servers
record in log files the number of clicks, click rates, visits, and other
properties of web users.
To capture activities of users at the websites, web servers mainly include the
following three log file formats:
public log file format (NCSA)
expanded log format (W3C)
IIS log format (Microsoft).
All three types of log files are in the ASCII text format.
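A minimal sketch of parsing one NCSA common-log-format line with Python's re module (the log line shown is a made-up example):

import re

line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /index.html HTTP/1.0" 200 2326')

pattern = (r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
           r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)')

match = re.match(pattern, line)
if match:
    # Each field (host, user, time, request, status, size) becomes structured
    # data, ready for click-rate or visit analysis.
    print(match.groupdict())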
SENSORS
Sensors are common data-collection tools: they measure physical quantities
(e.g. sound, vibration, temperature or pressure) and transform them into
readable digital signals for subsequent processing.
DATA MINING AND MACHINE LEARNING
Data mining and machine learning are closely related to each other. Data
mining is the computational process of discovering patterns in large datasets
involving methods at the intersection of artificial intelligence, machine
learning, statistics and database systems.
The overall goal of the data-mining process is to extract information from a
dataset and transform it into an understandable structure for further use.
Aside from the raw analysis step, it involves database and data management
aspects, data pre-processing, model and inference considerations,
interestingness metrics, complexity considerations, postprocessing of
discovered structures, visualization and online updating.
Machine learning explores the construction and study of algorithms that can
learn from and make predictions on data. Such algorithms operate by
building a model from example inputs in order to make data-driven
predictions or decisions, rather than following strictly static program
instructions.
Machine learning is closer to applications and end users.
It focuses on prediction, based on known properties learned from the training
data.
We divide machine learning techniques into three categories (a minimal example
of the first category follows this list):
i) supervised learning, such as regression models and decision trees;
ii) unsupervised learning, which includes clustering, anomaly detection, etc.;
iii) other learning, such as reinforcement learning, transfer learning, active
learning and deep learning.
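A minimal supervised-learning sketch (assuming Python with scikit-learn and its bundled iris dataset; the model choice is illustrative): a decision tree learns from labeled training data, then predicts labels for unseen samples.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit on known (input, label) pairs, then evaluate on held-out data.
model = DecisionTreeClassifier().fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))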
THE RELATIONSHIP OF DATA MINING AND MACHINE
LEARNING
DATA MINING TECHNIQUES ARE CLASSIFIED
INTO THREE CATEGORIES
Product layout optimization, customer trade analysis, product suggestions and
market structure analysis can be conducted with text analysis and website
mining technologies.
The quantity of mobile phones and tablet PCs first surpassed that of laptops
and PCs in 2011. Mobile phones and the sensor-based Internet of Things are
opening a new generation of innovative applications, calling for greater
capacity to support location sensing and people-oriented, context-aware operation.
NETWORK APPLICATIONS
The early Internet mainly provided email and webpage services.
Text analysis, data mining and webpage analysis technologies have been
applied to the mining of email content and building search engines.
Nowadays, most applications are web-based, regardless of their application
field and design goals.
Network data accounts for a major percentage of the global data volume.
The Web has become the common platform for interconnected pages, full of
various kinds of data, such as text, images, videos, pictures and interactive
content.
Different user groups may search for daily news and publish their opinions
with timely feedback.
BIG DATA IN COMMERCIAL APPLICATIONS
In the finance community, the application of big data has grown rapidly in
recent years. For example, China Merchants Bank uses data analysis to
recognize that activities such as "multi-times score accumulation" and
"score exchange in shops" are effective for attracting quality customers.
By building a customer-loss early-warning model, the bank can sell high-yield
financial products to the top 20% of customers by loss ratio so as to retain
them. As a result, the loss ratios of customers holding Gold Cards and
Sunflower Cards have been reduced by 15% and 7%, respectively.
EXAMPLE: ALIBABA
The credit loan service of Alibaba automatically analyzes and judges whether to
provide loans to business enterprises, based on acquired enterprise transaction
data and by virtue of big data technologies; manual intervention does not occur
anywhere in the process.
It is disclosed that, so far, Alibaba has lent more than RMB 30 billion Yuan,
with a bad-loan rate of only about 0.3%, which is a great deal lower than those
of other commercial banks.
HEALTHCARE AND MEDICAL APPLICATIONS
The healthcare industry is growing rapidly, and medical data is complex,
continuously and rapidly growing, and contains abundant and varied information
value.
Big data has unlimited potential for effectively storing, processing, querying
and analyzing medical data.
The application of medical big data will profoundly influence human health.
The IoT is revolutionizing the healthcare industry.
Sensors collect patient data, then microcontrollers process, analyze and
communicate the data over the wireless Internet. Microprocessors enable rich
graphical user interfaces. Healthcare clouds and gateways help analyze the
data with statistical accuracy.
COLLECTIVE INTELLIGENCE
With the rapid development of wireless communication and sensor
technologies, mobile phones and tablet computers have integrated more and
more sensors, with increasingly stronger computing and sensing capacities.
As a result, crowd sensing is taking center stage in mobile computing.
In crowd sensing, a large number of general users utilize mobile devices as
basic sensing units, coordinating with mobile networks to distribute sensing
tasks and to collect and utilize the sensed data.
The goal is to complete large-scale and complex social sensing tasks. In
crowd sensing, participants who complete complex sensing tasks do not
need to have professional skills.
Crowd sensing modes represented by Crowdsourcing have been successfully
applied to geotagged photography, positioning and navigation, urban road
traffic sensing, market forecasting, opinion mining and other labor-intensive
applications.
Crowdsourcing, a new approach to problem solving, takes a large number of
general users as the foundation and distributes tasks in a free and voluntary
way.
Crowdsourcing can be useful for labor-intensive applications, such as picture
marking, language translation and speech recognition
The main idea of Crowdsourcing is to distribute tasks to general users so as
to complete tasks that individual users could not complete alone or would not
anticipate completing.
In the big data era, Spatial Crowdsourcing is a hot topic.
The operation framework of Spatial Crowdsourcing is shown as follows.
A user may request the service and resources related to a specified location.
Then the mobile users who are willing to participate in the task will move to
the specified location to acquire related data (i.e. video, audio or pictures).
Finally, the acquired data will be sent to the service requester. With the rapid
growth of mobile devices and the increasingly complex functions provided by
such devices, it is forecast that Spatial Crowdsourcing will become more
prevalent than traditional Crowdsourcing platforms such as Amazon Mechanical
Turk and CrowdFlower.
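A minimal Python sketch of the task-assignment step in this framework (the coordinates, worker list and nearest-worker policy are all hypothetical simplifications):

from math import dist

# Willing workers and their current (latitude, longitude) positions.
workers = {"w1": (12.97, 77.59), "w2": (12.94, 77.60), "w3": (13.00, 77.55)}

def assign_task(task_location):
    # Send the task to the willing worker closest to the requested location;
    # that worker then acquires the data and returns it to the requester.
    return min(workers, key=lambda w: dist(workers[w], task_location))

print(assign_task((12.95, 77.60)))   # -> 'w2', the nearest worker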
COGNITIVE COMPUTING – AN INTRODUCTION
The term cognitive computing is derived from cognitive science and artificial
intelligence. For years, we have wanted to build a “computer” that can
compute as well as learn by training, to achieve some human-like senses or
intelligence.
It has been called a "brain-inspired computer" or a "neural computer". Such a
computer will be built with special hardware and/or software that can mimic
basic human brain functions, such as handling fuzzy information and performing
affective, dynamic and instant responses.
It can handle some ambiguity and uncertainty beyond traditional computers.
We want a cognitive machine that can model the human brain with the cognitive
power to learn, memorize, reason and respond to external stimuli, autonomously
and tirelessly. This field has also been called "neuroinformatics".
Cognitive computing hardware and applications could become more effective and
influential through design choices that make a new class of problems computable.
Such a system offers a synthesis not just of information sources but of
influences, contexts and insights.
SYSTEM FEATURES OF COGNITIVE COMPUTING
A cognitive system redefines the relationship between humans and their
pervasive digital environment. It may play the role of assistant or coach for
the user, and it may act virtually autonomously in many situations.
The computing results of a cognitive system could be suggestive,
prescriptive or instructive in nature.
LISTED BELOW ARE SOME CHARACTERISTICS OF COGNITIVE
COMPUTING SYSTEMS:
Adaptive in learning: They may learn as information changes, and as goals
and requirements evolve. They may resolve ambiguity and tolerate
unpredictability. They may be engineered to feed on dynamic data in real
time, or near real time.
Interactive with users: Users can define their needs as a trainer of the
cognitive system. They may also interact with other processors, devices and
cloud services, as well as with people
Iterative and stateful: They may redefine a problem by asking questions or
finding additional source input if a problem statement is ambiguous or
incomplete. They may “remember” previous interactions iteratively.
Contextual in information discovery: They may understand, identify and
extract contextual elements such as meaning, syntax, time, location,
appropriate domain, regulations, user’s profile, process, task and goal. They
may draw on multiple sources of information, including both structured and
unstructured digital information, as well as sensory inputs such as visual,
gestural, auditory or sensor provided.
DIFFERENCES WITH CURRENT COMPUTERS
Cognitive systems differ from current computing applications in that they
move beyond tabulating and calculating based on preconfigured rules and
programs. Although they are capable of basic computing, they can also infer
and even reason based on broad objectives.
Cognitive computing systems can be extended to integrate or leverage
existing information systems and add domain or task-specific interfaces and
tools. Cognitive systems leverage today’s IT resources and coexist with
legacy systems into the future. The ultimate goal is to bring computing even
closer to human thinking and become a fundamental partnership in human
endeavour.
RELATED FIELDS TO NEUROINFORMATICS AND COGNITIVE
COMPUTING.
Cognitive science is interdisciplinary in nature. It covers the areas of
psychology, artificial intelligence, neuroscience and linguistics, etc.
It spans many levels of analysis from low-level machine learning and
decision mechanisms to high-level neural circuitry to build brain-modeled
computers.
APPLICATIONS OF COGNITIVE MACHINE LEARNING
Cognitive computing platforms have emerged and become commercially
available, and evidence of real-world applications is starting to surface.
Organizations have adopted and used these cognitive computing platforms
for the purpose of developing applications to address specific use cases,
with each application utilizing some combination of available functionality.
Examples of such real-world cases include:
i) speech understanding;
ii) sentiment analysis;
iii) face recognition;
iv) election insights;
v) autonomous driving;
vi) deep learning applications.
Many more examples are available among cognitive computing services. These
demonstrate how the possibilities translate into real-world applications.
MACHINE AND DEEP LEARNING APPLICATIONS CLASSIFIED IN
16 CATEGORIES.
Among these big data applications:
1. Machine vision tasks include: a) object recognition; b) video
interpretation; and c) image retrieval.
2. Text and document tasks include: a) fact extraction; b) machine
translation; and c) text comprehension.
3. Audio and emotion detection tasks include: a) speech recognition;
b) natural language processing; and c) sentiment analysis.
4. Medical and healthcare applications include: a) cancer detection; b) drug
discovery; c) toxicology and radiology; and d) bioinformatics.
Additional information on cognitive machine learning applications can be found
in this YouTube playlist:
www.youtube.com/playlist?list=PLjJh1vlSEYgvGod9wWiydumYl8hOXixNu
5. Business and financial applications include: a) digital advertising;
b) fraud detection; and c) buy and sell prediction in market analysis. Many of
these cognitive tasks are awaiting automation.
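A minimal Python sketch of the sentiment analysis task from category 3 above (a toy lexicon approach; the word lists are made up and far smaller than any real system would use):

POS = {"good", "great", "love"}
NEG = {"bad", "poor", "hate"}

def sentiment(text):
    # Score a text by counting positive and negative lexicon words.
    words = text.lower().split()
    score = sum(w in POS for w in words) - sum(w in NEG for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("I love this great phone"))   # -> positive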