L_1 Data Mining

The document outlines a comprehensive data mining course covering both basic and advanced concepts, emphasizing its importance for entrepreneurs and researchers in extracting valuable information from large datasets. It details the data mining process, types of data mining, advantages and disadvantages, applications in various fields, and challenges faced during implementation. Additionally, it highlights prerequisites for learning data mining and the intended audience for the course.

Uploaded by

Sanya Dixit

Data Mining

 The data mining course provides basic and advanced
concepts of data mining.
 Our data mining course is designed for learners and experts.
 Data mining is one of the most useful techniques that help
entrepreneurs, researchers, and individuals to extract
valuable information from huge sets of data.
 Data mining is also called Knowledge Discovery in
Databases (KDD), although strictly it is one step of that
process.
 The knowledge discovery process includes Data cleaning,
Data integration, Data selection, Data transformation, Data
mining, Pattern evaluation, and Knowledge presentation.
 Our Data mining course includes all topics of Data mining
such as applications, Data mining vs Machine learning, Data
mining tools, Social Media Data mining, Data mining
techniques, Clustering in data mining, Challenges in Data
mining, etc.
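
The knowledge discovery steps listed above can be sketched as a small pipeline. This is only a minimal illustration in Python: the sample records, the field names, and every function name are hypothetical and are not part of the course material.

```python
from collections import Counter

# Hypothetical raw records from two sources (assumed for illustration only).
store_a = [{"customer": "C1", "item": "Bread "}, {"customer": "C2", "item": None}]
store_b = [{"customer": "C3", "item": "bread"}, {"customer": "C4", "item": "Milk"}]

def clean(records):
    """Data cleaning: drop records with a missing item."""
    return [r for r in records if r["item"]]

def integrate(*sources):
    """Data integration: combine records from several sources."""
    return [r for source in sources for r in source]

def select(records):
    """Data selection: keep only the field relevant to the analysis."""
    return [r["item"] for r in records]

def transform(items):
    """Data transformation: normalize spelling so equal items compare equal."""
    return [item.strip().lower() for item in items]

def mine(items):
    """Data mining: count how often each item occurs."""
    return Counter(items)

def evaluate(counts, min_count=2):
    """Pattern evaluation: keep only items seen at least min_count times."""
    return {item: n for item, n in counts.items() if n >= min_count}

def present(patterns):
    """Knowledge presentation: render the patterns as readable lines."""
    return [f"{item}: bought {n} times" for item, n in sorted(patterns.items())]

records = integrate(clean(store_a), clean(store_b))
patterns = evaluate(mine(transform(select(records))))
report = present(patterns)
```

Each function stands in for one step of the process. In a real project every step would be far more involved, but the order of the steps stays the same.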

What is Data Mining?


Data mining is the process of extracting information from huge
sets of data to identify patterns, trends, and useful data that
allow a business to make data-driven decisions.
 In other words, data mining is the process of investigating
hidden patterns of information from various perspectives and
categorizing them into useful data. That data is collected
and assembled in particular areas, such as data warehouses,
where efficient analysis and data mining algorithms support
decision making and other data requirements, ultimately
cutting costs and generating revenue.
 Data mining is the act of automatically searching large
stores of information to find trends and patterns that go
beyond simple analysis procedures.
 Data mining uses complex mathematical algorithms to
segment data and evaluate the probability of future events.
 Data mining is also called Knowledge Discovery in
Databases (KDD).
 Data Mining is a process used by organizations to extract
specific data from huge databases to solve business
problems. It primarily turns raw data into useful information.
 Data mining is similar to data science: it is carried out by a
person, in a specific situation, on a particular data set, with
an objective.
 This process includes various types of services such as text
mining, web mining, audio and video mining, pictorial data
mining, and social media mining.
 It is done through software that is simple or highly specific.
By outsourcing data mining, all the work can be done faster
with low operation costs.
 Specialized firms can also use new technologies to collect
data that is impossible to locate manually.
 There is a vast amount of information available on various
platforms, but very little of it is accessible as knowledge.
 The biggest challenge is to analyze the data to extract
important information that can be used to solve a problem or
to develop the company.
 There are many powerful tools and techniques available to
mine data and gain better insight from it.

Types of Data Mining


 Data mining can be performed on the following types of
data:
Relational Database:
 A relational database is a collection of multiple data sets
formally organized into tables, records, and columns, from
which data can be accessed in various ways without having
to reorganize the database tables.
 Tables convey and share information, which facilitates data
searchability, reporting, and organization.
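
As a small illustration of accessing the same table in various ways, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `purchases` table and its rows are hypothetical and exist only for this example.

```python
import sqlite3

# In-memory database with one hypothetical table (assumed for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, item TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [("C1", "bread", 2.0), ("C1", "milk", 1.5), ("C2", "bread", 2.0)],
)

# The same table answers different questions without being reorganized.
total_by_customer = dict(conn.execute(
    "SELECT customer, SUM(amount) FROM purchases "
    "GROUP BY customer ORDER BY customer"
).fetchall())
buyers_of_bread = [row[0] for row in conn.execute(
    "SELECT customer FROM purchases WHERE item = 'bread' ORDER BY customer"
)]
conn.close()
```

The first query aggregates spending per customer and the second filters by item. Both read the same rows and neither changes how the table is stored.
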
Data warehouses:
 A Data Warehouse is the technology that collects the data
from various sources within the organization to provide
meaningful business insights.
 The huge amount of data comes from multiple places such
as Marketing and Finance.
 The extracted data is used for analytical purposes and
helps in decision-making for a business organization.
 The data warehouse is designed for the analysis of data rather
than transaction processing.
Data Repositories:
 The Data Repository generally refers to a destination for data
storage.
 However, many IT professionals use the term more
specifically to refer to a particular kind of setup within an
IT structure.
 For example, a group of databases, where an organization
has kept various kinds of information.
Object-Relational Database:
 A combination of an object-oriented database model and
relational database model is called an object-relational
model.
 It supports Classes, Objects, Inheritance, etc.
 One of the primary objectives of the Object-relational data
model is to close the gap between the Relational database
and the object-oriented model practices frequently utilized in
many programming languages, for example, C++, Java, C#,
and so on.
Transactional Database:
 A transactional database refers to a database management
system (DBMS) that can undo a database transaction if it is
not performed correctly.
 Although this was a distinguishing capability long ago,
most relational database systems today support
transactional database activities.
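
The ability to undo a transaction can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration: the `accounts` table, the account names, and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; undo both updates if either one fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # violates the CHECK constraint
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

In the second call the credit to `bob` has already been applied when the debit fails, so the rollback is what keeps the two balances consistent.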

Advantages of Data Mining


 The Data Mining technique enables organizations to obtain
knowledge-based data.
 Data mining enables organizations to make lucrative
modifications in operation and production.
 Compared with other statistical data applications, data
mining is cost-efficient.
 Data Mining helps the decision-making process of an
organization.
 It facilitates the automated discovery of hidden patterns as
well as the prediction of trends and behaviors.
 It can be introduced into new systems as well as existing
platforms.
 It is a quick process that makes it easy for new users to
analyze enormous amounts of data in a short time.

Disadvantages of Data Mining

 There is a risk that organizations may sell useful customer
data to other organizations for money.
 For example, American Express has reportedly sold its
customers' credit card purchase data to other organizations.
 Many data mining analytics tools are difficult to operate
and need advanced training to work with.
 Different data mining tools operate in distinct ways
due to the different algorithms used in their design.
 Therefore, selecting the right data mining tool is a
very challenging task.
 Data mining techniques are not perfectly precise, so they
may lead to severe consequences in certain conditions.

Data Mining Applications


 Data mining is primarily used by organizations with intense
consumer demands, such as retail, communication, financial,
and marketing companies, to determine prices, consumer
preferences, product positioning, and the impact on sales,
customer satisfaction, and corporate profits.
 Data mining enables a retailer to use point-of-sale records of
customer purchases to develop products and promotions that
help the organization attract customers.
 Data mining is widely used in the following areas:
Data Mining in Healthcare:
 Data mining in healthcare has excellent potential to improve
the health system.
 It uses data and analytics for better insights and to identify
best practices that will enhance health care services and
reduce costs.
 Analysts use data mining approaches such as Machine
learning, Multi-dimensional database, Data visualization,
Soft computing, and statistics.
 Data mining can be used to forecast the number of patients
in each category.
 Such forecasts help ensure that patients receive intensive
care at the right place and at the right time.
 Data mining also enables healthcare insurers to recognize
fraud and abuse.
Data Mining in Market Basket Analysis:
 Market basket analysis is a modeling method based on the
hypothesis that if you buy a specific group of products, you
are more likely to buy another group of products.
 This technique may enable the retailer to understand the
purchase behavior of a buyer.
 This data may assist the retailer in understanding the
requirements of the buyer and altering the store's layout
accordingly.
 Analytical comparisons of results can also be made between
various stores or between customers in different
demographic groups.
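
The hypothesis behind market basket analysis is usually quantified with two standard measures, support and confidence. The sketch below computes both in Python for a handful of hypothetical baskets. The basket contents and the function names are assumptions made for this illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical point-of-sale baskets (assumed for illustration only).
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "jam"},
]

def support(*items):
    """Fraction of baskets that contain all of the given items."""
    return sum(1 for basket in baskets if set(items) <= basket) / len(baskets)

def confidence(antecedent, consequent):
    """Share of baskets with the antecedent that also contain the consequent."""
    return support(antecedent, consequent) / support(antecedent)

# Count every pair of items that appears together in a basket.
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)
most_common_pair, times_together = pair_counts.most_common(1)[0]
```

Here `confidence("butter", "bread")` is higher than `confidence("bread", "butter")` because every basket with butter also contains bread, while bread is sometimes bought without butter. This asymmetry is the kind of purchase behavior a retailer would look for.
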
Data mining in Education:
 Educational data mining (EDM) is a newly emerging field
concerned with developing techniques that discover
knowledge from the data generated in educational
environments.
 The objectives of EDM include predicting students' future
learning behavior, studying the impact of educational
support, and advancing learning science.
 An organization can use data mining to make precise
decisions and also to predict the results of the student.
 With the results, the institution can concentrate on what to
teach and how to teach.
Data Mining in Manufacturing Engineering:
 Knowledge is the best asset possessed by a manufacturing
company.
 Data mining tools can be beneficial to find patterns in a
complex manufacturing process.
 Data mining can be used in system-level designing to obtain
the relationships between product architecture, product
portfolio, and data needs of the customers.
 It can also be used to forecast the product development
period, cost, and expectations among the other tasks.
Data Mining in CRM (Customer Relationship
Management):
 Customer Relationship Management (CRM) is all about
acquiring and retaining customers, enhancing customer
loyalty, and implementing customer-oriented strategies.
 To build a good relationship with its customers, a business
organization needs to collect data and analyze it.
 With data mining technologies, the collected data can be
used for analytics.
Data Mining in Fraud detection:
 Billions of dollars are lost to fraud.
 Traditional methods of fraud detection are time-consuming
and complex.
 Data mining finds meaningful patterns and turns data into
information.
 An ideal fraud detection system should protect the data of all
the users.
 Supervised methods start from a collection of sample
records, each classified as fraudulent or non-fraudulent.
 A model is constructed using this data, and the model is then
used to identify whether a new record is fraudulent or not.
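
The supervised approach described above can be sketched with a very simple classifier. The example below uses a nearest-centroid rule, chosen here only because it fits in a few lines. It is not presented as the method used by real fraud detection systems, and the features, sample records, and function names are hypothetical.

```python
from math import dist
from statistics import mean

# Hypothetical labeled records:
# (amount in dollars, number of transactions in the last hour).
training = [
    ((20.0, 1), "non-fraudulent"),
    ((35.0, 2), "non-fraudulent"),
    ((50.0, 1), "non-fraudulent"),
    ((900.0, 8), "fraudulent"),
    ((1200.0, 10), "fraudulent"),
]

def fit(samples):
    """Build the model: one average point (centroid) per class label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }

def predict(model, features):
    """Classify a new record by the closest class centroid."""
    return min(model, key=lambda label: dist(model[label], features))

model = fit(training)
suspicious = predict(model, (1000.0, 9))
ordinary = predict(model, (40.0, 1))
```

The features are left unscaled to keep the sketch short, so the dollar amount dominates the distance. A practical model would normalize the features and be evaluated on records it has not seen during training.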

Data Mining in Lie Detection:
 Apprehending a criminal is often the easier part; getting
the truth out of a suspect is a very challenging task.
 Law enforcement may use data mining techniques to
investigate offenses, monitor suspected terrorist
communications, etc.
 This approach also includes text mining, which seeks
meaningful patterns in data that is usually unstructured
text.
 The information collected from the previous investigations
is compared, and a model for lie detection is constructed.
Data Mining in Financial Banking:
 The digitalization of the banking system generates an
enormous amount of data with every new transaction.
 Data mining techniques can help bankers solve
business-related problems in banking and finance by
identifying trends, causalities, and correlations in business
information and market costs that are not instantly evident to
managers or executives, because the data volume is too large
or the data is produced too rapidly for experts to review.
 Managers can use this data to better target, acquire,
retain, segment, and maintain profitable customers.

Challenges of Implementation in Data Mining

 Although data mining is very powerful, it faces many
challenges during its execution.
 These challenges may relate to performance, data,
methods, and techniques.
 The process of data mining becomes effective when the
challenges or problems are correctly recognized and
adequately resolved.
Incomplete and noisy data:
 Data mining is the process of extracting useful data from
large volumes of data. Real-world data is heterogeneous,
incomplete, and noisy, and data in huge quantities is often
inaccurate or unreliable. These problems may arise from
faulty measuring instruments or from human error. Suppose a
retail chain collects the phone numbers of customers who
spend more than $500, and accounting employees enter the
information into their system. An employee may mistype a
digit when entering a phone number, which results in
incorrect data. Some customers may not be willing to
disclose their phone numbers at all, which results in
incomplete data. The data could also get changed due to
human or system error. All of these problems (noisy and
incomplete data) make data mining challenging.
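
The phone number example above can be turned into a small cleaning step. The sketch below is a minimal illustration in Python. The records, the assumption that a valid number has exactly ten digits, and the `normalize_phone` helper are all hypothetical.

```python
import re

# Hypothetical customer records as typed in by staff (assumed for illustration).
raw_records = [
    {"customer": "C1", "phone": "555-010-1234"},
    {"customer": "C2", "phone": "555 010 123"},    # a digit is missing (noisy)
    {"customer": "C3", "phone": None},             # not disclosed (incomplete)
    {"customer": "C4", "phone": "(555) 010-9876"},
]

def normalize_phone(phone, expected_digits=10):
    """Return only the digits of a phone number, or None if it is unusable."""
    if not phone:
        return None
    digits = re.sub(r"\D", "", phone)  # strip spaces, dashes, and parentheses
    return digits if len(digits) == expected_digits else None

clean, rejected = [], []
for record in raw_records:
    phone = normalize_phone(record["phone"])
    if phone is None:
        rejected.append(record["customer"])
    else:
        clean.append({"customer": record["customer"], "phone": phone})
```

A length check catches a dropped digit but not a wrong one, so a mistyped number with the right length still passes. This is one reason noisy data remains a challenge even after cleaning.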

Data Distribution:
 Real-world data is usually stored on various platforms in a
distributed computing environment. It might be in a
database, on individual systems, or even on the internet.
In practice, it is quite difficult to bring all the data into a
centralized data repository, mainly due to organizational and
technical concerns. For example, various regional offices
may have their own servers to store their data, and it may
not be feasible to store all the data from all the offices on a
central server. Therefore, data mining requires the
development of tools and algorithms that allow the mining of
distributed data.
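
One common way to mine distributed data is to summarize it where it is stored and then combine only the summaries. The sketch below illustrates that idea in Python with simple item counts. The regional offices, their sales logs, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical sales logs that stay on each regional office's own server.
regional_sales = {
    "north": ["bread", "milk", "bread"],
    "south": ["milk", "jam"],
    "west": ["bread", "jam", "jam"],
}

def mine_locally(sales):
    """Runs at one office: summarize the local data as item counts."""
    return Counter(sales)

def merge_summaries(summaries):
    """Runs centrally: combine the small summaries, not the raw data."""
    total = Counter()
    for summary in summaries:
        total.update(summary)
    return total

local_summaries = [mine_locally(sales) for sales in regional_sales.values()]
global_counts = merge_summaries(local_summaries)
```

Counts are easy to merge because they simply add up. Many other patterns cannot be combined this directly, which is why distributed mining needs dedicated algorithms.
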
Complex Data:
 Real-world data is heterogeneous, and it may include
multimedia data such as audio, video, and images, as well as
spatial data, time series, and so on. Managing these various
types of data and extracting useful information from them is
a tough task. Most of the time, new technologies, tools, and
methodologies have to be developed or refined to obtain
specific information.
Performance:
 The data mining system's performance relies primarily on the
efficiency of the algorithms and techniques used. If the
algorithms and techniques are poorly designed, the
efficiency of the data mining process will be adversely
affected.
Data Privacy and Security:
 Data mining can lead to serious issues in terms of data
security, governance, and privacy. For example, if a retailer
analyzes the details of purchased items, the analysis reveals
data about the buying habits and preferences of customers
without their permission.
Data Visualization:
 In data mining, data visualization is a very important process
because it is the primary method of showing the output to the
user in a presentable way. The extracted data should convey
the exact meaning of what it intends to express. But many
times, representing the information to the end user in a
precise and easy way is difficult. Because both the input data
and the output information are complicated, very efficient
data visualization processes need to be implemented for data
mining to be successful.
 There are many more challenges in data mining in addition
to the problems mentioned above. More problems come to
light as the actual data mining process begins, and the
success of data mining relies on overcoming all these
difficulties.
 Prerequisites
 Before learning the concepts of Data Mining, you should
have a basic understanding of statistics, databases, and
at least one programming language.
 Audience
 Our Data Mining Tutorial is prepared for beginners and
computer science graduates to help them learn data mining
from the basics to advanced techniques.
 Problems
 We assure you that you will not find any difficulty while
learning our Data Mining tutorial. But if there is any mistake
in this tutorial, kindly post the problem or error in the contact
form so that we can improve it.
