Data Mining Notes UNIT V

The document discusses the evolution and applications of data mining across various industries, including finance, retail, telecommunications, and biology. It highlights the significance of data mining in analyzing large datasets for purposes such as customer targeting, fraud detection, and scientific research. Additionally, it outlines trends in data mining, such as the integration of data mining with other systems, the need for scalable methods, and the challenges associated with multimedia and spatial data mining.

Uploaded by

Gayathri T
Copyright: © All Rights Reserved

Applications and Trends in Data Mining

• Data mining has made broad and significant progress since its early beginnings in the 1980s.
• Today, data mining is used in a vast array of areas, and numerous commercial data mining systems are available.

Data Mining Applications

1. Data Mining for Financial Data Analysis

• Most banks and financial institutions offer a wide variety of banking services (such as checking and savings accounts for business or individual customers), credit (such as business, mortgage, and automobile loans), and investment services (such as mutual funds).
• Some also offer insurance services and stock investment services.

i). Design and construction of data warehouses for multidimensional data analysis and data mining: Like many other applications, data warehouses need to be constructed for banking and financial data.
ii). Loan payment prediction and customer credit policy analysis: Loan payment prediction and customer credit analysis are critical to the business of a bank.
iii). Classification and clustering of customers for targeted marketing: Classification and clustering methods can be used for customer group identification and targeted marketing.
iv). Detection of money laundering and other financial crimes: To detect money laundering and other financial crimes, it is important to integrate information from multiple databases.

2. Data Mining for the Retail Industry

• The retail industry is a major application area for data mining, since it collects huge amounts of data on sales, customer shopping history, goods transportation, consumption, and service.
• The quantity of data collected continues to expand rapidly, especially due to the increasing ease, availability, and popularity of business conducted on the Web, or e-commerce.

3. Data Mining for the Telecommunication Industry

• The telecommunication industry has quickly evolved from offering local and long-distance telephone services to providing many other comprehensive communication services, including fax, pager, cellular phone, Internet messenger, images, e-mail, computer and Web data transmission, and other data traffic.
• The integration of telecommunication, computer network, Internet, and numerous other means of communication and computing is also underway.

4. Data Mining for Biological Data Analysis

• The past decade has seen an explosive growth in genomics, proteomics, functional genomics, and biomedical research.
• Examples range from the identification and comparative analysis of the genomes of human and other species (by discovering sequencing patterns, gene functions, and evolution paths) to the investigation of genetic networks and protein pathways, and the development of new pharmaceuticals and advances in cancer therapies.
• Biological data mining has become an essential part of a new research field called bioinformatics.

5. Data Mining in Other Scientific Applications

i). Data warehouses and data preprocessing: Data warehouses are critical for information exchange and data mining. In the area of geospatial data, however, no true geospatial data warehouse exists today.
  • Creating such a warehouse requires finding means for resolving geographic and temporal data incompatibilities, such as reconciling semantics, referencing systems, geometry, accuracy, and precision.
  • For scientific applications in general, methods are needed for integrating data from heterogeneous sources (such as data covering different time periods) and for identifying events.
ii). Mining complex data types: Scientific data sets are heterogeneous in nature, typically involving semi-structured and unstructured data, such as multimedia data and georeferenced stream data.
  • Robust methods are needed for handling spatiotemporal data, related concept hierarchies, and complex geographic relationships.
iii). Graph-based mining: It is often difficult or impossible to model several physical phenomena and processes due to limitations of existing modeling approaches.
iv). Visualization tools and domain-specific knowledge: High-level graphical user interfaces and visualization tools are required for scientific data mining systems.

6. Data Mining for Intrusion Detection

• The security of our computer systems and data is at continual risk.
• The extensive growth of the Internet and the increasing availability of tools and tricks for intruding and attacking networks have made intrusion detection a critical component of network administration.
• An intrusion can be defined as any set of actions that threaten the integrity, confidentiality, or availability of a network resource (such as user accounts, file systems, system kernels, and so on).
• Most commercial intrusion detection systems are limited and do not provide a complete solution.
• Such systems typically employ a misuse detection strategy.
• Misuse detection searches for patterns of program or user behavior that match known intrusion scenarios, which are stored as signatures.
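
The misuse detection idea described above can be illustrated with a minimal sketch. The signature names, event names, and the choice to represent a signature as a contiguous sequence of events are all assumptions made for illustration; real intrusion detection systems use much richer signature languages.

```python
# Hypothetical signatures: each name maps to an ordered sequence of events.
SIGNATURES = {
    "brute_force_login": ["login_fail", "login_fail", "login_fail", "login_ok"],
    "priv_escalation": ["login_ok", "sudo_fail", "sudo_ok"],
}


def contains_sequence(events, pattern):
    """Return True if `pattern` occurs as a contiguous run inside `events`."""
    n, m = len(events), len(pattern)
    return any(events[i:i + m] == pattern for i in range(n - m + 1))


def match_signatures(events, signatures=SIGNATURES):
    """Return the sorted names of all stored signatures found in the event stream."""
    return sorted(
        name for name, pattern in signatures.items()
        if contains_sequence(events, pattern)
    )


# Hypothetical audit trail for one user session.
session = ["login_fail", "login_fail", "login_fail", "login_ok", "read_file"]
print(match_signatures(session))  # ['brute_force_login']
```

Because only stored signatures are matched, a sketch like this cannot flag an attack whose pattern is not already known, which is one reason such systems are described as incomplete.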

Trends in Data Mining

• The diversity of data, data mining tasks, and data mining approaches poses many challenging research issues in data mining.
• The development of efficient and effective data mining methods and systems, the construction of interactive and integrated data mining environments, the design of data mining languages, and the application of data mining techniques to solve large application problems are important tasks for data mining researchers and data mining system and application developers.
• Some of the trends in data mining that reflect the pursuit of these challenges are:
  • Application exploration
  • Scalable and interactive data mining methods
  • Integration of data mining with database systems, data warehouse systems, and Web database systems
  • Standardization of data mining language
  • Visual data mining
  • New methods for mining complex types of data
  • Biological data mining
  • Data mining and software engineering
  • Web mining
  • Distributed data mining
  • Real-time or time-critical data mining
  • Graph mining, link analysis, and social network analysis
  • Multirelational and multidatabase data mining
  • Privacy protection and information security in data mining

Spatial Data Mining

• A spatial database stores a large amount of space-related data, such as maps, preprocessed remote sensing or medical imaging data, and VLSI chip layout data.
• Spatial databases have many features distinguishing them from relational databases.
• They carry topological and/or distance information, usually organized by sophisticated, multidimensional spatial indexing structures that are accessed by spatial data access methods and often require spatial reasoning, geometric computation, and spatial knowledge representation techniques.
• Spatial data mining refers to the extraction of knowledge, spatial relationships, or other interesting patterns not explicitly stored in spatial databases.
• Such mining demands an integration of data mining with spatial database technologies.
• It can be used for understanding spatial data, discovering spatial relationships and relationships between spatial and nonspatial data, constructing spatial knowledge bases, reorganizing spatial databases, and optimizing spatial queries.
• It is expected to have wide applications in geographic information systems, geomarketing, remote sensing, image database exploration, medical imaging, navigation, traffic control, environmental studies, and many other areas where spatial data are used.
• A spatial data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of both spatial and nonspatial data in support of spatial data mining and spatial-data-related decision-making processes.
• There are several challenging issues regarding the construction and utilization of spatial data warehouses.
  i). The first challenge is the integration of spatial data from heterogeneous sources and systems.
  ii). The second challenge is the realization of fast and flexible on-line analytical processing in spatial data warehouses.
• In a spatial warehouse both dimensions and measures may contain spatial components.
• There are three types of dimensions in a spatial data cube:
  • A nonspatial dimension
  • A spatial-to-nonspatial dimension
  • A spatial-to-spatial dimension
• We distinguish two types of measures in a spatial data cube:
  • A numerical measure: contains only numerical data.
  • A spatial measure: contains a collection of pointers to spatial objects.
• Spatial classification analyzes spatial objects to derive classification schemes in relevance to certain spatial properties, such as the neighborhood of a district, highway, or river.
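
The difference between a numerical measure and a spatial measure in a spatial data cube can be sketched as follows. The fact rows, region identifiers, and the grouping by climate zone are hypothetical; the region identifiers stand in for pointers to spatial objects such as polygons.

```python
from collections import defaultdict

# Hypothetical fact rows: (region_id, climate_zone, temperature_celsius).
facts = [
    ("R1", "coastal", 18.0),
    ("R2", "coastal", 20.0),
    ("R3", "inland", 25.0),
    ("R4", "inland", 27.0),
    ("R5", "inland", 23.0),
]


def roll_up(rows):
    """Group rows by zone and compute one numerical and one spatial measure per cell."""
    temps = defaultdict(list)
    regions = defaultdict(list)
    for region_id, zone, temp in rows:
        temps[zone].append(temp)
        regions[zone].append(region_id)  # pointer to a spatial object
    return {
        zone: {
            "avg_temp": sum(temps[zone]) / len(temps[zone]),  # numerical measure
            "region_ids": regions[zone],                      # spatial measure
        }
        for zone in temps
    }


cube = roll_up(facts)
print(cube["inland"])  # {'avg_temp': 25.0, 'region_ids': ['R3', 'R4', 'R5']}
```

In a real spatial warehouse the spatial measure would typically be merged into a single region geometry on roll-up, which is far more expensive than averaging numbers; the sketch only collects the pointers.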

Multimedia Data Mining:

What is Multimedia Data Mining?

Multimedia mining is a subfield of data mining that is used to find interesting information or implicit knowledge in multimedia databases. Mining in multimedia is also referred to as automatic annotation or annotation mining. Mining multimedia data involves two or more data types, such as text and video, or text, video, and audio.

Multimedia data mining is an interdisciplinary field that integrates image processing and understanding, computer vision, data mining, and pattern recognition. It discovers interesting patterns in multimedia databases that store and manage large collections of multimedia objects, including image data, video data, audio data, sequence data, and hypertext data containing text, text markups, and linkages. Issues in multimedia data mining include content-based retrieval and similarity search, generalization, and multidimensional analysis. Multimedia data cubes contain additional dimensions and measures for multimedia information.

The framework that manages different types of multimedia data, which may be stored, delivered, and utilized in different ways, is known as a multimedia database management system. There are three classes of multimedia databases: static, dynamic, and dimensional media. The content of a multimedia database management system is as follows:
o Media data: The actual data representing an object.
o Media format data: Information about the format of the media data after it goes through the acquisition, processing and encoding phases, such as sampling rate, resolution, and encoding scheme.
o Media keyword data: Keyword descriptions relating to the generation of the data. It is also known as content descriptive data. Example: date, time and place of recording.
o Media feature data: Content-dependent data such as the distribution of colours, the kinds of texture, and the different shapes present in the data.
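
As an illustration of media feature data, the distribution of colours in an image can be summarized as a coarse colour histogram. The sketch below assumes an image is given as a list of (r, g, b) tuples with channel values in 0..255; the function names and the choice of two bins per channel are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Count pixels per quantized (r, g, b) bin; channel values range over 0..255."""
    step = 256 // bins_per_channel
    top = bins_per_channel - 1
    hist = {}
    for r, g, b in pixels:
        # Clamp so that values near 255 never fall into an out-of-range bin.
        key = (min(r // step, top), min(g // step, top), min(b // step, top))
        hist[key] = hist.get(key, 0) + 1
    return hist


def normalize(hist):
    """Turn raw counts into fractions so that images of different sizes compare."""
    total = sum(hist.values())
    return {key: count / total for key, count in hist.items()}


# Hypothetical four-pixel image: two reddish pixels and two bluish pixels.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (10, 20, 240)]
features = normalize(color_histogram(pixels))
print(features)  # {(1, 0, 0): 0.5, (0, 0, 1): 0.5}
```

A feature vector like this is content dependent (it is computed from the pixels), in contrast to media keyword data, which describes the circumstances of recording.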

Types of Multimedia Applications

Types of multimedia applications based on data management characteristics are:

1. Repository applications: A large amount of multimedia data and metadata (media format data, media keyword data, media feature data) is stored for retrieval purposes, e.g., a repository of satellite images, engineering drawings, or radiology scans.
2. Presentation applications: They involve delivering multimedia data subject to temporal constraints. Optimal viewing or listening requires the DBMS to deliver data at a certain rate, offering a quality of service above a certain threshold. Here data is processed as it is delivered. Example: annotation of video and audio data, real-time editing analysis.
3. Collaborative work using multimedia information: It involves executing a complex task by merging drawings and changing notifications. Example: an intelligent healthcare network.

Challenges with Multimedia Database

There are still many challenges to multimedia databases, such as:

1. Modelling: Work in this area can draw on both database and information retrieval techniques. Documents constitute a specialized area and deserve special consideration.
2. Design: The conceptual, logical and physical design of multimedia databases has not yet been addressed fully, as performance and tuning issues at each level are far more complex. The data comes in a variety of formats, such as JPEG, GIF, PNG, and MPEG, which are not easy to convert from one to another.
3. Storage: Storage of a multimedia database on any standard disk presents the problems of representation, compression, mapping to device hierarchies, archiving, and buffering during input-output operations. In a DBMS, a BLOB (Binary Large Object) facility allows untyped bitmaps to be stored and retrieved.
4. Performance: Physical limitations dominate an application involving video playback or audio-video synchronization. The use of parallel processing may alleviate some problems, but such techniques are not yet fully developed. Apart from this, a multimedia database consumes a lot of processing time and bandwidth.
5. Queries and retrieval: For multimedia data like images, video, and audio, accessing data through queries opens up many issues, such as efficient query formulation, query execution, and optimization, which still need to be worked on.

Where is Multimedia Database Applied?

A multimedia database is applied in areas such as the following:

o Documents and record management: Industries and businesses keep detailed records and various documents. For example, insurance claim records.
o Knowledge dissemination: A multimedia database is a very effective tool for knowledge dissemination, as it provides several kinds of resources. For example, electronic books.
o Education and training: Computer-aided learning materials can be designed using multimedia sources, which are nowadays very popular sources of learning. Example: digital libraries.
o Travelling: Marketing, advertising, retailing, entertainment and travel. For example, a virtual tour of cities.
o Real-time control and monitoring: With active database technology, multimedia presentation of information can effectively monitor and control complex tasks. For example, manufacturing operation control.

Categories of Multimedia Data Mining

Multimedia mining refers to analyzing a large amount of multimedia information to extract patterns based on their statistical relationships. Multimedia data mining is classified into two broad categories: static and dynamic media. Static media contains text (digital library, creating SMS and MMS) and images (photos and medical images). Dynamic media contains audio (music and MP3 sounds) and video (movies). The image below shows the categories of multimedia data mining.

1. Text Mining

Text is the most common medium for the formal exchange of information. Text mining evaluates a huge amount of natural language text and detects patterns in it to find useful information. Text mining, also referred to as text data mining, is used to find meaningful information in unstructured texts from various sources.

2. Image Mining

Image mining systems can discover meaningful information or image patterns from a huge collection of images. Image mining determines how the low-level pixel representation contained in a raw image or image sequence can be processed to recognize high-level spatial objects and relationships. It draws on digital image processing, image understanding, databases, AI, etc.

3. Video Mining

Video mining is the unsupervised discovery of interesting patterns in large amounts of video data. Video data combines several kinds of multimedia data, such as text, images, metadata, visuals and audio. It is commonly used in security and surveillance, entertainment, medicine, sports and education programs. Processing steps include indexing, automatic segmentation, content-based retrieval, classification and trigger detection.

4. Audio Mining

Audio mining, which plays an important role in multimedia applications, is a technique by which the content of an audio signal can be automatically searched, analyzed and decomposed with wavelet transformation. It is generally used in automatic speech recognition, where the analysis tries to find any speech within the audio. Band energy, frequency centroid, zero-crossing rate, pitch period and bandwidth are often used for audio processing.
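
Two of the simpler audio features can be computed directly from the samples of one frame. The sketch below assumes a frame is a plain list of floating-point samples; the function names, the 8000 Hz sample rate, and the test tones are hypothetical, and the energy shown is the energy of the whole frame rather than of a single frequency band.

```python
import math


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def short_time_energy(samples):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in samples) / len(samples) if samples else 0.0


def sine_frame(freq_hz, sample_rate=8000, n=800):
    """Generate n samples of a pure tone, used here only as test input."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


low = sine_frame(100)    # low-pitched tone
high = sine_frame(1000)  # high-pitched tone
print(zero_crossing_rate(low) < zero_crossing_rate(high))  # True
```

A higher zero-crossing rate indicates more high-frequency content, which is why the feature is often used as a cheap cue for separating voiced speech, unvoiced speech, and noise.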

Application of Multimedia Mining

There are different kinds of applications of multimedia data mining, some of which are as follows:

o Digital Library: A digital library stores and maintains a collection of digital data, for which it is essential to convert different digital data formats into text, images, video, audio, etc.
o Traffic Video Sequences: To discover important but previously unidentified knowledge from traffic video sequences, detailed analysis and mining are performed based on vehicle identification, traffic flow, and the queue temporal relations of vehicles at an intersection. This provides an economical approach to regular traffic monitoring.
o Medical Analysis: Multimedia mining is widely used in the medical field, particularly for analyzing medical images. Various data mining techniques are used for image classification. Examples: automatic 3D delineation of highly aggressive brain tumours, automatic localization and identification of vertebrae in 3D CT scans, MRI scans, ECG and X-ray.
o Customer Perception: Many companies have call centres that receive telephone calls from customers. The collected data contains details about customers' opinions of products or services, customers' complaints, customers' preferences, and the level of customer satisfaction with products or services. The audio data serve as input for topic detection, resource assignment and evaluation of the quality of services.
o Media Making and Broadcasting: Broadcasting companies operate radio stations and TV channels, and multimedia mining can be applied to monitor their content, search for more efficient approaches, and improve quality.
o Surveillance system: It consists of collecting, analyzing, and summarizing audio, video or audiovisual information about specific areas such as government organizations, multinational companies, shopping malls, banks, forests, agricultural areas, and highways. The main use of this technology is in the field of security; hence it can be utilized by the military, the police and private companies that provide security services.

Process of Multimedia Data Mining

The image below shows the present architecture, which includes the types of the multimedia mining process. Data collection is the initial stage of the learning system. Pre-processing extracts significant features from the raw data; it includes data cleaning, transformation, normalization, feature extraction, etc. Learning can be direct if informative types can be recognized at the preprocessing stage. The complete process depends heavily on the nature of the raw data and on the problem domain. The product of preprocessing is the training set. A learning model must be selected for the specified training set to learn from it and make the multimedia model more stable.

Converting unstructured data to structured data: Data that resides in a fixed field within a record or file is called structured data, and such data is stored in sequential form. Structured data can be easily entered, stored, queried and analyzed. Unstructured data is a bitstream, for example, the pixel representation of an image, audio, or video, or the character representation of text. These files may have an internal structure, but they are still considered "unstructured" because their data does not fit neatly in a database. For example, images and videos of different objects share some similarities (each may represent an interpretation of a building) without having a clear structure.
Current data mining tools operate on structured data, which resides in huge volumes in relational databases, while data in multimedia databases is semi-structured or unstructured. Hence, the semi-structured or unstructured multimedia data is converted into structured data, and then the current data mining tools are used to extract knowledge. The sequence or time element differs between unstructured and structured data mining. The architecture for converting unstructured data to structured data, which is used for extracting information from the unstructured database, is shown in the above image. Data mining tools are then applied to the stored structured databases.
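
The conversion from unstructured to structured data can be sketched for the simplest case, free text. The sketch assumes a fixed, hypothetical vocabulary and turns each document into a record with one count field per term, which is the fixed-field form that conventional mining tools expect. The function and field names are illustrative only.

```python
import re
from collections import Counter


def to_record(doc_id, text, vocabulary):
    """Map free text to a fixed-field record: one count column per vocabulary term."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    record = {"doc_id": doc_id}
    for term in vocabulary:
        record[term] = counts[term]  # Counter returns 0 for absent terms
    return record


# Hypothetical vocabulary and documents.
vocabulary = ["loan", "fraud", "image"]
documents = {
    1: "Fraud alert: loan fraud detected",
    2: "Medical image archive with one more image",
}

table = [to_record(doc_id, text, vocabulary) for doc_id, text in documents.items()]
for row in table:
    print(row)
# {'doc_id': 1, 'loan': 1, 'fraud': 2, 'image': 0}
# {'doc_id': 2, 'loan': 0, 'fraud': 0, 'image': 2}
```

For images, audio, or video the same idea applies, with extracted features (colour histograms, audio descriptors, detected objects) taking the place of term counts.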

Architecture for Multimedia Data Mining

Multimedia mining architecture is given in the below image. The architecture has several components. The important components are Input, Multimedia Content, Spatiotemporal Segmentation, Feature Extraction, Finding Similar Patterns, and Evaluation of Results.

1. The input stage comprises a multimedia database used to find the patterns and perform the data mining.
2. Multimedia Content is the data selection stage, which requires the user to select the databases, subset of fields, or data for data mining.
3. Spatio-temporal segmentation identifies moving objects in the image sequences of videos, and it is useful for object segmentation.
4. Feature extraction is the preprocessing step that involves integrating data from various sources and making choices about how to characterize or code certain data fields so that they can serve as inputs to the pattern-finding stage. Such representation choices are required because certain fields may hold data at different levels and may not be suitable for the pattern-finding stage. In MDM, the preprocessing stage is significant because of the unstructured nature of multimedia records.
5. The finding similar patterns stage is the heart of the whole data mining process. The hidden patterns and trends in the data are uncovered in this stage. Approaches used in this stage include association, classification, clustering, regression, time-series analysis and visualization.
6. Evaluation of Results is the stage used to evaluate the results, and it is important for determining whether a prior stage must be revisited. This stage consists of reporting and using the extracted knowledge to produce new actions, products, services, or marketing strategies.

Data Mining - World Wide Web

Over the last few years, the World Wide Web has become a significant source of information and simultaneously a popular platform for business. Web mining can be defined as the method of utilizing data mining techniques and algorithms to extract useful information directly from the web, such as Web documents and services, hyperlinks, Web content, and server logs. The World Wide Web contains a large amount of data that provides a rich source for data mining. The objective of Web mining is to look for patterns in Web data by collecting and examining data in order to gain insights.

What is Web Mining?

Web mining can broadly be seen as the application of adapted data mining techniques to the web, whereas data mining is defined as the application of algorithms to discover patterns in mostly structured data, embedded in a knowledge discovery process. A distinctive property of web mining is that it deals with a variety of data types. The web has multiple aspects that yield different approaches to the mining process: web pages consist of text, web pages are linked via hyperlinks, and user activity can be monitored via web server logs. These three features lead to the distinction between three areas: web content mining, web structure mining, and web usage mining.

Types of Web Mining:

There are three types of web data mining:

1. Web Content Mining:

Web content mining can be used to extract useful data, information, and knowledge from web page content. In web content mining, each web page is considered as an individual document. One can take advantage of the semi-structured nature of web pages, as HTML provides information that concerns not only the layout but also the logical structure. The primary task of content mining is data extraction, where structured data is extracted from unstructured websites. The objective is to facilitate data aggregation over various web sites by using the extracted structured data. Web content mining can also be utilized to distinguish topics on the web. For example, if a user searches for a specific task on a search engine, the user will get a list of suggestions.
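
The data extraction task described above can be sketched with Python's standard `html.parser` module, which exploits the logical structure that HTML tags provide. The class name, the choice of fields (page title and hyperlink targets), and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class TitleAndLinkExtractor(HTMLParser):
    """Collect the page title and all hyperlink targets from an HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors without a target
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract(html):
    """Return a structured record for one semi-structured web page."""
    parser = TitleAndLinkExtractor()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "links": parser.links}


page = (
    "<html><head><title>Data Mining</title></head>"
    "<body><a href='/a'>A</a><a href=\"/b\">B</a><a>no target</a></body></html>"
)
print(extract(page))  # {'title': 'Data Mining', 'links': ['/a', '/b']}
```

Records extracted this way from many sites share the same fields, so they can be aggregated and mined with conventional tools.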

2. Web Structure Mining:

Web structure mining can be used to find the link structure of hyperlinks. It is used to identify the data that links web pages, or the network of direct links between them. In web structure mining, one considers the web as a directed graph, with the web pages being the vertices, which are connected by hyperlinks. The most important application in this regard is the Google search engine, which ranks its results primarily with the PageRank algorithm. PageRank treats a page as highly relevant when it is frequently linked to by other highly relevant pages. Structure and content mining methodologies are usually combined. For example, web structure mining can help organizations analyze the network of links between two commercial sites.
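
The idea that a page is relevant when other relevant pages link to it can be sketched with the simplified textbook formulation of PageRank, computed by power iteration. This is an illustration only, not the ranking used by any production search engine. The damping factor of 0.85, the iteration count, and the four-page graph are assumptions, and the sketch requires every link target to appear as a key of the graph.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power iteration on a dict {page: [pages it links to]}."""
    pages = list(graph)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in graph.items():
            if outlinks:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical web graph: A, B and D all link to C, and C links back to A.
web = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web)
print(max(ranks, key=ranks.get))  # C
```

Page A ends up ranked above B and D even though each has one outgoing link, because A is the only page endorsed by the highly ranked page C.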

3. Web Usage Mining:

Web usage mining is used to extract useful data, information, and knowledge from web log records, and assists in recognizing user access patterns for web pages. In mining the usage of web resources, one considers the records of requests made by visitors to a website, which are often collected as web server logs. While the content and structure of a collection of web pages follow the intentions of the authors of the pages, the individual requests demonstrate how consumers actually see these pages. Web usage mining may disclose relationships that were not intended by the creator of the pages.

Some of the methods to identify and analyze the web usage patterns are given below:

I. Session and visitor analysis:

The analysis of preprocessed data can be accomplished in session analysis, which incorporates the visitor records, days, times, sessions, etc. This data can be utilized to analyze the visitor's behavior.

A report is created after this analysis, which contains the details of frequently visited web pages and common entry and exit pages.
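
Session analysis can be sketched as follows. The sketch assumes preprocessed log rows of the form (visitor_id, unix_timestamp, url) and splits a visitor's requests into sessions whenever the gap between two requests exceeds 30 minutes; that timeout is a common convention rather than something prescribed by the notes, and all names and data are hypothetical.

```python
from collections import Counter, defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; an assumed, conventional cut-off

# Hypothetical preprocessed log rows: (visitor_id, unix_timestamp, url).
log = [
    ("v1", 0, "/home"),
    ("v1", 60, "/products"),
    ("v1", 4000, "/home"),
    ("v2", 10, "/home"),
    ("v2", 20, "/contact"),
]


def sessionize(rows, timeout=SESSION_TIMEOUT):
    """Split each visitor's requests into sessions separated by long gaps."""
    by_visitor = defaultdict(list)
    for visitor, ts, url in sorted(rows, key=lambda r: (r[0], r[1])):
        by_visitor[visitor].append((ts, url))
    sessions = []
    for visitor, hits in by_visitor.items():
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                sessions.append((visitor, current))
                current = []
            current.append(url)
        sessions.append((visitor, current))
    return sessions


def entry_exit_counts(sessions):
    """Count how often each page is the first or the last page of a session."""
    entries = Counter(pages[0] for _, pages in sessions)
    exits = Counter(pages[-1] for _, pages in sessions)
    return entries, exits


sessions = sessionize(log)
entries, exits = entry_exit_counts(sessions)
print(entries.most_common(1))  # [('/home', 3)]
```

The entry and exit counters correspond to the "common entry and exit pages" part of the report described above.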

II. OLAP (Online Analytical Processing):

• OLAP performs multidimensional analysis of complex data.
• OLAP can be performed on various parts of log-related data for a specific period.
• OLAP tools can be used to derive important business intelligence metrics.

Challenges in Web Mining:

The web poses great challenges for resource and knowledge discovery, based on the following observations:

o The complexity of web pages:

Web pages don't have a unifying structure. They are extremely complicated compared to traditional text documents. There are enormous numbers of documents in the digital library of the web, and these documents are not organized in any specific order.

o The web is a dynamic data source:

The data on the internet is quickly updated. For example, news, weather, shopping, financial news, sports, and so on.

o Diversity of user communities:

The user community on the web is quickly expanding. These users have different interests, backgrounds, and usage purposes. There are over a hundred million workstations connected to the internet, and the number is still increasing tremendously.

o Relevancy of data:

A given person is generally interested in only a small portion of the web, while the rest of the web contains data that is of no interest to the user and may lead to unwanted results.

o The web is too broad:

The size of the web is tremendous and rapidly increasing. It appears that the web is too huge for data warehousing and data mining.

Mining the Web's Link Structures to Recognize Authoritative Web Pages:

The web comprises pages as well as hyperlinks pointing from one page to another. When the creator of a Web page creates a hyperlink pointing to another Web page, this can be considered the creator's endorsement of the other page. The collective endorsement of a given page by various creators on the web may indicate the significance of the page and may naturally lead to the discovery of authoritative web pages. The web linkage data provides rich information about the relevance, quality, and structure of the web's content, and thus is a rich source for web mining.
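
One well-known way to turn collective endorsement into scores is the HITS (hubs and authorities) algorithm. The notes do not name a specific method, so HITS is brought in here purely as an illustration. In this minimal sketch the page names, the link graph, and the iteration count are hypothetical.

```python
import math


def hits(graph, iterations=20):
    """Return (hub, authority) scores for a dict {page: [pages it links to]}."""
    pages = set(graph) | {t for targets in graph.values() for t in targets}
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the pages linking in.
        auth = {page: 0.0 for page in pages}
        for source, targets in graph.items():
            for target in targets:
                auth[target] += hub[source]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {page: v / norm for page, v in auth.items()}
        # Hub score: sum of the authority scores of the pages linked to.
        hub = {page: sum(auth[t] for t in graph.get(page, [])) for page in pages}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {page: v / norm for page, v in hub.items()}
    return hub, auth


# Hypothetical link graph: three portals endorse two content sites.
links = {
    "portal1": ["siteX", "siteY"],
    "portal2": ["siteX"],
    "portal3": ["siteX", "siteY"],
}
hub, auth = hits(links)
print(max(auth, key=auth.get))  # siteX
```

Pages that many good hubs point to receive high authority scores, while pages that point to many good authorities receive high hub scores, so the two scores reinforce each other across iterations.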

Application of Web Mining:

Web mining has extensive applications because of the various uses of the web. Some applications of web mining are listed below.

o Marketing and conversion tool.
o Data analysis on website and application performance.
o Audience behavior analysis.
o Advertising and campaign performance analysis.
o Testing and analysis of a site.

Data Mining Applications:

Data mining is primarily used today by companies with a strong consumer focus (retail, financial, communication, and marketing organizations) to “drill down” into their transactional data and determine pricing, customer preferences and product positioning, impact on sales, customer satisfaction and corporate profits. With data mining, a retailer can use point-of-sale records of customer purchases to develop products and promotions to appeal to specific customer segments.

14 areas where data mining is widely used

Here is the list of 14 other important areas where data mining is widely used:

Future Healthcare

o Data mining holds great potential to improve health systems. It uses data and
analytics to identify best practices that improve care and reduce costs. Researchers
use data mining approaches like multi-dimensional databases, machine learning,
soft computing, data visualization and statistics. Mining can be used to predict the
volume of patients in every category. Processes are developed that make sure that
the patients receive appropriate care at the right place and at the right time. Data
mining can also help healthcare insurers to detect fraud and abuse.

Market Basket Analysis

o Market basket analysis is a modelling technique based upon the theory that if you buy a certain group of items, you are more likely to buy another group of items. This technique may allow the retailer to understand the purchase behaviour of a buyer. This information may help the retailer to know the buyer’s needs and change the store’s layout accordingly. Using differential analysis, results can be compared between different stores and between customers in different demographic groups.
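
The claim that buying one group of items makes another group more likely is usually quantified with support and confidence. The sketch below uses the standard definitions: the support of an itemset is the fraction of baskets containing it, and the confidence of a rule A → B is support(A ∪ B) / support(A). The basket data and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= set(basket))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of buying `consequent` given `antecedent` was bought."""
    base = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base if base else 0.0


# Hypothetical point-of-sale baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))                 # 0.5
print(round(confidence(baskets, {"bread"}, {"milk"}), 3))  # 0.667
```

Running the same computation separately per store or per demographic group, and comparing the resulting rules, is one simple form of the differential analysis mentioned above.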

Education

o An emerging field called Educational Data Mining (EDM) is concerned with developing methods that discover knowledge from data originating in educational environments. The goals of EDM are identified as predicting students’ future learning behaviour, studying the effects of educational support, and advancing scientific knowledge about learning. Data mining can be used by an institution to make accurate decisions and also to predict students' results. With the results, the institution can focus on what to teach and how to teach. The learning patterns of students can be captured and used to develop techniques to teach them.

Manufacturing Engineering

o Knowledge is the best asset a manufacturing enterprise can possess. Data mining
tools can be very useful for discovering patterns in complex manufacturing processes.
Data mining can be used in system-level design to extract the relationships
between product architecture, product portfolio, and customer needs data. It can
also be used to predict product development span time, cost, and dependencies
among other tasks.

CRM

o Customer Relationship Management is all about acquiring and retaining customers,
improving customer loyalty, and implementing customer-focused strategies.
To maintain a proper relationship with a customer, a business needs to collect data
and analyse it. This is where data mining plays its part. With data
mining technologies, the collected data can be used for analysis. Instead of being
unsure where to focus to retain customers, those seeking a solution get filtered
results.

Fraud Detection

o Billions of dollars have been lost to fraud. Traditional methods of
fraud detection are time-consuming and complex. Data mining aids in providing
meaningful patterns and turning data into information. Any information that is valid
and useful is knowledge. A perfect fraud detection system should protect the
information of all users. A supervised method starts with collecting sample
records, which are classified as fraudulent or non-fraudulent. A model is built
from this data, and the algorithm then identifies whether a new record is
fraudulent or not.
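
The supervised approach described above (labelled sample records, a model built from
them, then classification of new records) can be sketched as follows. This is only an
illustration: the records, the two features, and the choice of a nearest-centroid
classifier are assumptions made for the example, not a method named in these notes.

```python
from math import dist
from statistics import mean

# Hypothetical labelled sample records:
# (transaction amount, transactions in the last hour) -> label.
labelled_records = [
    ((20.0, 1), "non-fraudulent"),
    ((35.0, 2), "non-fraudulent"),
    ((50.0, 1), "non-fraudulent"),
    ((900.0, 8), "fraudulent"),
    ((1200.0, 10), "fraudulent"),
]


def train(records):
    """Build the model: one mean feature vector (centroid) per label."""
    rows_by_label = {}
    for features, label in records:
        rows_by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in rows_by_label.items()
    }


def classify(model, features):
    """Return the label whose centroid is closest to the new record."""
    return min(model, key=lambda label: dist(model[label], features))


if __name__ == "__main__":
    model = train(labelled_records)
    print(classify(model, (1000.0, 9)))
    print(classify(model, (30.0, 1)))
```

A realistic system would scale the features, use far more records, and evaluate the
model on held-out data; the sketch only shows the train-then-classify shape of a
supervised method.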

Intrusion Detection

o Any action that compromises the integrity and confidentiality of a resource is an
intrusion. Defensive measures to avoid an intrusion include user
authentication, avoiding programming errors, and information protection. Data mining
can help improve intrusion detection by adding a level of focus to anomaly
detection. It helps an analyst distinguish unusual activity from common everyday
network activity. Data mining also helps extract the data that is most relevant to the
problem.
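
One simple way to separate unusual activity from everyday network activity is to flag
values that lie far from a baseline. The sketch below assumes a hypothetical history of
requests per minute and a threshold of three standard deviations; both are illustrative
choices, and real intrusion detection systems use much richer features.

```python
from statistics import mean, stdev

# Hypothetical baseline of normal requests per minute (illustrative data only).
baseline = [100, 104, 98, 101, 97, 103, 99, 102]


def is_anomalous(value, history, threshold=3.0):
    """Flag `value` if it is more than `threshold` sample standard
    deviations away from the mean of `history`."""
    mu = mean(history)
    sigma = stdev(history)
    return abs(value - mu) > threshold * sigma


if __name__ == "__main__":
    for observed in (102, 250):
        status = "anomalous" if is_anomalous(observed, baseline) else "normal"
        print(observed, status)
```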

Lie Detection

o Apprehending a criminal is easy, whereas bringing out the truth from him is difficult.
Law enforcement can use mining techniques to investigate crimes and monitor the
communications of suspected terrorists. This field also includes text mining, which
seeks to find meaningful patterns in data that is usually unstructured text.
Data samples collected from previous investigations are compared, and a model
for lie detection is created. With this model, processes can be created as
needed.

Customer Segmentation

o Traditional market research may help us segment customers, but data mining
goes deeper and increases market effectiveness. Data mining aids in grouping
customers into distinct segments so that offerings can be tailored to their
needs. Marketing is always about retaining customers. Data mining makes it possible to
find a segment of customers based on vulnerability, so that the business can make
them special offers and enhance satisfaction.

Financial Banking

o With computerised banking everywhere, a huge amount of data is
generated with new transactions. Data mining can contribute to solving business
problems in banking and finance by finding patterns, causalities, and correlations in
business information and market prices that are not immediately apparent to
managers because the volume of data is too large or is generated too quickly to be
screened by experts. Managers can use this information for better
segmenting, targeting, acquiring, retaining, and maintaining profitable customers.

Corporate Surveillance

o Corporate surveillance is the monitoring of a person’s or group’s behaviour by a
corporation. The data collected is most often used for marketing purposes or sold to
other corporations, but it is also regularly shared with government agencies. Businesses
can use it to tailor their products to what their customers want. The data
can be used for direct marketing purposes, such as the targeted advertisements on
Google and Yahoo, where ads are targeted to the user of the search engine by
analyzing their search history and emails.

Research Analysis

o History shows that we have witnessed revolutionary changes in research. Data
mining is helpful in data cleaning, data pre-processing, and the integration of databases.
Researchers can find similar data in the database that might change the
direction of the research. Co-occurring sequences and correlations between
activities can be identified. Data visualisation and visual data
mining provide us with a clear view of the data.

Criminal Investigation

o Criminology is a process that aims to identify crime characteristics. Crime
analysis includes exploring and detecting crimes and their relationships with
criminals. The high volume of crime datasets and the complexity of
relationships between these kinds of data have made criminology an appropriate
field for applying data mining techniques. Text-based crime reports can be
converted into word processing files. This information can be used to perform the
crime matching process.

Bioinformatics

o Data mining approaches seem ideally suited to bioinformatics, since it is data-rich.
Mining biological data helps to extract useful knowledge from massive datasets
gathered in biology, and in other related life sciences areas such as medicine and
neuroscience. Applications of data mining to bioinformatics include gene finding,
protein function inference, disease diagnosis, disease prognosis, disease treatment
optimization, protein and gene interaction network reconstruction, data cleansing,
and protein sub-cellular location prediction.
