Data Mining
NAME : SANTHOSHKUMAR .S
REGNO : 2021503302
NAME : RITHICK.S
REGNO : 2021503040
DATA MINING - WORLD WIDE WEB
Over the last few years, the World Wide Web has become a significant source of information and simultaneously
a popular platform for business.
Web mining can be defined as the use of data mining techniques and algorithms to extract useful
information directly from the web, such as Web documents and services, hyperlinks, Web content, and server logs.
The World Wide Web contains a large amount of data, which makes it a rich source for data mining.
The objective of Web mining is to look for patterns in Web data by collecting and examining that data in order to
gain insights.
WHAT IS WEB MINING?
Web mining can broadly be seen as the application of adapted data mining techniques to the web, whereas data
mining is defined as the application of algorithms to discover patterns in mostly structured data, embedded in
a knowledge discovery process.
A distinctive property of web mining is that it deals with a variety of data types. The web has multiple aspects
that call for different mining approaches: web pages consist of text, web pages are linked via hyperlinks, and
user activity can be monitored via web server logs.
These three features lead to the distinction between three areas: web content mining, web structure mining, and
web usage mining.
PROCESS OF WEB MINING
Web mining can be broadly divided into three types of mining techniques: Web Content Mining, Web Structure
Mining, and Web Usage Mining. Each type is explained below.
THERE ARE THREE TYPES OF WEB MINING:
1. WEB CONTENT MINING:
Web content mining is used to extract useful data, information, and knowledge from the content of web pages.
In web content mining, each web page is treated as an individual document. The analyst can take advantage of the
semi-structured nature of web pages, because HTML provides information about the logical structure of a page
as well as its layout.
The primary task of content mining is data extraction, where structured data is extracted from unstructured
websites.
The objective is to make it easier to aggregate data across various web sites by using the extracted structured data.
Web content mining can also be used to distinguish topics on the web. For example, when a user searches for a
specific topic on a search engine, the user gets a list of suggestions.
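The data extraction step described above can be sketched with Python's standard `html.parser` module. This is only a minimal illustration, not a full extraction pipeline: the `PageExtractor` class, the `extract` helper, the choice of fields (title, headings, links), and the sample page are all assumptions made for this example.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collects the title, headings, and link targets of one HTML page."""

    def __init__(self):
        super().__init__()
        self.record = {"title": "", "headings": [], "links": []}
        self._current = None  # tag whose text is currently being read

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1", "h2"):
            self._current = tag
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.record["links"].append(href)

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._current is None:
            return  # ignore text outside the title and headings
        if self._current == "title":
            self.record["title"] += text
        else:
            self.record["headings"].append(text)


def extract(html):
    """Turn one semi-structured HTML page into a structured record."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return parser.record


# Hypothetical page used only for illustration.
sample_page = """
<html><head><title>Laptop deals</title></head>
<body>
  <h1>Today's offers</h1>
  <p>See our <a href="/laptops/basic">basic</a> and
     <a href="/laptops/pro">pro</a> models.</p>
  <h2>Shipping</h2>
</body></html>
"""

record = extract(sample_page)
print(record)
```

Records produced this way for many pages share the same fields, which is what makes aggregation across web sites possible.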
2. WEB STRUCTURE MINING:
Web structure mining is used to discover the link structure formed by hyperlinks. It identifies how pages are
connected, whether through links between web pages or through a direct link network.
In web structure mining, the analyst treats the web as a directed graph in which the web pages are the vertices
and the hyperlinks are the edges that connect them.
The most important application in this regard is the Google search engine, which estimates the ranking of its
results primarily with the PageRank algorithm.
PageRank treats a page as highly relevant when it is frequently linked to by other highly relevant pages.
Structure and content mining methodologies are usually combined. For example, web structure mining can help
organizations examine the network of links between two commercial sites.
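The idea behind PageRank can be sketched as a simple power iteration over a small directed graph. This is a simplified textbook version, not Google's production algorithm. The damping factor of 0.85, the iteration count, the `pagerank` function, and the toy graph are assumptions chosen for illustration, and the sketch assumes every linked page also appears as a key in the graph.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Estimate page ranks for a graph mapping each page to its outgoing links."""
    pages = list(graph)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform ranking

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in graph.items():
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: pages are vertices, hyperlinks are directed edges.
toy_web = {
    "home": ["products", "about"],
    "products": ["home"],
    "about": ["home", "products"],
    "blog": ["home"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

In this toy graph, "home" ends up with the highest score because every other page links to it, while "blog" has the lowest score because no page links to it.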
3. WEB USAGE MINING:
Web usage mining is used to extract useful data, information, and knowledge from weblog records, and it helps in
recognizing user access patterns for web pages.
In web usage mining, the analyst examines the records of requests made by visitors to a website, which are
often collected as web server logs.
While the content and structure of a collection of web pages reflect the intentions of the authors of the pages,
the individual requests show how users actually see these pages.
Web usage mining may reveal relationships that were not intended by the creator of the pages.
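A very small usage mining step can be sketched by counting which page visitors request after which other page. The sketch assumes log lines in the widely used Common Log Format, already in chronological order, and treats each client address as one visitor, which is a simplification. The `LOG_PATTERN` regular expression, the `page_transitions` function, and the sample log lines are made up for this example.

```python
import re
from collections import Counter, defaultdict

# Matches the fields of a Common Log Format line that this sketch needs.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical server log used only for illustration.
sample_log = """\
10.0.0.1 - - [10/Oct/2023:13:55:01 +0000] "GET /home HTTP/1.1" 200 512
10.0.0.2 - - [10/Oct/2023:13:55:04 +0000] "GET /home HTTP/1.1" 200 512
10.0.0.1 - - [10/Oct/2023:13:55:09 +0000] "GET /products HTTP/1.1" 200 2048
10.0.0.2 - - [10/Oct/2023:13:55:15 +0000] "GET /products HTTP/1.1" 200 2048
10.0.0.1 - - [10/Oct/2023:13:55:30 +0000] "GET /checkout HTTP/1.1" 200 1024
10.0.0.2 - - [10/Oct/2023:13:55:41 +0000] "GET /home HTTP/1.1" 200 512
"""


def page_transitions(log_text):
    """Count how often visitors move from one page directly to another."""
    visits = defaultdict(list)  # client address -> ordered list of requested paths
    for line in log_text.splitlines():
        match = LOG_PATTERN.match(line)
        if match and match.group("status") == "200":
            visits[match.group("host")].append(match.group("path"))

    transitions = Counter()
    for paths in visits.values():
        # Pair each request with the next request from the same visitor.
        transitions.update(zip(paths, paths[1:]))
    return transitions


transitions = page_transitions(sample_log)
for (source, target), count in transitions.most_common():
    print(f"{source} -> {target}: {count}")
```

Frequent transitions such as `/home -> /products` are the kind of access pattern that usage mining looks for, including paths the page authors did not plan.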
CHALLENGES OF WEB MINING
Complexity of web pages: Web pages have no uniform structure, so they are far more complex than conventional
text documents. The web can be seen as a digital library containing a vast number of documents, but these
documents are not arranged in any set order for the user.
Dynamic data source on the internet: Online data is updated in real time. Content such as news, weather,
fashion, finance, and sports changes constantly, which makes it difficult to capture accurately.
Data relevancy: A particular user is typically interested in only a small portion of the web. The remaining
portion contains data that is unfamiliar to the user, is hard for the user to verify, and may produce unwanted
results.
Size of the web: The web is very large and continues to grow quickly.
APPLICATIONS OF WEB DATA MINING
Web data mining has various applications, including market research, competitive analysis, personalization, and
predictive analytics.
Each application is significant because it leverages web data to achieve business objectives.
APPLICATION OF WEB MINING:
Web mining has a wide range of applications because the web is used in so many ways. Some applications of web
mining are listed below.
Marketing and conversion tool.
Data analysis of website and application performance.
Audience behavior analysis.
Advertising and campaign performance analysis.
Testing and analysis of a site.
THANK YOU…