SEMANTIC WEB AND SOCIAL
NETWORKS
Module 1:
Web Intelligence - Thinking and Intelligent Web
Applications, The Information Age, The World Wide Web,
Limitations of Today’s Web, The Next Generation Web,
Machine Intelligence, Artificial Intelligence, Ontology,
Inference engines, Software Agents, Berners-Lee's WWW,
Semantic Road Map, Logic on the Semantic Web.
Introduction
Semantic Web
The Semantic Web is a collaborative movement led by
the World Wide Web Consortium (W3C) that promotes
common formats for data on the World Wide Web.
The Semantic Web provides a common framework
that allows data to be shared and reused across
application, enterprise, and community boundaries.
The term was coined by Tim Berners-Lee, who described the
Semantic Web as a web of data that can be processed directly
and indirectly by machines.
Semantic Web
The main purpose of the Semantic Web is to drive the
evolution of the current Web by enabling users to find,
share, and combine information more easily.
Metadata
Data about data.
Data providing information about one or more aspects
of the data, such as:
Means of creation of the data
Purpose of the data
Time and date of creation
Creator or author of data
Location on a computer network where the data was created
Standards used
Metadata
Metadata tags provide a method by which computers can categorize the
content of web pages, for example
<meta name="keywords" content="computing, computer studies, computer" />
<meta name="description" content="Cheap widgets for sale" />
<meta name="author" content="John Doe" />
An example of a tag that would be used in a non-semantic web page:
<item>cat</item>
Encoding similar information in a semantic web page might look like this:
<item rdf:about="[Link]">cat</item>
Web crawler
Is a computer program that browses the World Wide
Web in a methodical, automated manner or in an
orderly fashion.
A Web crawler is one type of bot, or software agent. In
general, it starts with a list of URLs to visit, called the
seeds.
As the crawler visits these URLs, it identifies all the
hyperlinks in the page and adds them to the list of
URLs to visit, called the crawl frontier.
URLs from the frontier are recursively visited according
to a set of policies.
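A minimal Python sketch of this seed-and-frontier loop (assuming the third-party requests and beautifulsoup4 packages are installed; the function name and page limit are illustrative):

# Breadth-first crawl: start from the seeds, harvest hyperlinks,
# and push them onto the crawl frontier.
from collections import deque
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def crawl(seeds, max_pages=10):
    frontier = deque(seeds)            # the crawl frontier, seeded with start URLs
    visited = set()
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            page = requests.get(url, timeout=5)
        except requests.RequestException:
            continue                   # one possible policy: skip unreachable pages
        soup = BeautifulSoup(page.text, "html.parser")
        for anchor in soup.find_all("a", href=True):
            frontier.append(urljoin(url, anchor["href"]))  # add discovered links
    return visited

The max_pages cap stands in for the crawler's visiting policies; real crawlers also respect robots.txt and politeness delays.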
Web 1.0
Is a term used to describe the conceptual evolution of
the World Wide Web.
The core principle of Web 1.0 is a top-down approach to the
use of the WWW and its user interface.
Socially, users can only view web pages; they cannot
contribute to or comment on the content of the web pages.
Web 2.0
Is the term given to describe a second generation of the
World Wide Web that is focused on the ability for people
to collaborate and share information online.
Web 2.0 basically refers to the transition from static
HTML Web pages to a more dynamic Web that is more
organized and based on serving Web applications to
users.
Examples of Web 2.0 include social networking sites,
blogs, wikis, video sharing sites, hosted services.
Web 2.0
The client-side/web browser technologies used in Web
2.0 development are
Asynchronous JavaScript and XML (Ajax),
Adobe Flash and the Adobe Flex framework, and
JavaScript/Ajax frameworks such as YUI Library,
Dojo Toolkit, MooTools, jQuery and Prototype
Web 2.0 uses many of the same technologies as Web 1.0. Languages
such as
PHP, Ruby, Perl, Python, JSP, and [Link] are used by
developers to output data dynamically, using
information from files and databases.
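A minimal sketch of such dynamic output using only Python's standard library (the PRODUCTS list is a stand-in for rows read from a file or database):

# A tiny dynamic page: the HTML is generated per request from data,
# rather than stored as a static HTML file.
from http.server import BaseHTTPRequestHandler, HTTPServer

PRODUCTS = ["widget", "gadget"]        # stand-in for database rows

class DynamicHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        items = "".join(f"<li>{p}</li>" for p in PRODUCTS)
        body = f"<html><body><ul>{items}</ul></body></html>".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(body)

# HTTPServer(("localhost", 8000), DynamicHandler).serve_forever()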
Web 2.0
The Web 2.0 websites include the following features
and techniques:
Search
Finding information through keyword search.
Links
Connects information together into a meaningful information ecosystem using the model
of the Web, and provides low-barrier social tools.
Authoring
The ability to create and update content leads to the collaborative work of many rather
than just a few web authors. In wikis, users may extend, undo and redo each other's work.
In blogs, posts and the comments of individuals build up over time.
Web 2.0
Tags
Categorization of content by users adding "tags": short, usually one-word descriptions that
facilitate searching without dependence on pre-made categories.
Collections of tags created by many users within a single system may be referred to as
"folksonomies".
Extensions
Software that makes the Web an application platform as well as a document server. These
include software like Adobe Reader, Adobe Flash Player, Microsoft Silverlight, ActiveX,
Oracle Java, QuickTime, Windows Media, etc.
Signals
The use of syndication technology, such as RSS, to notify users of content changes.
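A minimal Python sketch of consuming such an RSS signal with the standard library (the feed URL is a placeholder; a real reader would poll it periodically):

# Fetch an RSS 2.0 feed and list its entries.
import urllib.request
import xml.etree.ElementTree as ET

with urllib.request.urlopen("https://example.org/feed.rss") as resp:
    root = ET.fromstring(resp.read())

for item in root.iter("item"):         # RSS 2.0 wraps each entry in <item>
    print(item.findtext("title"), item.findtext("link"))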
Web 3.0
Its most important features are the Semantic Web and
personalization.
Today's Web
A hypermedia, a digital library
A library of documents (called web pages) interconnected by a hypermedia of links
A database, an application platform
A common portal to applications accessible through web pages, and presenting their
results as web pages
A platform for multimedia
BBC Radio 4 anywhere in the world! Terminator 3 trailers
A naming scheme
Unique identity for those documents
A place
Where computers do the presentation (easy) and people do the linking and
interpreting (hard).
Limitations
Not everything is free on the Web.
Items on the Internet are not in a permanent format.
The Web address of a Web page can change.
Security
Database support
Lack of searching capability
Giving irrelevant information
Machine-to-human, not machine-to-machine
Thinking and Intelligent Web Applications
In general, thinking can be a complex process that uses
concepts, their interrelationships, and inference or
deduction, to produce new knowledge.
Thinking is often used to describe such disparate acts
as memory recall, arithmetic calculations, creating
stories, decision making, puzzle solving, and so on.
The term intelligence can be applied to nonhuman
entities.
Artificial Intelligence (AI) is the science of machines
simulating intelligent behavior.
Information Age
It is a time when large amounts of information are
widely available to many people, largely through
computer technology.
The Information Age, also commonly known as the
Computer Age or Digital Age, is an idea that the current
age will be characterized by the ability of individuals to
transfer information freely.
The World Wide Web (WWW)
The World Wide Web (WWW) or the Web is a
repository of information spread all over the world and
linked together.
The WWW has a unique combination of flexibility,
portability, and user-friendly features that distinguish it
from other services provided by the Internet.
The WWW uses the HTTP protocol, which allows
communication between a web browser and a web server.
Web pages can be created using HTML.
The World Wide Web (WWW)
The WWW today is a distributed client-server system, in which
a client using a web browser can access a service running on a
server.
Working of a Web
The WWW works on a client-server approach.
The following steps explain how the Web works:
1. The user enters the URL (say, [Link]) of the web page in the
address bar of the web browser.
2. The browser then asks the Domain Name Server for the IP address corresponding to
[Link].
3. After receiving the IP address, the browser sends a request for the web page to the web
server using the HTTP protocol, which specifies the way the browser and web server
communicate.
4. The web server receives the request using the HTTP protocol and searches for the
requested web page. If found, it returns the page to the web browser and closes the HTTP
connection.
5. The web browser receives the web page, interprets it, and displays its contents in the
browser's window. (A minimal sketch of these steps follows.)
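A minimal Python sketch of steps 1-5 using only the standard library (the host name is a placeholder for whatever URL the user entered):

import socket
import urllib.request

host = "example.org"
ip = socket.gethostbyname(host)        # step 2: ask DNS for the IP address
print("DNS says", host, "is at", ip)

# steps 3-5: request the page over HTTP and show what a browser would render
with urllib.request.urlopen(f"http://{host}/") as resp:
    page = resp.read().decode(errors="replace")
print(page[:200])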
HTTP
Hypertext Transfer Protocol (HTTP) is the network protocol used to deliver files and
data on the Web.
HTTP uses the client–server model:
An HTTP client opens a connection and sends a request message to an HTTP
server; the server then returns a response message, usually containing the resource
that was requested.
After delivering the response, the server closes the connection
HTTP
A simple HTTP exchange to retrieve a file at a URL first opens a socket to
the host [Link] at port 80 (the default), and then sends the following
through the socket:
GET /path/[Link] HTTP/1.0
From: someuser@[Link]
User-Agent: HTTPTool/1.0
The server responds with the HTTP status line and headers, followed by the HTML
"hello world" document:
HTTP/1.0 200 OK
Date: Fri, 31 Dec 1999 [Link] GMT
Content-Type: text/html
Content-Length: 1354
<html> <body> Hello World </body> </html>
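For comparison, a minimal Python sketch that reproduces this exchange over a raw socket (host and path are placeholders, not the elided values above; HTTP/1.0 closes the connection after one response, as described):

import socket

HOST, PATH = "example.org", "/index.html"   # placeholders

with socket.create_connection((HOST, 80)) as sock:
    request = (
        f"GET {PATH} HTTP/1.0\r\n"
        f"Host: {HOST}\r\n"
        "User-Agent: HTTPTool/1.0\r\n"
        "\r\n"                              # blank line ends the request headers
    )
    sock.sendall(request.encode())
    response = b""
    while chunk := sock.recv(4096):         # read until the server closes
        response += chunk
print(response.decode(errors="replace"))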
ARPANET
Advanced Research Projects Agency (ARPA)
The first ARPANET link was made on October 29, 1969,
between the University of California, Los Angeles and the Stanford Research
Institute. Only two letters were sent before the system crashed.
The ARPANET became a high-speed digital post office as
people used it to collaborate on research projects. It was a
distributed system of "many-to-many" connections.
WWW Architecture
WWW architecture is divided into several layers, described below:
WWW Architecture
IDENTIFIERS AND CHARACTER SET
A Uniform Resource Identifier (URI) is used to uniquely identify resources on
the Web, and Unicode makes it possible to build web pages that can be read
and written in many human languages.
SYNTAX
XML (Extensible Markup Language) helps to define a common syntax for the
Semantic Web.
DATA INTERCHANGE
The Resource Description Framework (RDF) defines the core
representation of data for the Web. RDF represents data about a resource in
graph form (a code sketch follows this slide).
TAXONOMIES
RDF Schema (RDFS) allows a more standardized description of
taxonomies and other ontological constructs.
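A minimal sketch of the data-interchange and taxonomy layers using the third-party rdflib package (the example.org URIs and class names are illustrative):

from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/")
g = Graph()

g.add((EX.Student, RDF.type, RDFS.Class))       # RDFS layer: a small taxonomy
g.add((EX.Student, RDFS.subClassOf, EX.Person))
g.add((EX.alice, RDF.type, EX.Student))         # RDF layer: data about a resource
g.add((EX.alice, EX.name, Literal("Alice")))

print(g.serialize(format="turtle"))             # the graph, written out as text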
WWW Architecture
ONTOLOGIES
The Web Ontology Language (OWL) offers more constructs than RDFS. Ontologies
are a formal way to describe taxonomies and classification networks.
OWL comes in the following three versions:
OWL Lite for taxonomies and simple constraints.
OWL DL for full description logic support.
OWL Full for maximum expressiveness and the syntactic freedom of RDF.
RULES
RIF and SWRL offer rules beyond the constructs available in RDFS and
OWL. The Simple Protocol and RDF Query Language (SPARQL) is an SQL-like language
used for querying RDF data and OWL ontologies (a query example follows this slide).
PROOF
All semantics and rules executed at the layers below are used at the Proof
layer to prove deductions.
CRYPTOGRAPHY
Cryptographic means, such as digital signatures, are used to verify the origin of
sources.
USER INTERFACE AND APPLICATIONS
On top of these layers, the User Interface and Applications layer is built for user interaction.
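Continuing the rdflib sketch from the previous slide, a minimal SQL-like SPARQL query over the same graph g:

results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?who WHERE { ?who a ex:Student . }
""")
for row in results:
    print(row.who)                 # -> http://example.org/alice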
LIMITATIONS OF TODAY’S WEB
1. The Web of today still relies on HTML, which is responsible for describing
how information is to be displayed and laid out on a web page.
2. The Web today does not have the ability for machine understanding and processing
of web-based information.
3. Web services on today's Web involve human assistance and rely on the
interoperation and inefficient exchange of two competing proprietary server
frameworks.
4. The Web is characterized by textual data augmented by pictorial and audio-
visual additions.
5. The Web today is limited to manual keyword searches, as HTML cannot
be fully exploited by information retrieval techniques.
6. Web browsers are limited to accessing existing information in a standard form.
7. On the Web, development of complex networks with meaningful content is
difficult.
8. Today's Web is restricted in the areas of search, database support, intelligent
business logic, automation, security, and trust.
MACHINE INTELLIGENCE
Machine intelligence combines a wide variety of advanced technologies to give
machines the ability to learn, adapt, make decisions, and
display behaviors not explicitly programmed into their
original capabilities. Some machine intelligence capabilities,
such as neural networks, expert systems, and self-organizing
maps, are available as plug-in components.
Machine intelligence capabilities add powerful analytical, self-
tuning, self-healing, and adaptive behavior to client applications.
They also comprise the core technologies for many advanced
data mining and knowledge discovery services.
ARTIFICIAL INTELLIGENCE
Artificial intelligence (AI) is the intelligence of machines.
It is the study and design of intelligent agents, where an intelligent
agent is a system that perceives its environment and takes actions
that maximize its chances of success.
ARTIFICIAL INTELLIGENCE
Intelligent agent:
Programs, used extensively on the Web, that perform tasks such as
retrieving and delivering information and automating repetitive tasks.
More than 50 companies are currently developing intelligent agent
software or services, including Firefly and WiseWire.
Agents are designed to make computing easier.
Currently they are used as Web browsers, news retrieval
mechanisms, and shopping assistants.
By specifying certain parameters, agents will "search" the Internet
and return the results directly back to the user's PC.
Branches of AI
1. Logical AI
2. Search
3. Pattern recognition
4. Representation
5. Inference
6. Common sense knowledge and reasoning
7. Learning from experience
8. Planning
9. Epistemology
10. Ontology
11. Heuristics
12. Genetic programming
Applications of AI
1. Game playing
2. Speech recognition
3. Understanding natural language
4. Computer vision
5. Expert systems
6. Heuristic classification
INFERENCE ENGINE
Inference means a conclusion reached on the basis of evidence
and reasoning.
In computer science, and specifically the branches of knowledge
engineering and artificial intelligence, an inference engine is a
“computer program that tries to derive answers from a
knowledge base”.
Architecture
The separation of inference engines as a distinct software
component stems from the typical production system
architecture.
This architecture relies on a data store (a working memory) and three components:
I. An interpreter. The interpreter executes the chosen agenda items by
applying the corresponding base rules.
II. A scheduler. The scheduler maintains control over the agenda by
estimating the effects of applying inference rules in light of item priorities
or other criteria on the agenda.
III. A consistency enforcer. The consistency enforcer attempts to maintain a
consistent representation of the emerging solution.
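A minimal Python sketch of this production-system loop (the facts and rules are illustrative; the three components are marked in comments):

facts = {"has_feathers", "lays_eggs"}
rules = [                                      # (premises, conclusion) pairs
    ({"has_feathers", "lays_eggs"}, "is_bird"),
    ({"is_bird"}, "can_fly"),
]

agenda = list(rules)
while agenda:
    # scheduler: choose the next applicable rule on the agenda
    fired = None
    for premises, conclusion in agenda:
        if premises <= facts and conclusion not in facts:
            fired = (premises, conclusion)
            break
    if fired is None:
        break
    # interpreter: apply the chosen rule
    facts.add(fired[1])
    # consistency enforcer (trivial here): facts is a set, so no duplicates
    agenda.remove(fired)

print(facts)   # {'has_feathers', 'lays_eggs', 'is_bird', 'can_fly'}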
Logic
In logic, a rule of inference, inference rule, or transformation rule can be
interpreted as a function that takes premises, analyzes their syntax,
and returns a conclusion; applying it is the act of drawing a conclusion
based on the form of the premises.
Expert System
An expert system is a computer system that emulates the decision-
making ability of a human expert.
ONTOLOGY
Ontology is a formal specification of a shared conceptualization.
An ontology defines a set of representational primitives with which
to model a domain of knowledge or discourse.
The representational primitives are typically classes (or sets),
attributes (or properties), and relationships (or relations among
class members).
In the context of database systems, an ontology can be viewed as a
level of abstraction above data models. Ontologies are typically
specified in languages that allow abstraction away from data
structures and implementation strategies.
ONTOLOGY
In the technology stack of the Semantic Web standards,
ontologies are called out as an explicit layer.
Ontology defines (specifies) the concepts, relationships, and
other distinctions that are relevant for modeling a domain.
The specification takes the form of the definitions of
representational vocabulary (classes, relations, and so forth),
which provide meanings for the vocabulary and formal
constraints on its coherent use
ONTOLOGY : KEY APPLICATIONS
Ontologies are part of the W3C standards stack for the Semantic
Web, in which they are used to specify standard conceptual
vocabularies in which to exchange data among systems, provide
services for answering queries, publish reusable knowledge
bases, and offer services to facilitate interoperability across
multiple, heterogeneous systems and databases.
The key role of ontologies with respect to database systems is to
specify a data modeling representation at a level of abstraction
above specific database designs (logical or physical), so that data
can be exported, translated, queried, and unified across
independently developed systems and services.
Successful applications to date include database interoperability,
cross database search, and the integration of web services.
SOFTWARE AGENT
In computer science, a software agent is a software program that
acts for a user or other program in a relationship of agency,
which derives from the Latin agere (to do): an agreement to act on
one's behalf.
The basic attributes of a software agent are:
Agents are not strictly invoked for a task, but activate themselves;
Agents may reside in wait status on a host, perceiving context;
Agents may get to run status on a host upon starting conditions;
Agents do not require user interaction;
Agents may invoke other tasks, including communication.
SOFTWARE AGENT
Various authors have proposed different definitions of agents
These commonly include concepts such as
Persistence (code is not executed on demand but runs
continuously and decides for itself when it should perform
some activity)
Autonomy (agents have capabilities of task selection,
prioritization, goal-directed behavior, decision-making without
human intervention)
Social ability (agents are able to engage other components
through some sort of communication and coordination; they
may collaborate on a task)
Reactivity (agents perceive the context in which they operate
and react to it appropriately).
Distinguishing agents from programs
Related and derived concepts include Intelligent agents (in particular
exhibiting some aspect of Artificial Intelligence, such as learning and
reasoning), autonomous agents (capable of modifying the way in
which they achieve their objectives)
Distributed agents (being executed on physically distinct computers),
multi-agent systems (distributed agents that do not have the
capabilities to achieve an objective alone and thus must
communicate)
Mobile agents (agents that can relocate their execution onto different
processors).
Examples of intelligent software agents
Haag (2006) suggests that there are only four essential types of
intelligent software agents:
1. Buyer agents or shopping bots
2. User or personal agents
3. Monitoring-and-surveillance agents
4. Data Mining agents
Examples of intelligent software agents
1. Buyer agents (shopping bots)
Buyer agents travel around a network (e.g., the Internet) retrieving
information about goods and services.
These agents, also known as 'shopping bots', work very efficiently
for commodity products such as CDs, books, electronic
components, and other one-size-fits-all products.
Examples of intelligent software agents
2. User (personal) agents
User agents, or personal agents, are intelligent agents that take
action on your behalf. They perform the following tasks:
Check your e-mail, sort it according to the user's order of preference, and alert
you when important emails arrive.
Play computer games as your opponent or patrol game areas for you
Assemble customized news reports for you.
Find information for you on the subject of your choice
Fill out forms on the Web automatically for you, storing your information for
future reference
Scan Web pages looking for and highlighting text that constitutes the "important"
part of the information there
"Discuss" topics with you ranging from your deepest fears to sports
Facilitate online job searches by scanning known job boards and sending
your résumé to opportunities that meet the desired criteria.
Profile synchronization across heterogeneous social networks
Examples of intelligent software agents
3. Monitoring-and-surveillance (predictive) agents
Monitoring and Surveillance Agents are used to observe and report
on equipment, usually computer systems.
The agents may keep track of company inventory levels,
observe competitors' prices and relay them back to the company,
watch for stock manipulation by insider trading and rumors, etc.
Examples of intelligent software agents
4. Data mining agents
This agent uses information technology to find trends and patterns in
an abundance of information from many different sources.
The user can sort through this information in order to find whatever
information they are seeking.
A data mining agent operates in a data warehouse discovering
information.
A "data warehouse" brings together information from lots of different
sources.
"Data mining" is the process of looking through the data warehouse to
find information that you can use to take action, such as ways to
increase sales or keep customers who are considering defecting.
TIM BERNERS-LEE WWW
When Tim Berners-Lee was developing the key elements of the World
Wide Web, he showed great insight in providing the Hypertext Markup
Language (HTML) as a simple, easy-to-use Web development
language.
The continuing evolution of the Web into a resource with intelligent
features, however, presents many new challenges.
The solution of the World Wide Web Consortium (W3C) is to
provide a new Web architecture that uses additional layers of
markup languages that can directly apply logic.
This module examines the impact of adding formal logic to the Web
architecture and presents the new markup languages leading to the
future Web architecture: the Semantic Web.
About TIM BERNERS-LEE
Tim Berners-Lee was born in London, England, in 1955. His parents
were computer scientists who met while working on the Ferranti
Mark I, the world's first commercially sold computer.
Berners-Lee studied physics at Oxford and graduated in 1976. Between
1976 and 1980, he worked at Plessey Telecommunications Ltd.,
followed by D. G. Nash Ltd.
In 1980, he was a software engineer at CERN, the European Particle
Physics Laboratory, in Geneva, Switzerland, where he learned the
laboratory's complicated information system.
In 1989, Berners-Lee and a team of colleagues developed HTML,
an easy-to-learn document coding system that allows users to click
on a link in a document's text and connect to another document.
About TIM BERNERS-LEE
Tim Berners-Lee also created an addressing plan that allowed each
Web page to have a specific location known as a URL.
Finally, he completed HTTP, a system for linking these documents
across the Internet.
He also wrote the software for the first server and the first Web
client browser that would allow any computer user to view and
navigate Web pages, as well as create and post their own Web
documents
Semantic Roadmap
Logic on the Semantic Web
The goal of the Semantic Web is different from most systems of
logic.
The Semantic Web's goal is to create a unifying system where a
subset is constrained to provide the tractability and efficiency
necessary for real applications.
However, the Semantic Web itself does not actually define a
reasoning engine, but rather follows a proof of a theorem.
This mimics an important comparison between conventional
hypertext systems and the original Web design.
The original Web design dropped link consistency in favor of
expressive flexibility and scalability.
The result allowed individual Web sites to have a strict hierarchical
order or matrix structure, but it did not require it of the Web as a
whole.
Logic on the Semantic Web
The Semantic Web cannot find answers, and it cannot even check that
an answer is correct, but it can follow a simple explanation (a proof)
that an answer is correct.
The Semantic Web as a source of data would permit many kinds of
automated reasoning systems to function, but it would not be a
reasoning system itself.
Logic on the Semantic Web
The logic of the Semantic Web is proceeding in a step-by-step
approach building one layer on top of another.
Three important technologies for developing the Semantic Web are:
1. Resource Description Framework
2. Ontology
3. Web Ontology Language
Logic on the Semantic Web
1. Resource Description Framework
The Resource Description Framework (RDF) is a model of statements made about resources
and their associated URIs.
Its statements have a uniform structure of three parts:
subject, predicate, and object.
Using RDF, statements can be formulated in a structured manner.
This allows software agents to read as well as act on such statements.
A set of statements can be expressed as a graph,
as a series of (subject, predicate, object) triples, or in XML form.
The first form is the most convenient for communication between people,
the second allows efficient processing,
and the third allows flexible communication with agent software. (A small code sketch follows.)
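A minimal sketch, again with the third-party rdflib package, showing one statement written out in the three forms just mentioned (the URI and literal are illustrative):

from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.page, EX.author, Literal("John Doe")))   # (subject, predicate, object)

print(g.serialize(format="turtle"))   # graph notation, readable by people
print(g.serialize(format="nt"))       # one triple per line, easy to process
print(g.serialize(format="xml"))      # XML form, for exchange with agent software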
Logic on the Semantic Web
2. Ontology
Ontology is an agreement between software agents that exchange information.
Such an agreement supplies what the agents need in order to interpret
the structure of the exchanged data and to understand the vocabulary
used in the exchanges.
Using an ontology, new information can be inferred by
applying and extending the logical rules present in the ontology, as in the sketch below.
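A minimal sketch of such rule-based inference, applying the rdfs:subClassOf rule by hand with the third-party rdflib package (the URIs are illustrative):

from rdflib import Graph, Namespace, RDF, RDFS

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.Student, RDFS.subClassOf, EX.Person))   # shared ontology
g.add((EX.alice, RDF.type, EX.Student))           # exchanged data

# rule: if x is of type C and C is a subclass of D, then x is of type D
inferred = []
for x, _, c in g.triples((None, RDF.type, None)):
    for _, _, d in g.triples((c, RDFS.subClassOf, None)):
        inferred.append((x, RDF.type, d))
for triple in inferred:
    g.add(triple)                                 # new, inferred information

print((EX.alice, RDF.type, EX.Person) in g)       # True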
Logic on the Semantic Web
3. Web Ontology Language
This language is a vocabulary extension of RDF and is currently evolving into
the semantic markup language for publishing and sharing ontologies on the
World Wide Web.
Web Ontology Language facilitates greater machine readability of Web content
than that supported by XML, RDF, and RDFS by providing additional
vocabulary along with formal semantics.
OWL can be expressed in three sublanguages: OWL Lite, OWL DL, and OWL
Full.
END OF MODULE 1