Ics106 Full

This document discusses information literacy, which is the ability to recognize when information is needed and locate, evaluate, and effectively use that information. It defines information literacy and describes various models of information literacy skills. It also discusses different types of literacies important in the digital age like digital literacy, technological literacy, and media literacy.

Uploaded by

yusufsherifat83
Copyright
© © All Rights Reserved

ICS 106: INFORMATION LITERACY FOR THE DIGITAL AGE (2 CREDITS, COMPULSORY COURSE)


OVERVIEW OF THE COURSE:

The digital age has contributed to unprecedented growth in the use of information
in various forms and from various sources. Because of the complexity of the
information environment, individuals face diverse and abundant information choices
in their academic studies, in their workplaces and in their personal lives. Such
information is available in libraries, community resources, organizations, the media,
the Internet, CD-ROMs, databases and information centers.

The importance of information literacy should not be underestimated in an information
environment. Information literacy is common to all disciplines and all learning
environments. It enables learners to master content, extend their investigations,
become more self-directed and assume greater control over their learning. In essence,
information literacy forms the basis for lifelong learning. Varying information
literacy skills will enable users of information to accomplish great tasks.

1.1 WHAT IS INFORMATION LITERACY

Information Literacy: The ability to search for, and hence access, appropriate
information across a range of genres, formats and systems, together with the ability
to sift, scan and sort information.

Most of the definitions of information literacy have been in terms of the information
literate person rather than information literacy itself. The following definitions will be
adopted for the purpose of this course.

According to the American Library Association (ALA, 1989), “to be information literate,
a person must be able to recognize when information is needed and have the ability
to locate, evaluate and use effectively the needed information.”

Doyle (1992) defines an information literate person as one who does the following:

• Recognizes the need for information.
• Recognizes that accurate and complete information is the basis for intelligent
  decision making.
• Identifies potential sources of information.
• Develops successful search strategies.
• Accesses sources of information, including computer-based and other
  technologies.
• Evaluates information.
• Organizes information for practical application.
• Integrates new information into an existing body of knowledge.
• Uses information in critical thinking and problem solving.

Christine Bruce (1997b) identified seven different ways of experiencing information
literacy, known as ‘the seven faces of information literacy’. These are:

Category one: using information technology for information retrieval and
communication.

Category two: finding information located in different sources.

Category three: executing a process for finding and using information.

Category four: controlling and storing information for easy retrieval.

Category five: building a personal knowledge base in a new area of interest.

Category six: working with knowledge and personal perspectives to gain novel
insights.

Category seven: using information wisely for the benefit of others.

Lenox and Walker (1993) also defined information literacy by characterizing the
information literate person as one who has the analytical and critical skills to
formulate research questions and evaluate results, and the skills to search for
and access a variety of information types in order to meet his or her information
needs.

Shapiro and Hughes (1996) provided a broader perspective on information literacy.
They refer to it as a new liberal art that extends from knowing how to use
computers and access information to critical reflection on the nature of information
itself, its technical infrastructure, and its social, cultural and philosophical
context and impact.

The Prague Declaration (2003) states that information literacy encompasses knowledge
of one’s information needs and the ability to identify, locate, evaluate, organize and
effectively use information to address issues or problems at hand; that it is a
prerequisite for participating in the information society; and that it is part of the
basic human right of lifelong learning.

The Chartered Institute of Library and Information Professionals (CILIP) produced a
definition of information literacy in 2005. Information literacy is knowing when and
why you need information, where to find it and how to evaluate, use and communicate
it in an ethical manner. Information literacy is an essential and discrete dexterity:
everyone relies on information every day (CILIP, 2004).

Critical analyses of the various definitions of information literacy show that anybody
who possesses the information literacy skills is the master of his own learning. He
goes from simply finding and learning facts to the process of creating new
information. Knowledge creation includes:

Prospecting: discovering relevant information. The skills required are selection and
navigation, and then sorting, sifting and selecting pertinent and accurate data.

Interpreting: translating data and information into knowledge.

Creating new ideas (innovation): showing insight and understanding as new knowledge
is developed, not a rehash of old information.

1.2 INFORMATION LITERACY SKILLS

Information literacy skills are skills you will need throughout your lifetime. We are
always seeking information in order to take vital decisions, make informed choices and
communicate effectively. We need to continually improve our searching, evaluating
and communicating skills in order to remain relevant in a changing information
environment. It should be noted that one cannot become information literate or
communication literate overnight. Just as with speaking or writing skills, your abilities
will improve over time as you gain expertise in the topics you choose to investigate.
This process will give you practice in searching for and evaluating the information you
encounter, and will allow you to create new ideas and communicate them to others
using a variety of technological tools.

1.3 MODEL OF INFORMATION SKILLS

The Big Six model designed by Michael Eisenberg and Robert Berkowitz (2001) is
adopted here because it is one of the most widely used models of information skills.
This model defines how an information literate student can approach research. For
instance, how would you investigate a research topic?

• Defining your problem (The what?)
- What is my thesis or problem?
- What information do I need?
- What do I already know?
- What more do I need to find out?

• Information seeking strategies (The how and which?)
- How can I find the information I need?
- What are the best possible sources?
- Which databases are the best choices?
- Which types of sources will help me solve my information problem?

• Selecting and evaluating your resources (Examination)
- Critically analyze your sources.
- Examine your sources for relevance, currency, accuracy, credibility,
  appropriateness and bias.

• Organizing and restructuring your information (Organization)
- Organize your information so that it makes sense to you and others.

• Communicating the result of your research (Communication)
- Who is my audience?
- How can I most effectively share this information with this audience?
- Which would be the best format for communicating the results of my research?
  PowerPoint? Video? Essay? Debate? Speech? Traditional paper?
- What do I need to do this presentation? Equipment? Software?
- Have I included everything I want to share?
- Have I proofread, edited and truly finished my project?

• Evaluating your work (Self-evaluation)

The product:
- Am I proud of the product? Was it effective?
- Did I meet the guidelines or follow the rubric for the project?
- Am I sure I did not plagiarize from any of my sources?
- Is this the best work I could have done?

The process:
- Did I explore the full scope of available resources and select the best?
- Did I approach the research process energetically?
- Did I search electronic resources (the Web and licensed databases) using effective,
  efficient, strategic search strategies?

1.4 TYPES OF LITERACIES IN THE DIGITAL AGE

Digital literacy: The ability to use digital technology, communication tools or
networks to locate, evaluate, use and create information. E.g. surfing the internet,
downloading of digital information.

Technological Literacy: The innate ability to discover how a new or evolved technology
operates, recognizing its limitations and benefits, and the ability to choose the most
appropriate tool to access and process information and present new knowledge and
understanding. E.g. scanners, photocopiers, answering machines and ATMs.

Media Literacy: The ability to synthesize a wide range of viewpoints and interpretations
from a variety of media and build a concise model of understanding of those ideas.

1.5 PROBLEMS OF THE DIGITAL AGE

The digital age is an economy that is powered by technology, fuelled by information
and driven by knowledge. People, organizations and institutions have developed new
skills in manipulating and using information and communication technology (ICT)
tools to meet their individual and organizational needs.

Traditional computer-based technologies and digital communication technologies have
enabled individuals, and most especially organizations, to communicate and share
information digitally.

The digital age has redefined the way things are done, especially in the workplace.
Digital information, which has become a valuable asset in organizations, is better
managed. Computers have enabled information to be stored electronically,
manipulated, transferred and retrieved for future use.

The digital age is not without its challenges. These challenges are highlighted
below:

• The need for basic knowledge and skills in the use of ICT application tools.
• Unequal access to technology, especially in the rural areas (the digital divide).
• Copyright problems: duplication and copying of other people’s work.
  Copyright laws are meant to prevent such acts.
• Plagiarism.
• Lack of electricity and telecommunications infrastructure.
• Internet fraud: hacking and online fraud in which criminals use spam to obtain
  personal details for misuse (security issues).
• Lack of IT professionals.

1.6 THE IMPACT OF INFORMATION AND COMMUNICATION TECHNOLOGIES (ICT) IN THE DIGITAL AGE

ICT is an acronym that stands for Information and Communication Technologies. There
is no universally accepted definition of ICT because the concepts, methods and
applications involved in ICT are constantly changing and evolving.

In a nutshell, ICT covers any product that will store, retrieve, manipulate, transmit or
receive information electronically in a digital form. Examples include personal
computers, digital television, internet, satellite communication gadgets, e-mail,
robots etc. ICT is concerned with storage, retrieval, manipulation, transmission or
receipt of digital data.

ICT is an umbrella term that includes any communication device or application,
encompassing radio, television, cellular phones, computers and networks, hardware,
software, satellite systems and so on, as well as the various services and applications
associated with them, such as video conferencing and distance learning.

ICT has had a positive impact on all sectors of the economy: health, education,
banking, agriculture, government and politics.

ICT can be categorized into two broad categories:

I. Traditional computer-based technologies: These involve working with a personal
   computer at home or at work through various application programs such as word
   processing, spreadsheets, database software (Oracle, Microsoft SQL Server,
   Access), presentation software (Microsoft PowerPoint, used along with a
   computer screen and projector) and desktop publishing (Adobe InDesign,
   Microsoft Publisher, used to produce newsletters, magazines and other complex
   documents).
II. Digital communication technologies: These technologies allow people and
   organizations to communicate and share information digitally over some
   distance, e.g. satellite communication links, Local Area Networks, Wide Area
   Networks, wireless technologies and broadband connectivity.

1.7 IMPACT OF ICT ON THE NIGERIAN ECONOMY

• ICT is the bedrock of national development and survival. It gives access to global
  information that can improve all sectors of the economy (health, education,
  agriculture and finance) and so enhance development.
• It enhances national security and law enforcement by facilitating efforts to
  combat crime through sophisticated security monitoring gadgets such as
  Closed Circuit Television (CCTV), ICT-based security networks and online
  surveillance of streets, seaports etc.
• Participation in the international market. A lot of online transactions are carried
  out via the internet. For instance, the internet has changed the way goods and
  services are produced, delivered, sold and purchased.
• Creation of wealth: IT engineers and software developers have been developing
  indigenous IT products to earn foreign exchange.
• ICT can create job opportunities: software and IT companies are absorbing
  graduates with IT skills. This helps to reduce unemployment and poverty in the
  economy.
• Competitive advantage for the public and private sectors. These sectors are able
  to offer value-added products and services through faster communication with
  their customers, promotional strategies and online distribution of their
  products.

1.7.1 IMPACT OF ICT ON EDUCATION

• Management of student records (online application, admission, registration
  and examination procedures).
• Learner management systems (e-learning and virtual libraries).
• Online communication (e-mail, video conferencing, SMS, internet chat etc.).
• Distance learning and training.
• Basic computer skills.
• Accessing global information to widen your knowledge base.

1.7.2 IMPACT OF ICT ON THE HEALTH SECTOR

• Access to a wide range of medical journals online.
• E-learning (training of health professionals and community health workers).
• Distance learning via internet-based materials.
• Dissemination of health information to the public, e.g. health talks on HIV,
  cancer, pregnancy etc.
• Development of Health Management Information Systems to plan and manage
  healthcare services.

1.7.3 IMPACT OF ICT ON GOVERNMENT

• It increases the efficiency of government.
• It promotes transparency.
• It enriches the lives of the people through the use of modern-day technologies,
  such as medical treatment databases and cellphones, to improve livelihoods.
• It enhances policy development that would improve the plight of the people.
• It provides access to government information.
• It supports information management and retrieval for future purposes.
• It enables social interaction between the government and the governed, e.g.
  through radio and television programmes.

1.8 BANE OF ICT

• Digital divide: unequal access to information (the digital haves and have-nots).
• Huge initial capital investment in hardware and software.
• ICT relies on physical infrastructure (electricity, telecommunications), and even
  when such facilities are in place, difficulties arise when they are poorly
  maintained.
• ICT is dependent on the skills and capacity necessary to use, manage and
  maintain the technology effectively.
• The majority of information exchanged via ICT, whether in text format or
  broadcast orally, is in the languages of the developed countries. Therefore, there
  is a need to address the language and cultural barriers in ICT through
  significant investment in, and support for, local content in broadcasting, the
  internet and software design.

HISTORICAL PERSPECTIVES OF THE INTERNET
The internet was established in 1969 by the Advanced Research Projects Agency
(ARPA) of the Department of Defense, United States of America. The internet evolved
from a network called ARPANET (Advanced Research Projects Agency Network),
developed by the military to combat communication problems that were anticipated
in the event of a nuclear war.

The first services developed on ARPANET were remote Telnet access and file
transfers. The users of ARPANET adopted electronic mail (e-mail) for data sharing
across the globe. From e-mail, a system of discussion groups which became known
as USENET emerged. TCP/IP (Transmission Control Protocol/Internet Protocol), which
is now used on the internet, assisted a great deal in the transmission process on the
ARPANET and is the roadmap of internet communication. TCP/IP was developed in
order to allow all of the U.S. military computers to communicate with each other
easily, regardless of the manufacturer of the system. TCP/IP is the universal
translator between all of the different hardware combinations that might exist.

The internet can be described as a “network of networks” which links millions of
computer networks and individual computers, from universities, governments and
individual personal computers, into an electronic web that permits information to
be transferred via telephone lines and cables. In other words, the internet is a
mechanism for information dissemination and a medium for collaboration and
interaction between individuals and their computers regardless of geographic
location. Being a large-scale network of millions of computers, the internet allows
for continuous communication across the globe.

The various applications and tools of the internet are as follows:


• World Wide Web (WWW or W3)
• Web Browsers
• Electronic Mail (E-mail)
• File Transfer Protocol (FTP)
• Internet Relay Chat (IRC)
• Usenet (news service)
• Hypertext Markup Language (HTML)
• Gopher
• Network News Transfer Protocol (NNTP)

World Wide Web (WWW): The web is a collection of all browsers, servers, files
and browser-accessible services available through the internet. The web was created
by a computer scientist named Tim Berners-Lee. It is the most widely used
service of the internet and is accessed through a web browser. The web can be
described as a collection of web pages linked together with hypertext links. The web
pages are multimedia in nature, comprising text, pictures, sounds and graphics.

Web Browser: It is a type of software that retrieves and presents information
resources on the internet. The information resource can be text, images, sound,
video or another type of content. Examples include Microsoft Internet Explorer,
Mozilla Firefox, Opera, Safari, Google Chrome etc. There are two types of browsers:

Non-graphical or text-only browsers were the first browsers developed. These
browsers simply display ASCII text on the computer screen. No pictures can be
included. The main advantage of this type of browser is that it is very fast.

Graphical browsers can display both text and pictures. These browsers are able to
display pictures, play sounds and even show video clips. Unfortunately, graphical
browsers tend to run much more slowly than text-only browsers.

Electronic Mail: It is currently the most popular use of the internet. It is the sending
and receiving of electronic messages, with files as attachments. E-mail is offered by
most commercial online services for a fee. Before you can send an e-mail, you must
know the recipient’s e-mail address. An address is composed of the user’s
identification, followed by the @ sign, followed by the location of the recipient’s
computer. The main benefit of e-mail is the instantaneous delivery of messages.
Secondly, identical messages can be sent to different people at the same time.
Thirdly, it saves cost and time.
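
The structure of an e-mail address described above can be illustrated with a short Python sketch. The function name and the sample address (student@example.edu) are assumptions made up for this illustration; the check is deliberately minimal and does not cover every rule for valid addresses.

```python
def split_email_address(address: str) -> tuple[str, str]:
    """Split an address into (user identification, location of the mail computer)."""
    # rpartition splits on the last "@" sign in the address.
    user, at_sign, host = address.rpartition("@")
    if not at_sign or not user or not host:
        raise ValueError(f"not a usable e-mail address: {address!r}")
    return user, host


# Hypothetical address, used only for illustration.
user, host = split_email_address("student@example.edu")
print("User identification:", user)
print("Location of recipient's computer:", host)
```

The part before the @ sign identifies the user, and the part after it identifies the computer (mail host) that receives the message.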

File Transfer Protocol (FTP): Files are transferred from one computer to another
using FTP. FTP enables people to share files such as music and videos. FTP
allows users to get (download) files from an FTP server file directory. This protocol
also allows users to put (upload) a file from their local machine to the remote FTP
server.
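
The get and put operations can be sketched with Python's standard ftplib module. This is only an illustrative sketch: the host name, login details and file names in example_session are hypothetical placeholders, and the session function is defined but not run here because it needs a real FTP server.

```python
from ftplib import FTP


def download_file(ftp, remote_name: str, local_path: str) -> None:
    """'Get': copy a file from the FTP server to the local machine."""
    with open(local_path, "wb") as local_file:
        # RETR asks the server to send the file; each chunk is written locally.
        ftp.retrbinary(f"RETR {remote_name}", local_file.write)


def upload_file(ftp, local_path: str, remote_name: str) -> None:
    """'Put': copy a file from the local machine to the FTP server."""
    with open(local_path, "rb") as local_file:
        # STOR asks the server to store the data read from the local file.
        ftp.storbinary(f"STOR {remote_name}", local_file)


def example_session() -> None:
    # Hypothetical server, credentials and file names; replace before running.
    with FTP("ftp.example.org") as ftp:
        ftp.login(user="anonymous", passwd="guest@example.org")
        download_file(ftp, "notes.txt", "notes.txt")
        upload_file(ftp, "report.txt", "report.txt")
```

Passing the connection object into the two helper functions keeps the get and put steps separate from the login step, which makes each step easier to read.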

Internet Relay Chat (IRC): It is a service that enables you to join a chosen channel
and talk in real time to people with the same interests as you.

Usenet (Unix User Network): It is a kind of electronic newspaper. It is a means of
communicating news over the internet. Individuals and organizations can write an
article and then post it on a news server. These articles can be read by anyone
with access to a newsreader, a piece of software that allows an individual to access a
newsgroup. It is a system of bulletin boards where anyone can post messages for
others to read and reply to. Newsgroup participants are expected to abide by the
rules of ‘netiquette’, the unofficial guide to communicating on the internet.

Gopher: It is a menu based program that enables you to browse for information
without having to know where the material is specifically located. Gopher is one of
the most comprehensive of all browser systems and it allows you to access other
programs including FTP and Telnet. Using Gopher, you can access library catalogs,
files and databases.

Hypertext Transfer Protocol (HTTP): This is the method by which World Wide Web pages
are transferred over the network. Hypertext Markup Language (HTML) is used for
writing pages for the World Wide Web. It allows text to include codes that define
fonts, layout, embedded graphics and hypertext links.
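
To make the idea of HTML codes concrete, the Python sketch below holds a small made-up web page and uses the standard html.parser module to pick out its hypertext links. The page content, the example.org addresses and the LinkCollector class name are all assumptions invented for this illustration.

```python
from html.parser import HTMLParser

# A made-up page: the tags are the "codes" that define fonts, layout,
# an embedded graphic and hypertext links.
PAGE = """<html>
  <head><title>Course Notes</title></head>
  <body>
    <p><font face="Arial">Welcome to <b>ICS 106</b>.</font></p>
    <img src="logo.png" alt="Course logo">
    <a href="https://www.example.org/week1.html">Week 1 notes</a>
    <a href="https://www.example.org/week2.html">Week 2 notes</a>
  </body>
</html>"""


class LinkCollector(HTMLParser):
    """Collects the target address of every hypertext link (<a href=...>)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


collector = LinkCollector()
collector.feed(PAGE)
print(collector.links)
```

A web browser reads the same codes, but instead of listing the links it displays the page and lets the reader follow them.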

Network News Transfer Protocol (NNTP): It is an internet standard protocol used for
the distribution, inquiry, retrieval and posting of news articles. It offers bulletin
boards, chat rooms and Netnews, which is a massive system of ongoing conferences
called newsgroups. To access these newsgroups, you download a special program from
the internet that allows you to participate in any newsgroup you wish. You subscribe
to those newsgroups that interest you and communicate through a message system
similar to e-mail. You may view an ongoing dialog without participating; this is called
lurking and is encouraged for newcomers. Discussion groups and chat rooms can be
excellent sources of information. They can also be sources of political debate and an
opportunity to meet people with shared interests.

USES OF INTERNET
• Social networking (Facebook, Twitter)
• Dissemination of information through webmail (Yahoo, Gmail and Hotmail)
• Teleconferencing (e.g. a meeting can be held interactively at the same time among
  people in different countries)
• E-learning
• International trade
• Online outsourcing
• Knowledge sharing (corporate organizations, institutions of learning, student
  exchange programmes)
• Entertainment (games, downloading movies, music etc.)
• E-commerce (online buying and selling, distribution and promotion of products,
  auctioning etc.)

CHALLENGES OF THE INTERNET


• Copyright problems.
• Loss of privacy (information can be hacked).
• Loss of sales for artists (movies and songs are available for free on the
  internet).
• Hazards of online chatting.
• Lack of regular supply of electricity.
• Computer viruses and worms.

INFORMATION OVERLOAD

We live in a world full of information being thrown at us, every moment of the day,
constantly demanding our attention. In our everyday lives, we are being constantly
hit with streams of incoming information.

Information overload occurs when we try to receive more information than can be
processed. The noise this effort creates in our minds and our lives can be
overwhelming. The negative effects of information overload are discussed below:

Productivity Loss – In the face of too much information, we can easily get lost in
the details. We waste time focusing on unimportant information and lose sight of our
goal and purpose. The extra data distracts us from our major tasks for the day.
How often have you turned on your computer to check e-mail, and ended up surfing
the net for hours?

Mind Cluttering – The noise created by media, and other sources of information,
clutters our mind and takes away from our inner peace.

Lack of Time – Rich or poor, young or old, we all have the same limited amount of
time in a day. A whole lot of time is expended on sifting, sorting and evaluating
information sources.

Lack of Personal Reflection – This occurs when we constantly consume
information and forget to connect with ourselves. Valuable personal reflection
comes when we create a ‘space’ for it in our lives. An example is the person who
constantly has the radio on. If there is always noise, we won’t have the mental
capacity to reflect within.

Stress & Anxiety – Information inflow creates the illusion that we have more tasks
to fill our lives, than we have time for. Often, we might suddenly feel nervous without
understanding why. Every piece of information carries with it energy, which demands
our time. Even if we consciously ignore it, part of us saw that data and recorded it
within our subconscious. So, we feel that we have lots and lots to do. This can create
stress. Too much of a good thing is never good, and this is especially true of
information. We can’t live without a certain amount of information, and much of it is
unavoidable anyway.

COMBATING INFORMATION OVERLOAD


• Reduce your information intake to the essentials.
• Plan: schedule time to work on each of the essential information tasks.
• Set time limits: setting time limits forces us to get down to the bare
  essentials.
• Try an information diet, i.e. decide to go without checking an information
  source for a set amount of time, e.g. low-tech days, e-mail fasts or
  phone-free periods.

EVALUATING ELECTRONIC SOURCES

Many of the traditional principles applied to judge the appropriateness or quality
of printed materials are equally applicable to electronic sources. Information
managers need to decide on the criteria for determining the relevance of internet
and electronic sources.

In recent times, with the advent of the World Wide Web, there has been a massive
influx of digital information and sources. There is a wide difference between what is
found on the web and what is found in traditional print sources. Therefore,
understanding the differences between the types of sources will assist in evaluating
them. For instance, most internet sources do not have a print equivalent, while
sources such as journal or newspaper articles can be found in both print and digital
formats.

REASONS FOR EVALUATING INTERNET SOURCES

Evaluation of internet sources is essential for the following reasons:

• Anyone can create and publish an internet site, regardless of subject expertise
  or knowledge.
• Web page creation software makes it easy for anyone to create a
  professional-looking web page without consideration for the quality and content
  of the material.
• No standards exist to ensure the quality and accuracy of information on a web
  page.
• Academic print resources have a review process; there is no such process for
  individual websites. Therefore, even websites produced by well-known
  entities need to be used with caution.

1.2 DIFFERENCES BETWEEN PRINT SOURCES AND INTERNET SOURCES

Publication process
Print sources: Extensive publication processes are adhered to, such as editing,
article review, fact-checking, and the use of multiple reviewers and editors, to
ensure the quality of the publication.
Internet sources: Most web documents do not have editors or reviewers.

Authorship and affiliation
Print sources: The following can normally be established:
• Who is the author?
• What makes her/him an authority on the subject?
• What experience or credentials are listed for the author?
• Is she/he affiliated with a reputable organization?
• If an educational background is given, are the educational institutions
  accredited?
• Is there contact information for the author (i.e., an address or phone number)?
• When was his or her work published?
Internet sources: These may be difficult to determine on the internet.

External sources and quotations
Print sources: External sources of information are clearly stated and identified,
e.g. bibliographies and relevant citations.
Internet sources: They may not be clearly stated.

Publication information
Print sources: Publication information such as date of publication, name of
publisher, authors and editors is always indicated in print sources.
Internet sources: The date of publication is questionable on the internet. The date
stated on the website could be the date posted or the date of last update.

The criteria below will assist you in evaluating web pages for use as academic
sources. These multiple categories should be employed prior to making a decision
regarding the academic quality of a source.

(a) Location of the website

How you located the site can give you a start on your evaluation of the site's
validity as an academic resource.

• Was it found via a search conducted through a search engine? Unlike library
  databases, the accuracy and/or quality of information located via a search
  engine will vary greatly.
• Was it recommended by a faculty member or another reliable source? This is
  generally an indicator of reliability.
• Was it cited in a scholarly or credible source? This is generally an indicator of
  reliability.
• Was it a link from a reputable site? This is generally an indicator of reliability.

(b) Website's domain

The domain of a particular website can be decoded from its Uniform Resource
Locator (URL), or internet address. The origin of the site can provide indications
of the site's mission or purpose. The most common domains are:

.org: An advocacy website, such as a not-for-profit organization.

.com: A business or commercial site.

.net: A site from a network organization or an internet service provider.

.edu: A site affiliated with a higher education institution.

.gov: A federal government site.

.il.us: A state government site; this may also include public schools and
community colleges.

.uk (United Kingdom): A site originating in another country (as indicated by the
two-letter code).

~: The tilde usually indicates a personal page.
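
As a rough illustration of decoding a domain, the Python sketch below uses the standard urllib.parse module to pull the host name out of a URL and match it against the endings listed above. The function name, the table of meanings and the example URLs are assumptions made for this illustration, and the result is only a first hint about a site, never a verdict on its quality.

```python
from urllib.parse import urlparse

# Endings taken from the list above; the longer ".il.us" ending comes first
# so that it is checked before any shorter ending.
SUFFIX_MEANINGS = [
    (".il.us", "state government site (may include public schools and community colleges)"),
    (".org", "advocacy or not-for-profit organization"),
    (".com", "business or commercial site"),
    (".net", "network organization or Internet service provider"),
    (".edu", "higher education institution"),
    (".gov", "federal government site"),
    (".uk", "site originating in the United Kingdom"),
]


def describe_url(url: str) -> list[str]:
    """Return hints about a site's likely origin, based only on its URL."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    notes = []
    for suffix, meaning in SUFFIX_MEANINGS:
        if host.endswith(suffix):
            notes.append(meaning)
            break
    if "~" in parts.path:
        notes.append("possibly a personal page (tilde in the path)")
    return notes


# Hypothetical URL, used only for illustration.
print(describe_url("https://www.example.edu/~jsmith/notes.html"))
```

A URL whose ending is not in the table simply returns an empty list, which is a reminder that the domain alone cannot settle whether a source is reliable.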

Authority

• Who is responsible for the site? Look for information on the author of the
  site, because on the Internet anyone can pose as an authority.
• Does the author have an affiliation with an organization or institution?
• Does the author list his or her credentials? Are they relevant to the information
  presented?
• Is there a feedback facility (mailing address, telephone number or e-mail
  address) for contacting the author?
• Is the material available in other forms, such as print or CDs?

Accuracy and Objectivity

There are no standards or controls on the accuracy of information available via the
Internet. The Internet can be used by anyone as a sounding board for their thoughts
and opinions.

• Does the author support his or her ideas by citing other sources?
• Determine the nature of the article, e.g. scholarly articles must include
  citations and bibliographies.
• Are the citations and bibliographies complete enough to find the original sources?
• Compare the page to related sources, electronic or print, for assistance in
  determining accuracy.
• Does the page exhibit a particular point of view or bias, e.g. on the
  implications of legalizing abortion?
• Is the site objective? Is there a reason the site is presenting a particular point
  of view on a topic?
• Does the page contain advertising? This may affect the content of the
  information included. Look carefully to see whether there is a relationship
  between the advertising and the content, or whether the advertising is simply
  providing financial support for the page.
• Do you need a PIN or credit card to proceed?
• Is the page free from grammatical and spelling errors?
• Are there links to other resources on the site?
• Do links take you to "Not found" pages, or to out-of-date or irrelevant pages?

Currency

This is both an indicator of the timeliness of the information and whether or not the
page is currently maintained. The following questions may be asked to determine
currency of the site.
• Is the information provided current?
• When was the page created?
• Are dates included for the last update or modification of the page?
• Are the links current and functional?

Functionality (Ease of Use)

The ease of use of a site and its ability to help you locate information you are looking
for are examples of the site's functionality.
• Is the site easy to navigate? Are options to return to the home page, tops of
pages, etc., provided?
• Is the site searchable?
• Is there a site map or table of contents?

ASSESSMENT OF WIKIPEDIA AS AN INTERNET SOURCE

Many teachers, professors, librarians and other education professionals view
Wikipedia as a bad source of information. Why?
• Authorship of Wikipedia pages is anonymous
o This means the qualifications of the author are unknown
o The author could be a college student or a researcher
• A Wikipedia entry can be edited by anyone at any time
o This means incorrect information could be substituted for correct
information.
o Information you found in an entry may not be there when you go back
the next time, or ever again
• Many Wikipedia entries do not include proof of their assertions
o This means you cannot verify the accuracy of the information

INFORMATION RETRIEVAL
Information retrieval performance evaluation
• "Recall" and "Precision" are two classic measures to measure the performance
of information retrieval in a single query.
• Both assume that there is an answer set of documents that contain the answer
to the query.
• Performance is optimal if
– the database returns all the documents in the answer set
– the database returns only documents in the answer set
• Recall is the fraction of the relevant documents that the query result has
captured.
• Precision is the fraction of the retrieved documents that is relevant.
Evaluation
• Why Evaluate?
• What to Evaluate?
• How to Evaluate?
Why Evaluate?
• Determine if the system is desirable
• Make comparative assessments
• Test and improve IR algorithms
What to Evaluate?
• How much of the information needed is satisfied.
• How much was learned about a topic.
• Incidental learning:
– How much was learned about the collection.
– How much was learned about other topics.

Relevance
• In what ways can a document be relevant to a query?
– Answer precise question precisely.
– Partially answer question.
– Suggest a source for more information.
– Give background information.
– Remind the user of other knowledge.
– Others ...
• How relevant is the document
– for this user for this information need.
• Subjective, but
• Measurable to some extent
– How often do people agree a document is relevant to a query
• How well does it answer the question?
– Complete answer? Partial?
– Background Information?
– Hints for further exploration?
What to Evaluate?
What can be measured that reflects users’ ability to use the system? (Cleverdon 66)
– Coverage of Information
– Form of Presentation
– Effort required/Ease of Use
– Time and Space Efficiency
– Effectiveness:
• Recall: proportion of relevant material actually retrieved
• Precision: proportion of retrieved material actually relevant

Precision and Recall


• In information retrieval contexts, precision and recall are defined in terms of a
set of retrieved documents (e.g. the list of documents produced by a web
search engine for a query) and a set of relevant documents (e.g. the list of all
documents on the internet that are relevant for a certain topic), cf. relevance.

Precision
• In the field of information retrieval, precision is the fraction of retrieved
documents that are relevant to the search:
• Precision takes all retrieved documents into account, but it can also be
evaluated at a given cut-off rank, considering only the topmost results returned
by the system. This measure is called precision at n or P@n.
• For example for a text search on a set of documents precision is the number of
correct results divided by the number of all returned results.

$$\text{Precision} = \frac{\text{No. of relevant records retrieved}}{\text{Total no. of records retrieved}}$$

• Note that the meaning and usage of "precision" in the field of Information
Retrieval differs from the definition of accuracy and precision within other
branches of science and technology.
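
The "precision at n" (P@n) measure mentioned above can be illustrated with a small Python sketch. The function name `precision_at_n` and the document identifiers are made up for this example; the sketch assumes the search results come as a ranked list and that we already know which documents are relevant.

```python
def precision_at_n(ranked_ids, relevant_ids, n):
    """Fraction of the top-n ranked results that are relevant (P@n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    top = ranked_ids[:n]
    hits = sum(1 for doc_id in top if doc_id in relevant_ids)
    # Divide by n (the cut-off rank), even if fewer than n results came back.
    return hits / n


# Hypothetical ranked result list and set of relevant documents.
ranked = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d1", "d3", "d4"}

for n in (1, 2, 5):
    print("P@%d = %.2f" % (n, precision_at_n(ranked, relevant, n)))
```

Here P@1 is 1.0 because the top result is relevant, while P@5 considers all five returned results, three of which are relevant.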

Recall
• Recall in information retrieval is the fraction of the documents that are relevant
to the query that are successfully retrieved.
• For example for text search on a set of documents recall is the number of
correct results divided by the number of results that should have been returned

$$\text{Recall} = \frac{\text{No. of relevant records retrieved}}{\text{Total no. of relevant records}}$$
• In binary classification, recall is called sensitivity. So it can be looked at as the
probability that a relevant document is retrieved by the query.
• It is trivial to achieve recall of 100% by returning all documents in response to
any query. Therefore, recall alone is not enough but one needs to measure the
number of non-relevant documents also, for example by computing the
precision.

• In pattern recognition and information retrieval, precision (also called positive
predictive value) is the fraction of retrieved instances that are relevant, while
recall (also known as sensitivity) is the fraction of relevant instances that are
retrieved. Both precision and recall are therefore based on an understanding
and measure of relevance.
• Suppose a program for recognizing dogs in scenes identifies 7 dogs in a scene
containing 9 dogs and some cats. If 4 of the identifications are correct, but 3
are actually cats, the program's precision is 4/7 while its recall is 4/9. When a
search engine returns 30 pages only 20 of which were relevant while failing to
return 40 additional relevant pages, its precision is 20/30 = 2/3 while its recall
is 20/60 = 1/3.
• Precision can be seen as a measure of exactness or quality, whereas recall is a
measure of completeness or quantity.
• In simple terms, high recall means that an algorithm returned most of the
relevant results, while high precision means that an algorithm returned
substantially more relevant results than irrelevant.
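
The search engine example above (30 pages returned, 20 of them relevant, 40 relevant pages missed) can be checked with a short Python sketch that treats the retrieved and relevant documents as sets. The function name `precision_recall` and the page labels are invented for illustration only.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) from two collections of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# 60 relevant pages exist in total: 20 are returned and 40 are missed.
relevant = {"rel%d" % i for i in range(60)}
# The engine returns 30 pages: 20 relevant ones plus 10 irrelevant ones.
retrieved = {"rel%d" % i for i in range(20)} | {"junk%d" % i for i in range(10)}

p, r = precision_recall(retrieved, relevant)
print("precision = %.3f, recall = %.3f" % (p, r))
```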

Why Precision and Recall?

Get as much good stuff while at the same time getting as little junk as
possible.

PLAGIARISM

What is Plagiarism?
Plagiarism is the act of taking another person’s words, ideas or statistics and passing
them off as your own. The complete or partial translation of a text written by
someone else also constitutes plagiarism if you do not acknowledge your source. It
can also be described as an act committed when a writer deliberately uses someone
else’s language, ideas or other original material without acknowledging its source.
Since we cannot always be original it is entirely acceptable to present another
person’s ideas in your work. However, it must be done properly to avoid plagiarism.

For example, consider the source below:

Over time technology has been instrumental in increasing industrial and
agricultural production, improving transportation and communications, advancing
human health care and overall improving many aspects of human life. However,
much of its success is based on the availability of land, water, energy, and biological
resources of the earth.
* David PIMENTAL (1998) “Population Growth and the Environment: Planetary
Stewardship”, Electronic Green Journal: Vol. 1: No. 9, Article 10.
http://repositories.cdlib.org/uclalib/egj/vol1/iss9/art10

What is Unacceptable

Example 1:
Research has shown that technology has been instrumental in increasing
industrial and agricultural production, improving transportation and
communications, advancing human health care and overall improving many
aspects of human life. However, much of its success is based on the
availability of land, water, energy, and biological resources of the earth.
Why it is unacceptable:
• Other than the first four words, the text has been copied word for word from the
original document without any quotation marks that would indicate that the
passage is a quote.
• The source you are using is not cited.

Example 2:
Research has shown that the advancement of technology has been
instrumental in increasing industrial and agricultural production, improving
transportation and communications, health care and overall many aspects
of human life. (Pimental, 1998)
Why it is unacceptable:
• Even though you mention your source, you use many of the author’s words
without quotation marks.

Example 3:
Research has shown that the advancement of science has been beneficial to
the areas of agricultural and industrial production and communication and
transportation fields. Furthermore, science has greatly improved health care
and is the prime factor in a higher standard of life for many people.
Why it is unacceptable:
• Though most of the words have been changed, the sentence structure has
remained the same.
• This is paraphrasing without indicating the original source.

What is Acceptable

Example 1:
In his article on the effects of population growth on the environment,
Pimental argues that “technology has been instrumental in increasing
industrial and agricultural production, improving transportation and
communications, advancing human health care and overall improving many
aspects of human life. However, much of its success is based on the
availability of land, water, energy, and biological resources of the earth”
(1998).
Why it is acceptable:
• The author has been acknowledged, and the quoting technique which has been
used is adequate since this is an Internet source.
• However, when you quote a printed source (book, journal, etc.), be sure to
include the page numbers.

Example 2:
According to Pimental, “technology has been instrumental in increasing
industrial and agricultural production, improving transportation and
communications, advancing human health care and overall improving many
aspects of human life” (1998). He cautions, however, that technological
progress is dependent on natural resources.
Why it is acceptable:
• You have properly quoted and paraphrased the author.

Example 3:
According to Pimental (1998), technology has greatly improved our
standard of living. He cautions, however, that technological progress is
dependent on natural resources.
Why it is acceptable:
• This is the proper way to paraphrase and the author’s ideas have been credited.

Consequences of Plagiarism
The consequences of plagiarism can be personal, professional, ethical, and legal.
With plagiarism detection software so readily available and in use, plagiarists are
being caught at an alarming rate. Once accused of plagiarism, a person will most
likely always be regarded with suspicion. Ignorance is not an excuse. Plagiarists
include academics, professionals, students, journalists, authors, and others.

Destroyed Student Reputation


Plagiarism allegations can cause a student to be suspended or expelled. Their
academic record can reflect the ethics offense, possibly causing the student to be
barred from entering college from high school or another college. Schools, colleges,
and universities take plagiarism very seriously. Most educational institutions have
academic integrity committees who police students. Many schools suspend students
for their first violation. Students are usually expelled for further offences.

Destroyed Professional Reputation


A professional business person, politician, or public figure may find that the damage
from plagiarism follows them for their entire career. Not only will they likely be fired
or asked to step down from their present position, but they will surely find it difficult
to obtain another respectable job. Depending on the offense and the plagiarist’s
public stature, his or her name may become ruined, making any kind of meaningful
career impossible.

Destroyed Academic Reputation


The consequences of plagiarism have been widely reported in the world of academia.
Once scarred with plagiarism allegations, an academic’s career can be ruined.
Publishing is an integral part of a prestigious academic career. To lose the ability to
publish most likely means the end of an academic position and a destroyed
reputation.

Legal Repercussions
The legal repercussions of plagiarism can be quite serious. Copyright laws are
absolute. One cannot use another person’s material without citation and reference.
An author has the right to sue a plagiarist. Some plagiarism may also be deemed a
criminal offense, possibly leading to a prison sentence. Those who write for a living,
such as journalists or authors, are particularly susceptible to plagiarism issues. Those
who write frequently must be ever-vigilant not to err. Writers are well-aware of
copyright laws and ways to avoid plagiarism. As a professional writer, to plagiarize is
a serious ethical and perhaps legal issue.

Plagiarized Research
Plagiarized research is an especially serious form of plagiarism. If the research is
medical in nature, the consequences of plagiarism could include the loss of people’s
lives.

Assessing Relevance of Search Results with Recall and Precision Ratios

Introduction

The Web has become an ocean of information and resources, which is growing rapidly larger
every microsecond. It has grown from an esoteric system used by a small community of
researchers to the now most used system for obtaining information for billions of digital citizens
or “digizens”. A growing proportion of such people are also digital natives (i.e. young people
born since the digital era began about three decades ago).

The Web is both a huge database of web pages you can search through, as well as a gateway you
can use to get to and search the information systems or databases of various organizations for
information. Such information systems or databases include those of online stores like Amazon
and Jumia, app stores like Google Play or Samsung Galaxy, as well as the online public access
catalogues (OPACs) of libraries. Outside of the Web, people often directly search the information
systems or databases of organizations in which they are staff, students, customers, or permitted
visitors. On the Web however, search engines such as Google or Yahoo often do the searching of
these other systems and databases for people, thereby saving them from having to search these
other systems themselves.

Many people have never encountered, and thus have no interest in, the issues and challenges of
retrieving information from such databases (Oppenheiem, et al., 2000). However, all information
found on the Web through search engines or directly from other information systems or
databases usually needs to be evaluated and filtered, as it may include plenty of non-relevant
information. The Web surfer may not be aware of the many available search engines that can be
used to get information on a topic, and may use different search strategies, some of which might
not be effective (Kumar and Prakash, 2009).

The basic challenges that a searcher faces when searching the Web, other information systems or
databases can be stated in the form of the following questions, which we often ask ourselves
when we search the Web or a database:

(a) Which search engine or database would get me quickly the best search results that
include only or mostly relevant information and also exclude all or most of the irrelevant
information?
(b) What search expression (or query), comprising important words, terms, names, URLs,
etc., best describes my information need and should be input to the search engine or
database?
(c) How can I determine quickly which items in the initial search results provided by the
search engine are most relevant to my needs?
(d) How many items or pages of the search results should I look at before determining if the
results are excellent, good, fair, poor or adequate for my needs?
(e) If the initial search results are not adequate, how do I revise or refine my initial query to
get better subsequent search results?
(f) When should I end the search, satisfied or frustrated?

We all ask and attempt to answer these questions in our minds when we search. Accordingly,
searching is usually a multistep and iterative process. An initial query may not do well, and
may need to be improved, which may entail using new words and terms identified from the
previous search result(s), in order to improve the relevancy of the items in the final search
result.

Accordingly, the search process includes the following steps:

• Define the search request (i.e. describe the information need) as precisely as possible
1. Choose an appropriate information resource (search engine, full text or bibliographic
database(s), library catalogues, document repository)
2. Identify and list relevant search terms derived from your search request that you would
use for searching.
3. Modify the search terms to suit the chosen information resource (i.e. use the vocabulary
dictionaries of each information resource to get equivalent or other terms used by the
resource)
4. Combine the modified or augmented search terms to create a search query
5. Run your initial search
6. Evaluate your initial search to determine how good the search results are, by examining
some of the retrieved items in the search result
7. Modify your search query based on the previous results and run new searches.
8. Copy, paste and save selected search results in a file or reference management system (a
reference management system is software which provides facilities for organizing
and storing the bibliographic details and content of sources you desire to use later).

You should bear in mind the following points. Firstly, step 6 above requires that you evaluate
each search result, usually while still working with the search engine, in order to determine how
good each search result is overall, and also to determine which items in each search result you
need to copy, paste and save in step 8. Secondly, the quality of the search results provided by a
search engine or information system depends critically on (a) what you yourself do in steps 1 to
6, and (b) how good the search engine is at matching terms in your search queries with words,
terms, and phrases in its database. Because search engines and their databases have usually been
researched and built to search as effectively and efficiently as possible, the quality of what
you get often depends on you!

Let us start from the beginning of the search process by considering the first step above, which
requires that you define or describe your information need with appropriate words simply,
precisely and adequately. For example, consider an information need implied by this question:
What are the effects of e-books on tertiary education students? Five different keywords or
concepts can be drawn out from the question, which are: effects, e-books, tertiary, education,
students. Then, in
step 2 you need to identify other terms that are synonyms of the initial concepts, as shown in
Table 1. In step 3, you need to find out the terms actually used by the chosen information system,
which might or might not be the same as the initial or synonym concepts. Usually, the best
search result is obtained when a searcher uses the same terms to search as the terms that
were used by a search engine or information system when it was indexing its resources.
Finally, steps 4 to 7 also depend on you – how you combine the concepts in the initial and
subsequent queries, and how you evaluate the initial and subsequent search results. In a nutshell,
in order to obtain the best search results, all the steps in the search process must be well
conducted.

Table 1: Example of search terms for five different concepts

                        Concept 1     Concept 2          Concept 3    Concept 4    Concept 5
Concept from question   effects       e-books            tertiary     education    students
Synonyms of concept     influence     electronic books   collegiate   training
                        impact        digital books      higher
                        consequence   online books
Concept used by         impact        online books       tertiary     education    students
information resource
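
As an illustration of step 4 (combining search terms into a query), the short Python sketch below joins the synonyms of each concept in Table 1 with OR and then joins the concepts with AND. The function name `build_query` is made up for this example, and the AND/OR/quotation-mark syntax is an assumption: real search engines and databases differ, so check the help pages of the resource you are using.

```python
def build_query(concepts):
    """Combine lists of synonyms into one Boolean search expression."""
    groups = []
    for terms in concepts:
        # Put multi-word terms in quotation marks so they are searched as phrases.
        quoted = ['"' + t + '"' if " " in t else t for t in terms]
        group = " OR ".join(quoted)
        # Brackets are only needed when a concept has more than one term.
        groups.append("(" + group + ")" if len(quoted) > 1 else group)
    return " AND ".join(groups)


# Concepts and synonyms taken from Table 1.
concepts = [
    ["effects", "influence", "impact", "consequence"],
    ["e-books", "electronic books", "digital books", "online books"],
    ["tertiary", "collegiate", "higher"],
    ["education", "training"],
    ["students"],
]

query = build_query(concepts)
print(query)
```

Using OR inside a concept tends to raise recall (more synonyms are matched), while using AND between concepts tends to raise precision (every concept must be present).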

Although what you get usually depends on you, various yardsticks or metrics have been
researched and recommended for evaluating the search results performance of search
engines and other information systems. The rest of this chapter examines and explains the most
common of these metrics.

Recall and Precision ratios

The earliest suggested and most commonly mentioned yardsticks are known as recall and
precision ratios. Recall is the ratio of the number of retrieved relevant records to the total
number of relevant records in the database. Precision is the ratio of the number of retrieved
relevant records to the total number of records retrieved, both irrelevant and relevant. Each is
usually expressed as a percentage (%). A simple yet good illustration of the ideas of recall and
precision is the following possible real-life scenario.
Imagine that, your girlfriend gave you a birthday surprise every year in the last 10
years. However, one day, your girlfriend asks you “Sweetie, do you remember all
birthday surprises from me?” This simple question is likely to be tough to answer
because you need to recall all 10 surprising events from your memory.

Let us suppose your girlfriend has a particular set of 10 surprises in her mind which is what she
wants to be told or her information need.

Recall ratio is the number of events (surprises) you can correctly recall divided by the number
of all the correct events (that she wants you to recall), that is, the total number of events she
has in mind. It measures how effectively you are able to recall correct events out of the total
correct events (i.e. the particular set of surprises she has in mind).

So, (1) if you can recall all 10 events correctly, then, your recall ratio is 10 / 10 = 1.0 (or 100%),
while (2) if you can recall only 7 events correctly, your recall ratio is 7 / 10 = 0.7 (70%).

Precision ratio is the number of events you can correctly recall divided by the number of all
events you are able to recall (usually comprising a mix of correct and wrong answers). In other
words, the precision ratio measures how precise and efficient your recall efforts are.

Suppose that in example (1) above you made exactly 10 attempts in getting the 10 correct events.
Then your precision ratio is 10 correct recalls divided by 10 recall attempts, which is also 1.0
(100%). However, in example (2), you also made 10 recall attempts, but got only 7 correct
recalls. So, the preciseness or efficiency of recalling the events is the 7 correct recalls divided by
10 recall attempts, which is 0.7 (70%).

Suppose you can actually recall many surprise events from the last ten years, some of which are
among those she has in mind while others are not. Suppose (3) you eventually told her 16 events
in 16 recall attempts, out of which only 8 events are among the particular 10 events she has in
mind. In that case, your recall ratio is the 8 correct events out of the 10 she has in mind,
which is 8 / 10 = 0.8 (80%). Compared with scenario (2) above, your recall ratio improved by 10
percentage points, which means that your effectiveness in recalling correct events improved, but
at the cost of 6 (60%) more attempts.

Would you say you are becoming more precise or efficient in example (3)? Actually your
precision ratio in example (3) is only 8 correct events out of 16 recalled events, which is only
8 / 16 = 0.5 (50%). Your recall ratio improved by 10 percentage points, but your precision ratio
decreased by 20 percentage points. So you have become more effective at recalling correctly, but
less efficient in doing so.
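
The three birthday-surprise examples can be recomputed with a few lines of Python. This is only a worked check of the arithmetic; the variable names are made up for the illustration.

```python
# (correct recalls, total recall attempts) for examples (1), (2) and (3)
scenarios = {
    "(1)": (10, 10),
    "(2)": (7, 10),
    "(3)": (8, 16),
}
EVENTS_IN_MIND = 10  # the set of surprises she has in mind

results = {}
for label, (correct, attempts) in scenarios.items():
    recall = correct / EVENTS_IN_MIND  # share of her events that you recalled
    precision = correct / attempts     # share of your attempts that were correct
    results[label] = (recall, precision)
    print("Example %s: recall = %.0f%%, precision = %.0f%%"
          % (label, recall * 100, precision * 100))
```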

Recall and Precision ratios are inversely related


The above examples illustrate a natural inverse relationship between the recall and precision
ratios of the search results provided by search engines in response to user queries. This can be
shown both mathematically and graphically.

Informally,

$$\text{Recall} \propto \frac{1}{\text{Precision}}$$

This expresses a tendency rather than an exact equation, as you can confirm using the recall and
precision ratios calculated in examples (1) to (3) above: between examples (2) and (3), recall
rose from 0.7 to 0.8 while precision fell from 0.7 to 0.5. Recall ratios range from 0 to 1, as
do precision ratios. The inverse or tradeoff relationship that exists naturally between them, for
the search results that an information system provides in response to different user queries, is
illustrated in Figure 2.

In the figure, the two distinct lines may represent the recall - precision graphs of two search
engines or information systems. While the exact slope of the curve may vary between systems,
the general inverse relationship between recall and precision remains for every information
system.

A system can increase its recall by returning more documents, because the
recall ratio is a non-decreasing function of the number of documents retrieved (a non-decreasing
function always rises or stays at the same level). A system that returns all documents in its
database for a query will surely have 100% recall of all the relevant items in its database!
But the precision ratio of such response to the query will be very low, due to the likely higher
number of non-relevant items returned along with the relevant items. The converse is also true,
as it is possible for a system to aim for high precision, but at the cost of very low recall of
relevant items from its database.
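
The effect of returning more and more documents can be traced with a small Python sketch. It assumes a made-up ranked list of ten results, marked True when relevant, and a database containing four relevant documents in total; the function name `precision_recall_curve` is also just an illustrative choice.

```python
def precision_recall_curve(ranked_flags, total_relevant):
    """Return (k, recall, precision) after each of the first k results.

    ranked_flags[i] is True when the i-th returned document is relevant.
    """
    points = []
    hits = 0
    for k, is_relevant in enumerate(ranked_flags, start=1):
        hits += 1 if is_relevant else 0
        points.append((k, hits / total_relevant, hits / k))
    return points


# Hypothetical ranked result list: True = relevant, False = not relevant.
flags = [True, True, False, True, False, False, True, False, False, False]
curve = precision_recall_curve(flags, total_relevant=4)

for k, recall, precision in curve:
    print("top %2d: recall = %.2f, precision = %.2f" % (k, recall, precision))
```

In this made-up list, recall never falls as more results are returned and reaches 100% once all four relevant documents are in, while precision drifts downward overall as more non-relevant items come along with them.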

This naturally occurring inverse relationship between precision and recall ratios forces
information systems designed for general use to go for compromise between them. But, in real
life, some information search tasks particularly need good precision, whereas others need good
recall.

(b) A Combined measure: F score

A combined measure that measures simultaneously the recall (R) and Precision (P) performance
of the search results from an information system is the F score (weighted harmonic mean). The
F-score is a measure which combines both recall and precision measures using a weighting
factor α, where high α means that precision is more important.

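$$F = \frac{1}{\alpha\frac{1}{P} + (1-\alpha)\frac{1}{R}} = \frac{(\beta^2+1)PR}{\beta^2 P + R}, \quad \text{where } \beta^2 = \frac{1-\alpha}{\alpha} \qquad \ldots (1)$$

The harmonic mean is a very conservative average. People usually use the balanced F1 measure,

i.e. with $\beta = 1$ (that is, $\alpha = \frac{1}{2}$) … (2)

Applying (2) in (1), we have:

$$F = \frac{2PR}{P+R} \qquad \ldots (3)$$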
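
Equations (1) and (3) can be tried out with the following Python sketch. The function names `f_score` and `f_score_alpha` and the sample values P = 0.5, R = 0.8 are assumptions chosen only for illustration.

```python
def f_score(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall, as in equation (1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (b2 + 1) * precision * recall / (b2 * precision + recall)


def f_score_alpha(precision, recall, alpha=0.5):
    """The same measure written with the weighting factor alpha.

    Assumes precision and recall are both greater than zero.
    """
    return 1.0 / (alpha / precision + (1 - alpha) / recall)


P, R = 0.5, 0.8  # illustrative values only
print("balanced F1          :", round(f_score(P, R), 3))
print("beta = 0.5 (favours P):", round(f_score(P, R, beta=0.5), 3))
print("beta = 2   (favours R):", round(f_score(P, R, beta=2.0), 3))
```

With beta below 1 (high alpha) the score moves toward the precision value, and with beta above 1 it moves toward the recall value.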

Concept of Relevance

Relevance is assessed relative to an information need, not a query. An information need arises
from a user's information-seeking behavior. For example, an information need might be
whether studying core IT-related courses is more versatile at ameliorating the book haram
syndrome than studying any pure science course. This might be translated into a query such as:
IT AND courses AND science AND book haram AND syndrome AND versatile. A document is
considered relevant if it addresses the stated information need, not because it contains the
exact words in the query. This distinction is often misunderstood in practice, because the
information need is not overt, even though an information need is always present.

As an illustration, if a user types “Job” into a web search engine, he might intend to
search for available employment or for the story of the Prophet Job in the Abrahamic religions.
From a one-word query, it is very difficult for a system to know what the precise information
need is. But the user has one, and only the user can judge the returned results on the basis of
their relevance to that information need. Therefore, to evaluate a system, an overt expression
of an information need is required, which can be used for assessing returned documents as
being relevant or non-relevant. At this point, a simplification is made: relevance could be
thought of as a scale, with some documents highly relevant and others less so, but here each
document is treated as simply relevant or non-relevant.

RELEVANCE ANALYSIS AND BINARY DECISION

Mathematically,

$$\text{Precision} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|}$$

$$\text{Recall} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{relevant documents}\}|}$$
N.B:

• Relevant documents are the Positives (horizontally yellowed)
• Retrieved documents are classified as Positives (vertically pinked)
• Relevant and Retrieved are the True Positives (the intersection)

Accuracy

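Accuracy is the fraction of decisions, be they relevant or non-relevant, that are correct. In
terms of the contingency table above,

$$\text{Accuracy} = \frac{TP+TN}{TP+TN+FP+FN}$$

where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and
false negatives respectively.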

Accuracy is regarded in the literature as not being a useful measure for web information
retrieval. This is because only a small fraction of the documents in an IR system's collection
are relevant (i.e. TN >> TP). Even for a good IR system that retrieves only relevant
documents, the difference in accuracy between it and a poor system (such as one that always
returns nothing) is small, so this measure cannot help us evaluate IR systems (Schellekens, 2012).

Ranking of retrieved documents


Retrieved documents are ranked on the basis of their estimated relevance to a query. Many
factors are considered in this ranking, including:

• Term Frequency (TF), i.e. the frequency of occurrence of a query keyword in a particular
document
• Inverse Document Frequency (IDF), which is based on the number of documents in which
the query keyword occurs: the fewer the documents, the more importance is given to the
keyword, and vice versa.
• Hyperlinks to documents, i.e. the more links there are to a document, the greater its
importance

Relevance ranking based on Term Frequency and Inverse Document Frequency (TF/IDF)

Term Frequency (TF) indicates how relevant a document is to a query term. However, relying
on term frequencies alone makes “spamming” easy, so ranking usually also takes the following
into account:

• Position: greater importance is accorded to words occurring in the title, author list,
and section headings
• Proximity: higher importance is attached to query keywords occurring close together in
the document
• Stop words such as “a”, “an”, “the”, “it” are eliminated
• Orderliness: documents are returned in decreasing order of relevance score; usually only
the top few documents are returned, not all.

TFIDF (Term frequency/Inverse Document frequency) ranking:

Let n(d) = number of terms in the document d,
n(d, t) = number of occurrences of term t in the document d, and
n(t) = number of documents that contain the term t.

• Relevance of a document d to a term t:

$$TF(d, t) = \log\left(1 + \frac{n(d, t)}{n(d)}\right) \qquad (4)$$

The log factor in (4) prevents excessive weight being given to frequent terms.

• Relevance of document d to query Q:

$$r(d, Q) = \sum_{t \in Q} \frac{TF(d, t)}{n(t)} \qquad (5)$$
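
Equations (4) and (5) can be tried on a toy collection with the Python sketch below. The three tiny "documents", the query and the function names are invented for illustration; the sketch also assumes that n(t) is the number of documents containing term t, that the natural logarithm is used, and that documents are split into words on whitespace.

```python
import math


def tf(doc_tokens, term):
    """Equation (4): log(1 + n(d, t) / n(d))."""
    n_d = len(doc_tokens)
    if n_d == 0:
        return 0.0
    return math.log(1 + doc_tokens.count(term) / n_d)


def relevance(doc_tokens, query_terms, collection):
    """Equation (5): sum over query terms of TF(d, t) / n(t)."""
    score = 0.0
    for term in query_terms:
        # n(t): number of documents in the collection that contain the term
        n_t = sum(1 for d in collection if term in d)
        if n_t:
            score += tf(doc_tokens, term) / n_t
    return score


# Toy collection (made-up text), already split into words.
docs = {
    "d1": "e-books help students read more e-books".split(),
    "d2": "students prefer printed books".split(),
    "d3": "library opening hours".split(),
}
query = ["e-books", "students"]

collection = list(docs.values())
scores = {name: relevance(tokens, query, collection) for name, tokens in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Document d1 ranks first because it contains both query terms, and the rarer term "e-books" (found in only one document) counts for more than "students" (found in two).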

Relevance ranking based on Hyperlinks


• Social-network theories of prestige rank people of repute, such as Donald Trump,
Muhammadu Buhari and Robert Mugabe, who have high prominence due to their level of fame.

• Hub vs authority based ranking
– A hub is a page that holds links to various pages on a particular topic
– An authority is a page that holds actual information on a particular topic

In essence, each page gets a hub prestige based on the prestige of the authorities it points to,
while each page gets an authority prestige based on the prestige of the hubs that point to it.
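
The mutual dependence between hub and authority prestige can be sketched as a simple repeated calculation in Python (this style of calculation is commonly known as the HITS algorithm). The four-page link structure, the number of iterations and the way scores are normalised are all assumptions made for this illustration, not a description of any particular search engine.

```python
def hub_authority_scores(links, iterations=20):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority prestige: sum of the hub scores of pages linking to the page.
        auth = {p: sum(hub[q] for q in pages if p in links.get(q, []))
                for p in pages}
        # Hub prestige: sum of the authority scores of the pages it links to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalise so that the scores do not grow without limit.
        a_norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth


# Made-up web: "portal" and "blog" link out; "guide" and "faq" hold content.
links = {
    "portal": ["guide", "faq"],
    "blog": ["guide"],
    "guide": [],
    "faq": [],
}
hub, auth = hub_authority_scores(links)
print("best hub      :", max(hub, key=hub.get))
print("best authority:", max(auth, key=auth.get))
```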

Calculations

Two information retrieval systems, A and B, are to be compared. Both are given the
same query, which is applied to a collection of 1000 documents. The results show that
System A returns 420 documents, of which only 50 are relevant to the query, while System B
returns 90 documents, of which only 25 are relevant to the query. 80 documents in the whole
collection are relevant to the query.

Tabulate the results for each system, and compute the following:

• Recall;
• Precision;
• Accuracy; and
• F score for both A and B.

Solution

System A        Relevant    Non-relevant    Total
Returned        50 (TP)     370 (FP)        420
Not returned    30 (FN)     550 (TN)        580
Total           80          920             1000

System B        Relevant    Non-relevant    Total
Returned        25 (TP)     65 (FP)         90
Not returned    55 (FN)     855 (TN)        910
Total           80          920             1000

• System A’s recall and precision:

$$\text{Recall}_A = \frac{TP}{TP + FN} = \frac{50}{80} = 0.625 = 62.5\%$$

$$\text{Precision}_A = \frac{TP}{TP + FP} = \frac{50}{420} \approx 0.119 = 11.9\%$$

• System B’s recall and precision:

$$\text{Recall}_B = \frac{TP}{TP + FN} = \frac{25}{80} \approx 0.313 = 31.3\%$$

$$\text{Precision}_B = \frac{TP}{TP + FP} = \frac{25}{90} \approx 0.278 = 27.8\%$$

• System A’s accuracy:

$$\text{Accuracy}_A = \frac{TP + TN}{TP + TN + FP + FN} = \frac{50 + 550}{50 + 550 + 370 + 30} = \frac{600}{1000} = 0.6 = 60\%$$

• System B’s accuracy:

$$\text{Accuracy}_B = \frac{TP + TN}{TP + TN + FP + FN} = \frac{25 + 855}{25 + 855 + 65 + 55} = \frac{880}{1000} = 0.88 = 88\%$$

• System A’s F score:

$$F_A = \frac{2PR}{P + R} = \frac{2 \times 0.119 \times 0.625}{0.119 + 0.625} = \frac{0.149}{0.744} \approx 0.2$$

• System B’s F score:

$$F_B = \frac{2PR}{P + R} = \frac{2 \times 0.278 \times 0.313}{0.278 + 0.313} = \frac{0.174}{0.591} \approx 0.294$$
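The worked solution above can be checked with a short script. This is a minimal Python sketch: the helper names `counts` and `evaluate` are hypothetical, and the code assumes every denominator is non-zero, which holds for both systems in this example.

```python
def counts(returned, relevant_returned, relevant_total, collection_size):
    """Derive TP, FP, FN and TN from the figures given in the question."""
    tp = relevant_returned
    fp = returned - tp
    fn = relevant_total - tp
    tn = collection_size - tp - fp - fn
    return tp, fp, fn, tn


def evaluate(tp, fp, fn, tn):
    """Compute precision, recall, accuracy and F (assumes non-zero denominators)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f": f_score,
    }


# Figures from the question: 1000 documents in total, 80 of them relevant.
system_a = evaluate(*counts(420, 50, 80, 1000))
system_b = evaluate(*counts(90, 25, 80, 1000))

for name, metrics in (("A", system_a), ("B", system_b)):
    print(name, {k: round(v, 4) for k, v in metrics.items()})
```

The printed values should agree with the hand calculation up to rounding; small differences in the last digit arise because the hand calculation rounds precision and recall before computing F.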
Do-It-Yourself Exercise

Assume a database contains 800 records on a particular topic. A search was conducted on that
topic and 620 records were retrieved; of the 620 records retrieved, 405 were relevant. Calculate
the precision, recall, accuracy, and F score for the search.

Conclusion

Precision and recall are the two fundamental IR evaluation measures. Many other metrics build
on them because they are easy for all information users to understand. From the practitioner’s
point of view, these two measures are essential because they map onto intuitive costs: the time
people spend reading worthless documents (low precision) and the number of relevant documents
that are missed (low recall). Note also that, for a fixed number of relevant documents
retrieved, recall is inversely proportional to the number of relevant documents per topic.

Both precision and recall must be considered when evaluating retrieval systems. It is not
sufficient to pick one at the expense of the other, because depending on just one of the two
can lead to extreme but unhelpful solutions. For example, a system that returns every document
indiscriminately has 100% recall, while one that returns only a single correct document has
100% precision. As information retrieval systems, the former is not useful at all, and the
latter is not much better. For this reason we use precision, recall, and F for evaluation, but
not accuracy: accuracy is dominated by the large number of non-relevant documents that are
correctly not returned, which is why System B scores 88% accuracy in the worked example even
though it finds only 31.3% of the relevant documents.
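The two extreme systems described above can be made concrete with numbers. As an illustration only, assume the same collection as in the Calculations section (1000 documents, 80 of them relevant); the figures below follow from that assumption and from the definitions of precision (P), recall (R), F and accuracy used earlier.

```latex
\begin{align*}
\text{Return every document:}\quad
  & TP = 80,\ FP = 920,\ FN = 0,\ TN = 0 \\
  & P = \frac{80}{1000} = 0.08,\quad R = \frac{80}{80} = 1,\quad
    F = \frac{2 \times 0.08 \times 1}{0.08 + 1} \approx 0.148 \\[6pt]
\text{Return one relevant document:}\quad
  & TP = 1,\ FP = 0,\ FN = 79,\ TN = 920 \\
  & P = \frac{1}{1} = 1,\quad R = \frac{1}{80} = 0.0125,\quad
    F = \frac{2 \times 1 \times 0.0125}{1 + 0.0125} \approx 0.025 \\
  & \text{Accuracy} = \frac{1 + 920}{1000} = 0.921
\end{align*}
```

In both cases one measure is perfect while F stays low, and the second system reaches 92.1% accuracy despite missing 79 of the 80 relevant documents.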

References

1. Examples: Jurafsky, D. & Manning, C. (Producers) (2012). Evaluation of text
classification - Precision, Recall, and the F measure [MP4]. Available from
https://www.youtube.com/watch?v=2akd6uwtowc. Stanford NLP.
2. Examples: Lavrenko, V. (Producer) (2014). Evaluation: building blocks – Precision and
Recall [MP4]. Available from https://www.youtube.com/watch?v=mctizdBujk4
3. http://en.wikipedia.org/wiki/Confusion_matrix
4. http://en.wikipedia.org/wiki/Precision_and_recall
5. https://www.creighton.edu/fileadmin/user/HSL/docs/ref/Searching_-_Recall_Precision.pdf
6. https://www.quora.com/What-is-the-best-way-to-understand-the-terms-precision-and-recall
7. Kent, A. (1971). Information Analysis and Retrieval, 3rd edn, New York, Becker and
Hayes.
8. Lipani, A. (2016). Fairness in Information Retrieval. In: Proceedings of SIGIR’16 Annual
Conference, July 17 – 21, 2016.
9. Oppenheim, C., Morris, A., McKnight, C., & Lowley, S. (2000). The evaluation of WWW
search engines. Journal of Documentation, 56 (2), 190-211.
10. Sampath Kumar, B.T. & Prakash, J.N. (2009). Precision and Relative Recall of Search
Engines: A Comparative Study of Google and Yahoo. Singapore Journal of Library &
Information Management, Volume 38.
11. Schellekens, M. (2012). Information Retrieval Evaluation. TA: Ang Gao, University
College Cork.
12. Silberschatz, Korth and Sudarshan (2005). Database System Concepts – 5th edition.
13. Van Rijsbergen, C.J. (1979). Information Retrieval, Butterworths, London, 2nd edition.

The Role of ICT in our day to day life

Learning Objective

At the end of this lesson, students should be able to:
• Describe the usage of ICT in different sectors.
• Identify the groups that benefit from the usage of ICT in different sectors.

ICT plays a role in four main sectors: education, banking, industry, and commerce (business).

1. EDUCATION

• To find useful information, e.g. the Internet
• E-learning
• To manage books, e.g. a library automation system

• E-Learning
  - Students and lecturers can communicate with each other to resolve problems or hold
    discussions, no matter how far apart they are.
• Internet
  - The Internet lets us find more information for our studies.

2. BANKING

• Online banking, e.g. Maybank2u.com
• To withdraw or check money, e.g. an ATM

• E-Banking
  - Online services such as transferring money and paying bills online (e.g. Maybank2u.com).
• ATM
  - To withdraw and to transfer money.

3. INDUSTRY

• Automobile manufacturing industry using robotics, e.g. in a factory
• Aerospace research using supercomputers, e.g. NASA
• Quality assurance using various kinds of computerised testing equipment

4. COMMERCE

• E-commerce: buying and selling something over the Internet, e.g. online payment
• Advertising, e.g. billboards, magazines
USAGES OF ICT IN EVERYDAY LIFE

Sector     Usage                                   Examples
Education  Finding useful information              Internet
           Managing books in the library           Library automation system
Banking    Withdrawing money                       ATM
           Online banking                          Checking anytime and anywhere
Industry   Automobile manufacturing, e.g. cars     Robotics and artificial intelligence
           Aerospace research                      High-tech machines: supercomputers
Commerce   Buying and selling over the Internet    Online payment
           Advertising                             Billboards, electronic media
           Stock market                            Buying and selling of shares/bonds
