Web Security
Web-Page Threats
• SPAM
• Phishing
• Eavesdropping
• SQL Injection
• Hacking
• DoS
• Cross-Site Scripting
• Cookie Poisoning
1. SPAM
• Irrelevant or unsolicited messages sent over
the Internet, typically to a large number of
users, for the purposes of advertising,
phishing, spreading malware, etc.
2. Phishing
• Phishing is the attempt to obtain sensitive information such as usernames,
passwords, and credit card details (and sometimes, indirectly, money),
often for malicious reasons, by masquerading as a trustworthy entity in an
electronic communication.
• Communications purporting to be from popular social web sites, auction
sites, banks, online payment processors or IT administrators are commonly
used to lure unsuspecting victims.
• Phishing emails may contain links to websites that are infected with
malware.
• It often directs users to enter details at a fake website whose look and feel
are almost identical to the legitimate one.
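Because a fake site can copy the look and feel of the real one, the address is often the only reliable difference. The sketch below illustrates one possible check: compare the host name in a link against an exact list of trusted hosts. The host names and the `is_trusted_link` helper are hypothetical, chosen only for illustration, and this is not a complete phishing defence.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts the user actually trusts.
TRUSTED_HOSTS = {"example-bank.com", "www.example-bank.com"}


def is_trusted_link(url: str) -> bool:
    """Return True only if the link's host exactly matches a trusted host."""
    # urlsplit gives the real host, ignoring any "user@" part or path
    # that an attacker may use to make a link look legitimate.
    host = urlsplit(url).hostname or ""
    return host.lower() in TRUSTED_HOSTS


print(is_trusted_link("https://www.example-bank.com/login"))
print(is_trusted_link("https://example-bank.com.evil.test/login"))
```

An exact match is used on purpose: a substring or prefix test would accept look-alike hosts such as `example-bank.com.evil.test`.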
3. Eavesdropping
Server-Side Scripting
• All websites need to be hosted (i.e. stored) on a web server, often with their data kept in a database.
• Server-side scripting simply refers to any code that facilitates the transfer of data from
that web server to a browser.
• It also refers to any code used to build a database or manage data on the web server
itself.
• Server-side scripts run on the web server, which has the power and resources to run
programs that are too resource intensive to be run by a web browser.
• Server-side scripts can also be more secure, because the source code remains on the web
server rather than being sent to and temporarily stored on an individual’s computer.
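As a minimal sketch of these points, the following uses Python's standard WSGI interface. The `app` function, the `GREETINGS` table (standing in for a database), and the `lang` and `name` query parameters are all assumptions made for illustration. The script runs on the server and sends only the finished HTML to the browser.

```python
from html import escape
from urllib.parse import parse_qs

# Data that lives only on the server (a stand-in for a real database).
GREETINGS = {"en": "Hello", "fr": "Bonjour"}


def app(environ, start_response):
    """WSGI application: build an HTML page on the server for each request."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    lang = query.get("lang", ["en"])[0]
    name = query.get("name", ["visitor"])[0]
    greeting = GREETINGS.get(lang, GREETINGS["en"])

    # Escape user input before placing it in the page.
    body = f"<html><body><h1>{greeting}, {escape(name)}!</h1></body></html>"
    payload = body.encode("utf-8")

    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("localhost", 8000, app).serve_forever()` can serve this function. The browser receives the generated page but never the Python source or the `GREETINGS` table.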
Top Web Development Languages to date
1. HTML5
• HTML is a markup language, not a programming language.
• HTML is the standardized markup language that structures and formats content on the web.
• It is one of the core technologies in use on the Internet and serves as the backbone of all web pages.
2. JavaScript
• A survey conducted by Stack Overflow in 2015 shows that JavaScript is the most used programming language, slotting
ahead of Java and PHP.
• JavaScript is the programming language that brings animation, games, apps, interactivity and other dynamic effects to web pages.
Alongside HTML and CSS, it is among the most ubiquitous client-side technologies.
• Some JavaScript applications can even run without connecting back to a web server, which means they’ll work in a browser with
or without an Internet connection.
3. PHP
• More than 75% of the top websites use PHP as their server-side programming language.
• A general-purpose server-side scripting language. The chief advantage of PHP is that it is open source.
• PHP is most often used by websites with lower traffic demands.
4. Java
• Java has numerous modules that aid web development, and since it is not platform dependent, developing and deploying Java
applications becomes all the easier.
• PHP is the most used server-side programming language overall, but when it comes to websites that attract high traffic, Java and
JavaScript are clearly ahead.
5. Python
• Often considered among the easiest languages to write and learn.
• Python is also used extensively as a scripting language, which has proven its worth in web development.
6. .NET
• Microsoft introduced the .NET Framework in 2000, and even though it is used primarily for systems running
on Windows, that restriction is offset by .NET’s use in scientific, research-level, and academic
fields.
• Windows app development has added another strength to .NET and has given it firm standing
among web development technologies.
7. Ruby
• The Ruby programming language has been gaining admirers rapidly in the past few years because of
its ease of use and its usefulness for building creative software and designs.
• Ruby resembles Python in its simplicity and Perl in its programmer-friendly interface, so it is fair
to say that Ruby is, in many ways, a blend of the qualities of Python and Perl.
8. CSS
• CSS (Cascading Style Sheets) is a style-sheet language that basically allows web developers to “set it and forget
it.”
• Paired with HTML, CSS allows a programmer to define the look and format of multiple webpages at once;
elements like color, layout and fonts are specified in one file that’s kept separate from the core code of the
webpage.
Programming languages used in most popular websites
Source: Wikipedia
Search Engines
• A software system that is designed to search for
information on the World Wide Web.
• Search engines use software called "spiders" or
"crawlers" to routinely scour the web to identify and index
web pages.
• Search engine algorithms take the key elements of a web
page, including the page title, content and keyword
density, and come up with a ranking for where to place the
results on the pages.
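Real ranking algorithms are far more elaborate and are not described in these notes. As a toy sketch only, the code below scores a page from two of the elements named above: matches in the page title and keyword density in the content. The `score` and `rank` functions, the `title_weight` value, and the sample pages are all hypothetical.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(page, query, title_weight=3.0):
    """Toy score: weighted title matches plus keyword density of the content."""
    terms = set(tokenize(query))
    title_hits = sum(1 for t in tokenize(page["title"]) if t in terms)
    body = tokenize(page["content"])
    body_hits = sum(1 for t in body if t in terms)
    # Keyword density: share of content words that match the query.
    density = body_hits / len(body) if body else 0.0
    return title_weight * title_hits + density


def rank(pages, query):
    """Return the pages ordered from highest to lowest score."""
    return sorted(pages, key=lambda p: score(p, query), reverse=True)


# Hypothetical pages, for illustration only.
pages = [
    {"url": "site.test/cars", "title": "Car repair", "content": "engines need oil"},
    {"url": "site.test/dogs", "title": "Dog care", "content": "dogs and cats can be friends"},
    {"url": "site.test/cats", "title": "Caring for cats", "content": "cats need food and water"},
]

for page in rank(pages, "cats"):
    print(page["url"])
```

The weight on title matches is an arbitrary choice here. It simply expresses the idea that a word in the title says more about a page than the same word in the body.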
• The software used by each search engine works a bit differently.
• The same search conducted with different search engines will yield different
results.
• You may want to try your search in more than one search engine and
compare results.
• Web search engines get their information by crawling the web from site to site.
• The "spider" checks for the standard file robots.txt, which is addressed to it,
and then sends information back to be indexed. What it sends depends on many
factors, such as titles, page content, JavaScript, Cascading Style
Sheets (CSS), and headings (as shown by the standard HTML markup of the
content), as well as metadata in HTML meta tags.
• Indexing means associating words and other definable tokens found on
web pages to their domain names and HTML-based fields. The associations
are made in a public database, made available for web search queries.
• A query from a user can be a single word. The index helps find information
relating to the query as quickly as possible.
• Some of the techniques for indexing and caching are trade secrets,
whereas web crawling is a straightforward process of visiting all sites on a
systematic basis.
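The crawl, robots.txt check, and indexing steps described above can be sketched together as follows. This is a simplified illustration, not how any particular search engine works. The pages are a hypothetical in-memory "web" rather than real sites, the `crawl` and `search` helpers are invented for this example, and the index maps each word to full page URLs rather than to domain names.

```python
import re
from collections import defaultdict, deque
from urllib.robotparser import RobotFileParser

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "http://site.test/": (
        "welcome to the web security course",
        ["http://site.test/spam", "http://site.test/private/notes"],
    ),
    "http://site.test/spam": (
        "spam is unsolicited email",
        ["http://site.test/phishing"],
    ),
    "http://site.test/phishing": (
        "phishing email imitates a trusted site",
        ["http://site.test/"],
    ),
    "http://site.test/private/notes": ("internal notes", []),
}

# Hypothetical robots.txt content for site.test.
ROBOTS_TXT = ["User-agent: *", "Disallow: /private/"]


def crawl(start, user_agent="toy-spider"):
    """Visit pages breadth-first and build a word -> URLs index."""
    robots = RobotFileParser()
    robots.parse(ROBOTS_TXT)  # the spider reads robots.txt first

    index = defaultdict(set)
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        # Skip unknown pages and pages that robots.txt disallows.
        if url not in PAGES or not robots.can_fetch(user_agent, url):
            continue
        text, links = PAGES[url]
        # Indexing: associate each word on the page with the page's URL.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
        # Follow links to pages not yet seen.
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, word):
    """Answer a single-word query from the index."""
    return sorted(index.get(word.lower(), set()))


index = crawl("http://site.test/")
print(search(index, "email"))
print(search(index, "notes"))
```

The query for "notes" finds nothing because the only page containing that word sits under `/private/`, which the sample robots.txt asks crawlers to skip. Looking a word up in the index avoids rescanning every page at query time, which is why the index helps answer queries quickly.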
How Search Engines Work
(Sherman 2003)
[Figure: a crawler visiting URL1, URL2, URL3, and URL4]
How do search engines work? Elaboration
• For a number of reasons, crawlers do not cover all of the web, only a fraction of it.
• What is not covered is the “invisible web”.