Chap 1 Web Essentials
JEFFREY C. JACKSON
Jackson, Web Technologies: A Computer Science Perspective, 2007 Prentice-Hall, Inc. All rights reserved. 0-13-185603-0
INTRODUCTION
Server
The software that distributes the information, together with the machine where the information and software reside, is called the server.
Provides the requested service to the client
E.g., a Web server sends the requested Web page
Client
The software that resides on the remote machine, communicates with the server, fetches the information, processes it, and then displays it on the remote machine is called the client.
Initiates contact with the server (speaks first)
Typically requests service from the server
On the Web, the client is implemented in the browser
Web server: Software that delivers Web pages and other documents to browsers using the HTTP protocol
Web Page: A web page is a document or resource of information that is suitable for the World Wide Web and can be accessed through a web browser.
Website: A collection of pages on the World Wide Web that are accessible from the same URL and typically residing on the same server
1.1 The Internet
Technical origin: ARPANET (late 1960s)
Launched in 1969
Project of the U.S. Department of Defense (DoD)
One of the earliest efforts to network heterogeneous (different manufacturers and different operating systems), geographically dispersed computers
Email first available on ARPANET in 1972 (and quickly very popular!)
The Advanced Research Projects Agency Network (ARPANET) was one of the world's first operational packet switching networks, the first network to implement TCP/IP.
The network was initially funded by the Advanced Research Projects Agency (ARPA, later DARPA) within the U.S. Department of Defense for use by its projects at universities and research laboratories in the US.
The Internet
Open-access networks
Regional university networks (e.g., SURAnet)
CSNET for CS departments with no ARPANET access
Later, the ARPA Internet was allowed to connect to outside networks such as CSNET.
The connection between CSNET and ARPANET was made using the Phonenet (modem) approach.
This connection was asynchronous and involved long-distance calls.
Open-access networks
A full-service network provider offers Internet solutions for businesses small and large, residential users, and non-profit groups.
Regional Universities Network (RUN): a network of six universities primarily from regional Australia, with campuses in the Australian capital cities and some international campuses.
Southeastern Universities Research Association network (SURAnet): provided networking services for universities and industry. SURAnet was one of the first and one of the largest Internet providers in the United States.
The Computer Science Network (CSNET) was a computer network that began operation in 1981 in the United States. Its purpose was to extend networking benefits to computer science departments at academic and research institutions that could not be directly connected to ARPANET due to funding or authorization limitations. CSNET was funded by the National Science Foundation for an initial three-year period from 1981 to 1984.
Synchronous communication is said to occur when two parties communicate in real-time. Examples of synchronous communication include telephone calls and two-way radio communication.
In contrast, asynchronous communication is non-real-time communication. Examples include email, blog and message board postings, and text messaging.
Operated at only 56 kbit/s
Number of machines connected increased
Upgraded to 1.5 Mbit/s in 1988 and to 45 Mbit/s in 1991
The Internet
Internet: the network of networks connected via the public backbone and communicating using the TCP/IP communication protocol
Global communication network
Commercial Internet dial-up access offered
Economics: increased network usage reduced the unit cost
Backbone initially supplied by NSFNET; privately funded (ISP fees) beginning in 1995
Backbone then supplied by private telecommunication firms
Single Protocol
IP is designed for use both within local area networks (LANs) and between networks
IP address:
32-bit number (in IPv4)
Each device on the Internet has one or more IP addresses
Written as four dot-separated decimal numbers, e.g. 192.0.34.166
Each decimal number represents one byte of the IP address
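The relationship between the dotted form and the underlying 32-bit number can be sketched in a few lines of Python. This is only an illustration using the standard ipaddress module; the helper names are made up for this example.

```python
import ipaddress

def dotted_to_int(dotted):
    """Return the 32-bit integer behind a dotted IPv4 address."""
    return int(ipaddress.IPv4Address(dotted))

def int_to_dotted(value):
    """Return the dotted form of a 32-bit integer."""
    return str(ipaddress.IPv4Address(value))

def dotted_to_bytes(dotted):
    """Return the four byte values, one per dot-separated number."""
    return [int(part) for part in dotted.split(".")]

address = "192.0.34.166"            # example address used on the slide
number = dotted_to_int(address)
print(number)                       # the same address as one 32-bit number
print(dotted_to_bytes(address))     # one byte per decimal number
print(int_to_dotted(number))        # back to the dotted form
```

Each decimal number contributes eight bits, so the first byte is weighted by 2^24, the second by 2^16, and so on.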
IP function: transfer data from source device to destination device
IP source software creates a packet representing the data
  Header: source and destination IP addresses, length of data, etc.
  Data itself
If the destination is on another LAN, the packet is sent to a gateway that connects to more than one network
A gateway is a device that is connected to the source computer's network as well as to at least one other network.
The sequence of computers that a packet travels through from source to destination is known as its route.
How does the computer choose the next computer in the route for a packet?
A separate protocol BGP-4 is used to pass network connectivity information between gateways so that each computer can choose a good next hop for each packet it receives.
Limitations of IP:
No guarantee of packet delivery (packets can be dropped), so IP is unreliable
Communication is one-way (source to destination)
Checksum Calculation
Sender side:
1. Treat the segment contents as a sequence of 16-bit integers.
2. Add all of the 16-bit integers together. Call the result the sum.
3. The checksum is the 1's complement of the sum (in the 1's complement, every 0 bit becomes 1 and every 1 bit becomes 0).
4. The sender puts this checksum value in the UDP checksum field.
Receiver side:
1. Treat the received segment contents as a sequence of 16-bit integers.
2. Add all of the 16-bit integers together, then add the sender's checksum to that sum.
3. Check whether any bit of the result is 0. If so, an error has been detected and the receiver discards the packet.
Sender:
  1011101110111011   (data)
+ 0000111100001111   (data)
= 1100101011001010   (sum of all data)
  0011010100110101   (1's complement of the sum)
Header checksum: 0011010100110101
Receiver:
  1011101110111011   (data)
+ 0000111100001111   (data)
= 1100101011001010   (sum of all data)
+ 0011010100110101   (checksum)
= 1111111111111111   (a 0 in any bit position would mean an error occurred)
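The sender and receiver steps can be sketched in Python as follows. This is a minimal sketch, not a full UDP or IP implementation: the function names are invented for illustration, and the end-around carry (folding any overflow out of 16 bits back into the sum) is an added detail that the slide does not mention but that 1's complement addition normally requires.

```python
def ones_complement_sum(words):
    """Add 16-bit words, wrapping any carry out of bit 15 back into the sum."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return total

def checksum(words):
    """Sender side: 1's complement of the 16-bit sum."""
    return ~ones_complement_sum(words) & 0xFFFF

def verify(words, received_checksum):
    """Receiver side: data plus checksum must sum to all 1 bits."""
    return ones_complement_sum(list(words) + [received_checksum]) == 0xFFFF

data = [0b1011101110111011, 0b0000111100001111]   # the two words from the example
value = checksum(data)
print(format(value, "016b"))
print(verify(data, value))
print(verify([data[0] ^ 1, data[1]], value))      # one flipped bit is detected
```

With the two data words from the worked example, the computed checksum matches the header checksum shown there, and flipping a single data bit makes the receiver's check fail.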
[Figure: IP routing between Source and Destination across Network 1, Network 2, and Network 3, which are joined by gateways]
[Figure: IP routing between Source and Destination across LAN 1, the Internet backbone, and LAN 2, which are joined by gateways]
TCP is a higher-level protocol that extends IP to provide additional functionality
Reliable communication: TCP adds the concept of a connection on top of IP
Provides a guarantee that packets are delivered
Provides two-way (full duplex) communication
A and B both send messages to one another at the same time.
Reliable data transmission by demanding an ACK (acknowledgment) for each packet it sends via IP
Splitting longer messages into shorter ones
Reassembling them on the receiver side
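A toy model may help make splitting, acknowledgment, and reassembly concrete. The sketch below is not real TCP: the function names, the 8-byte segment size, and the way a lost packet is simulated are all assumptions chosen for illustration.

```python
def split_message(message, size):
    """Split a message into (sequence_number, chunk) segments."""
    return [(offset, message[offset:offset + size])
            for offset in range(0, len(message), size)]

def send_reliably(segments, lost_once):
    """Deliver all segments; resend any segment that gets no ACK."""
    lost = set(lost_once)        # segments whose first transmission is dropped
    pending = list(segments)
    received = {}
    transmissions = 0
    while pending:
        seq, chunk = pending.pop(0)
        transmissions += 1
        if seq in lost:          # packet dropped by IP, so no ACK arrives
            lost.discard(seq)
            pending.append((seq, chunk))   # sender queues a resend
            continue
        received[seq] = chunk    # receiver stores the chunk and ACKs it
    return received, transmissions

def reassemble(received):
    """Put the chunks back in sequence-number order."""
    return b"".join(received[seq] for seq in sorted(received))

message = b"Here's a packet. Here's another packet."
segments = split_message(message, 8)
received, transmissions = send_reliably(segments, lost_once={8})
print(reassemble(received) == message, transmissions)
```

The segment that is dropped arrives later than its neighbors, yet the sequence numbers let the receiver restore the original order.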
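[Figure: TCP exchange between Source and Destination. The connection is established; Source: "Here's a packet." Destination: "Got it."; Source: "Here's a packet." (no reply); Source: "Here's a resent packet." Destination: "Got it."]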
TCP also adds the concept of a port
The port concept allows TCP to communicate with many different applications on a machine.
The TCP header contains a port number representing an application program on the destination computer
Some port numbers have standard meanings
Example: port 25 is normally used for email transmitted using the Simple Mail Transfer Protocol (SMTP)
TCP
Other port numbers are available first-come, first-served to any application
Standard port numbers are assigned by IANA (Internet Assigned Numbers Authority)
0-1023: can be requested only by applications that are run by the system at boot-up
1024-65535: can be used by the first application on a system that requests the port
[Figure: TCP header]
Top-level domains are divided into subdomains
Domains are divided into second-level domains, which can be further divided into subdomains, etc.
E.g., in www.example.com, example is a second-level domain
Second-level domains are assigned by the registry operator for the top-level domain
A host name plus domain name information is called the fully qualified domain name (FQDN) of the computer
Above, www is the host name and www.example.com is the FQDN
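The naming terms can be illustrated with a small Python sketch that pulls an FQDN apart. It assumes, purely for illustration, that the first label is the host name and that the name has at least two labels; the function name is hypothetical.

```python
def split_fqdn(fqdn):
    """Break a fully qualified domain name into its named pieces."""
    labels = fqdn.rstrip(".").split(".")
    return {
        "host_name": labels[0],
        "domain": ".".join(labels[1:]),
        "second_level_domain": labels[-2],
        "top_level_domain": labels[-1],
    }

print(split_fqdn("www.example.com"))
```

For www.example.com this reports www as the host name, example as the second-level domain, and com as the top-level domain.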
Service names and port numbers are used to distinguish between different services that run over transport protocols such as TCP and UDP.
When a service (server program) is first started, it is said to bind to its designated port number. Any client program that wants to use that server must direct its requests to the same designated port number.
Port numbers range from 0 to 65535. Ports 0 to 1023 are reserved for use by certain privileged services.
For the HTTP service, port 80 is defined as the default, so it does not have to be specified in the Uniform Resource Locator (URL).
A registry operator (also called a Network Information Center (NIC)) is an entity that maintains the database of domain names for a given top-level domain and generates the zone files that convert domain names to IP addresses.
Zone file: a file on a root server that contains domain name registration information.
A master file contains all of the information related to one domain.
The nslookup program provides command-line access to DNS (on most systems)
Looking up a host name given an IP address is known as a reverse lookup
Recall that a single host may have multiple names. Only one of the names will be returned by a reverse lookup.
The name returned is the canonical name specified in the DNS system.
IP ~ the telephone network
TCP ~ calling someone who answers, having a conversation, and hanging up
UDP ~ calling someone and leaving a message
DNS ~ directory assistance (associating names with numbers)
Many protocols build on TCP
Telephone analogy: TCP specifies how we initiate and terminate the phone call, but some other protocol specifies how we carry on the actual conversation
Some examples:
  SMTP (email)
  FTP (file transfer)
  HTTP (transfer of Web documents)
The primary TCP-based protocol used for communication between web servers and browsers is HTTP
IP is the key component in the definition of the Internet; HTTP plays that role for the WWW
Unique feature of the Web: support for hypertext (text containing links)
Communication via the Hypertext Transfer Protocol (HTTP)
Document representation using the Hypertext Markup Language (HTML)
The Web is the collection of machines (Web servers) on the Internet that provide information, particularly HTML documents, via HTTP. Machines that access information on the Web are known as Web clients. A Web browser is software used by an end user to access the Web.
Communication protocol
HTTP is based on the request-response communication model:
  Client sends a request
  Server sends a response
The format of the messages is dictated by HTTP
HTTP sends the messages using TCP
HTTP is a stateless protocol: the protocol does not require the server to remember anything about the client's requests. Each request is executed independently, without any knowledge of the requests that came before it.
HTTP request message
The information transmitted using HTTP is often entirely text (readable form)
A request consists of a start line followed by a message header and an optional message body
Start line example: GET / HTTP/1.1
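As a rough illustration of that layout, the following Python sketch assembles a request message as bytes. The function name and parameters are hypothetical; the sketch relies on the HTTP/1.1 conventions that lines end with CRLF and that an empty line separates the header from the body.

```python
def build_request(method, request_uri, host, extra_headers=None, body=b""):
    """Assemble start line, header fields, blank line, and optional body."""
    lines = [f"{method} {request_uri} HTTP/1.1",   # start line
             f"Host: {host}"]                      # header field
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"         # blank line ends the header
    return head.encode("ascii") + body

request = build_request("GET", "/", "www.example.org")
print(request.decode("ascii"))
```

Because the message is plain text, the same bytes could be typed by hand into a raw TCP connection to a web server.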
Connect to a web server using telnet:
Connect:
  $ telnet www.example.org 80
  Trying 192.0.34.166
  Connected to www.example.com (192.0.34.166).
  Escape character is ^].
Send request:
  GET / HTTP/1.1
  Host: www.example.org
Receive response:
  HTTP/1.1 200 OK
  Date: Thu, 09 Oct 2003 20:30:49 GMT
1.4.2 HTTP version
HTTP/1.1 was formally defined in 1997
The version string for HTTP/1.1 must appear in the start line exactly as shown, with all capital letters and no embedded white space
1.4.3 Request-URI
Second part of the start line
The concatenation of the string http://, the value of the Host header field (www.example.org), and the Request-URI forms a string known as a URI
A URI is an identifier that is intended to be associated with a particular resource on the WWW.
The scheme of a URI is not case sensitive and is generally written in lowercase
A URI representing the location of a resource on the Web is called a URL.
Another type of URI, the URN, is designed to be a unique name for a resource.
Syntax: scheme : scheme-dependent-part
Ex: In http://www.example.com/ the scheme is http
URIs are of two types:
Uniform Resource Name (URN)
Can be used to identify resources with unique names, such as books (which have unique ISBNs)
Scheme is urn
A URN has three colon-separated parts:
  scheme name
  namespace identifier
  namespace-specific string
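The three-part structure of a URN can be shown with a short Python sketch. The URN used here is a hypothetical example built from the ISBN printed in the slide footer, and the function name is likewise invented.

```python
def split_urn(urn):
    """Split a URN into scheme name, namespace identifier, and namespace-specific string."""
    scheme, namespace_id, namespace_specific = urn.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError("not a URN: " + urn)
    return scheme, namespace_id, namespace_specific

# Hypothetical URN for illustration only
print(split_urn("urn:isbn:0-13-185603-0"))
```

Splitting at most twice keeps any further colons inside the namespace-specific string.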
Uniform Resource Locator (URL)
Specifies the location at which a resource can be found
In addition to http, some other URL schemes are https, ftp, mailto, and file
Method    Description
OPTIONS   Returns a list of HTTP methods that may be used to access the resource.
GET       Retrieves the resource at the requested URI, including the headers and body (that is, the content).
HEAD      Retrieves only the headers for the requested URI, not the body.
POST      Sends information to the server, typically from HTML forms.
PUT       Uploads the file indicated in the URI to a server.
DELETE    Deletes the resource at the URI from a server.
TRACE     Returns a copy of the complete HTTP request message for test purposes.
1.4.4 Header fields and MIME Types
The Host header field is required in every HTTP/1.1 request message
Each header field begins with a field name, such as Host, followed by a colon and then the field value
Header field structure: field name : field value
Syntax:
  Field name is not case sensitive
  Field value may continue on multiple lines by starting continuation lines with white space
  Field values may contain MIME types, quality values, and wildcard characters (*'s)
[Figure: HTTP request consisting of a start line, 10 header fields, and a short message body]
Header field features:
Header field names are not case sensitive
A header field value may wrap onto several lines
Many header field values use MIME types
Many header field values use quality values to indicate preferences
A quality value is specified by a string of the form q=num, where num is a decimal number between 0 and 1
Multipurpose Internet Mail Extensions (MIME)
Standard used to pass a variety of information, including graphics and applications, through e-mail as well as through other Internet message protocols.
A MIME content type has two parts: a top-level type and a subtype
The content type is a case-insensitive string
A private (nonstandard) type or subtype is indicated by an x- or X- prefix
MIME content type syntax: top-level type / subtype
Examples: text/html, image/jpeg
HTTP Quality Values and Wildcards
Example header field with quality values:
  accept: text/xml,text/html;q=0.9, text/plain;q=0.8, image/jpeg, image/gif;q=0.2,*/*;q=0.1
A quality value applies to the item it is attached to; an item with no quality value defaults to q=1
The higher the value, the higher the preference
Note the use of wildcards to specify quality 0.1 for any MIME type not specified earlier
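A server could rank the client's preferences roughly as in the Python sketch below. It assumes the HTTP/1.1 rule that a q parameter belongs to the single item it follows and that an item without one has quality 1; the function name is made up, and real servers also weigh how specific each media range is, which this sketch ignores.

```python
def parse_accept(value):
    """Return (mime_type, quality) pairs, most preferred first."""
    preferences = []
    for item in value.split(","):
        parts = [part.strip() for part in item.split(";")]
        mime_type, quality = parts[0], 1.0     # q defaults to 1 when omitted
        for parameter in parts[1:]:
            name, _, number = parameter.partition("=")
            if name.strip().lower() == "q":
                quality = float(number)
        preferences.append((mime_type, quality))
    # sorted() is stable, so equal-quality items keep their header order
    return sorted(preferences, key=lambda pair: pair[1], reverse=True)

header = ("text/xml,text/html;q=0.9, text/plain;q=0.8, "
          "image/jpeg, image/gif;q=0.2,*/*;q=0.1")
for mime_type, quality in parse_accept(header):
    print(mime_type, quality)
```

The wildcard entry ends up last, so it only matters for MIME types that none of the earlier entries cover.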
Common header fields:
Host: host name from URL (required)
User-Agent: type of browser sending the request
Accept: MIME types of acceptable documents
Connection: value close tells the server to close the connection after a single request/response
Content-Type: MIME type of the (POST) body, normally application/x-www-form-urlencoded
Content-Length: number of bytes in the body
Referer: URL of the document containing the link that supplied the URI for this HTTP request
Status code
Three-digit number
First digit is the class of the status code:
  1 = Informational (provides information to the client)
  2 = Success
  3 = Redirection (an alternate URL is supplied)
  4 = Client Error (request not valid)
  5 = Server Error (error occurred during server processing)
Other two digits provide additional information
Code   Meaning
200    OK
301    Moved Permanently
307    Temporary Redirect
401    Unauthorized
403    Forbidden
404    Not Found
500    Internal Server Error
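The way the first digit selects the class can be sketched with Python's standard http.HTTPStatus enumeration, which supplies the reason phrases. The STATUS_CLASSES table and the describe function are illustrative names, not part of any library.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def describe(code):
    """Combine the standard reason phrase with the class given by the first digit."""
    status_class = STATUS_CLASSES[code // 100]
    return f"{code} {HTTPStatus(code).phrase} ({status_class})"

for code in (200, 301, 307, 401, 403, 404, 500):
    print(describe(code))
```

Integer division by 100 isolates the first digit, so a client can react sensibly to the class even when it does not recognize the specific code.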
1.5.3 Cache Control
A cache is a local copy of information obtained from some other source
A copy of the information is placed in the cache to improve system performance
Ex: an icon appearing multiple times in a Web page
Advantages
  Most web browsers use a cache to store requested resources so that subsequent requests for the same resource will not necessarily require an HTTP request/response
  HTTP caching, when successful, leads to quicker display by the browser
  Reduced network communication
  Reduced load on the web server
Drawbacks
  Information in a cache can become invalid
Validating a cached resource:
Send an HTTP HEAD request and check the Last-Modified or ETag header in the response
Compare the current date/time with the Expires header sent in the response that contained the resource
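The second approach, comparing the clock against the Expires header, might look like this in Python. The sketch uses the standard email.utils parser for HTTP-style dates; the function name is hypothetical and the dates are made up for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def is_fresh(expires_header, now):
    """Return True while the cached copy has not yet reached its Expires time."""
    expires = parsedate_to_datetime(expires_header)
    if expires.tzinfo is None:                  # treat a zone-less date as GMT
        expires = expires.replace(tzinfo=timezone.utc)
    return now < expires

expires = "Thu, 09 Oct 2003 20:30:49 GMT"       # illustrative Expires value
print(is_fresh(expires, datetime(2003, 10, 9, 20, 0, 0, tzinfo=timezone.utc)))
print(is_fresh(expires, datetime(2003, 10, 9, 21, 0, 0, tzinfo=timezone.utc)))
```

While the copy is fresh the browser can use it without contacting the server at all; once it has expired, a HEAD request is one way to check whether the resource actually changed.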
UNICODE.
Character encoding: a bit string that must be decoded into a code-point integer, which is then mapped to a character according to the definition provided by some character set.
An encoding represents code points using variable-length byte strings
The most common examples are the Unicode-based encodings UTF-8 and UTF-16
IANA maintains a complete list of Internet-recognized character sets/encodings
Some header fields have character set values:
Accept-Charset: request header listing the character sets that the client can recognize
Ex: accept-charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Content-Type: can include the character set used to represent the body of the HTTP message
Ex: Content-Type: text/html; charset=UTF-8
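The difference between a code point and its encoded bytes can be seen in a short Python sketch. The three sample characters and the function name are chosen only for illustration; UTF-16 is shown in big-endian byte order without a byte-order mark.

```python
def encodings(character):
    """Return the code point and the UTF-8 and UTF-16 bytes (as hex) of one character."""
    code_point = ord(character)
    utf8 = character.encode("utf-8")
    utf16 = character.encode("utf-16-be")
    return f"U+{code_point:04X}", utf8.hex(), utf16.hex()

for character in ["A", "\u00e9", "\u20ac"]:     # A, e with acute accent, euro sign
    print(encodings(character))

# Decoding reverses the process: bytes -> code point -> character
print(b"\xe2\x82\xac".decode("utf-8") == "\u20ac")
```

The same code point takes one, two, or three bytes in UTF-8 depending on its value, which is what "variable-length" refers to.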
A typical US PC produces ASCII documents
The US-ASCII character set can be used for such documents, but is not recommended
Many possible web clients:
Text-only browsers (lynx)
Mobile phones
Robots (software-only clients, e.g., search engine crawlers), which are not designed to be used directly by humans at all
etc.
User agent: any web client that is designed to directly support user access to web servers.
Netscape was acquired by America Online
Mozilla Firefox was launched
All the major modern browsers support a common set of basic user features
They provide similar support for HTTP communication
The browser window is split into several rectangular regions known as bars
Five standard regions in Mozilla 1.4:
Client area (primary region): displays the document
Title bar: title assigned by the document author to the document currently displayed within the client area
Menu bar: drop-down menus and GUI
Navigation toolbar: push-button controls (Back, Forward, Stop, Print, and Reload)
Contains a text box known as the Location bar
The user can enter a URL to request that the browser display the document located at that URL.
Status bar: displays messages and icons related to the status of the browser
The browser makes HTTP requests on behalf of the user
Browser primary tasks:
Reformat the URL entered as a valid HTTP request message
If the server is specified by host name, use DNS to obtain its IP address
Establish a TCP connection using the IP address of the specified server
Send the HTTP request over the TCP connection and wait for the response
Display the document contained in the response
Render (appropriately display) documents returned by a server
1.6.2 URLs
An http-scheme URL consists of a number of pieces
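How a browser might pull such a URL apart can be sketched with Python's standard urllib.parse module. The URL below is a hypothetical example, and request_parts is an illustrative helper that assumes port 80 when no port is given and uses / when the path is empty.

```python
from urllib.parse import urlsplit

def request_parts(url):
    """Split an http URL into the pieces a browser needs to make a request."""
    pieces = urlsplit(url)
    path = pieces.path or "/"                   # "/" is used if no path is supplied
    request_uri = path + ("?" + pieces.query if pieces.query else "")
    return {
        "scheme": pieces.scheme,
        "host": pieces.hostname,                # used with the port to open the TCP connection
        "port": pieces.port or 80,              # assumed default for http
        "request_uri": request_uri,             # goes into the request start line
        "fragment": pieces.fragment,            # kept by the browser, never sent
    }

# Hypothetical URL for illustration only
print(request_parts("http://www.example.org:56789/a/b/c.txt?t=win&s=chess#para5"))
print(request_parts("http://www.example.org"))
```

Only the request_uri value travels to the server in the start line; the host and port are used to open the connection, and the fragment stays in the browser.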
The browser uses the authority to connect via TCP
The Request-URI is included in the start line (/ is used for the path if none is supplied)
The fragment identifier is not sent to the server (it is used to scroll the browser client area)
1.6.3 User Controllable Features
Graphical browser features:
Save: most documents can be saved by the user to the client machine's file system.
File | Save Page As
Find in Page: standard documents (text and HTML) can be searched with a function similar to that of word processors
Edit | Find in This Page
Automatic Form Filling: the browser can remember information entered on certain forms (billing address, phone numbers)
Edit | Save Form Info
Edit | Fill in Form
Tools | Form Manager
Preferences: the user can customize browser functionality in a wide variety of ways
Edit | Preferences
Some preference settings are:
Accept-Language: language for web pages (Navigator | Languages)
Default character set/encoding: the character set for web documents (Navigator | Languages: Character Coding)
Cache properties: amount of local storage allocated to the cache (Advanced | Cache: Set Cache Options)
HTTP settings: the version of HTTP used and whether or not the client will keep connections alive (Advanced | HTTP Networking: Direct Connection Options)
Style definition
View | Text Zoom
View | Use Style
Document meta-information
View | Page Source: raw HTML
View | Page Info: meta-information
Themes: look of one or more browser bars (skin)
View | Apply Theme | Get New Themes
History: automatically maintains a list of all pages visited within the last several days
Go | History
Bookmarks: save the URL of a page for an indefinite length of time
1.6.4 Additional Functionality
Automatic URL completion
Script execution: browsers run programs to perform a variety of tasks (e.g., validation)
Event handling: clicking on a link or button is the occurrence of an event (button clicks and mouse movement)
Management of form GUI: when a web page contains a form with fill-in fields, the browser allows the user to perform standard text-editing functions (button image, text cursor)
Secure communication: when the user sends sensitive information (e.g., a credit card number) to a server, the browser encodes this information to keep it from being read by any other machine
Plug-in execution: support for a plug-in protocol; display of non-HTML documents (e.g., PDF) via plug-ins
Help | About Plug-ins
1.7 WEB SERVERS
Tomcat 5.0
1.7.1 Server Features
Accept HTTP requests from web clients and return an appropriate resource in the HTTP response
Basic functionality:
The server calls on TCP software and waits for connection requests to one or more ports
When a connection request is received, the server dedicates a subtask to it
The subtask establishes the connection and receives the request
The subtask examines the Host header field to determine the host and invokes the software for this host (virtual host software)
The virtual host software maps the Request-URI to a specific resource on the server.
Log information about the request and response, such as the IP address and the status code, in a plain-text file.
If the TCP connection is kept alive, the server subtask continues to monitor the connection until the client sends another request or initiates a connection close.
A few definitions
Concurrency: all modern servers process multiple requests concurrently, with multiple copies of the server running simultaneously
Subtask: a single copy of the server software handling a single client connection
Virtual host: every HTTP request includes a Host header field; multiple host names may be mapped by DNS to a single IP address, and the web server determines which virtual host is being requested by examining the Host header field.
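Virtual host selection can be sketched as a lookup keyed on the Host header field. In the Python sketch below the host names, the document-root paths, and the function name are all hypothetical; a real server would also handle unknown hosts and malformed requests more gracefully.

```python
# Hypothetical mapping from virtual host name to that host's document root
VIRTUAL_HOSTS = {
    "www.example.org": "/srv/example-org",
    "www.example.com": "/srv/example-com",
}

def select_virtual_host(request_text):
    """Pick the virtual host named by the Host header field of a raw request."""
    header_lines = request_text.split("\r\n\r\n", 1)[0].split("\r\n")[1:]
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":       # field names are not case sensitive
            host = value.strip().lower().split(":")[0]   # drop an optional port
            return VIRTUAL_HOSTS[host]
    raise ValueError("HTTP/1.1 request without a Host header field")

request = "GET / HTTP/1.1\r\nhost: www.example.org\r\n\r\n"
print(select_virtual_host(request))
```

Both host names could resolve to the same IP address; the Host header field is what tells them apart once the connection has been made.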
A number of IIS and Apache servers run Java programs
When running a Java program, both servers are configured to run the program using separate software called a servlet container
The servlet container provides the JVM that runs the Java programs (known as servlets)
It provides communication between the servlet and the Apache or IIS server
Tomcat is a popular free, open-source servlet container from the Apache Software Foundation
Tomcat can also run as a standalone web server that communicates directly with web clients
Tomcat 5.0 Web Server
Coyote: provides HTTP 1.1 communication
Catalina: the actual servlet container
Coyote parameters affecting external communication:
IP addresses and TCP ports
Number of subtasks created when the server is initialized
Maximum number of threads allowed to exist simultaneously
Maximum number of TCP connection requests that will be queued if the server is running its maximum number of threads. If the queue is full, a received connection request is refused.
Keep-alive time for inactive TCP connections
The settings of these parameters affect the performance of the server.
Tuning the server: changing the values of these and similar parameters in order to optimize performance
Tuning is done by trial and error
Load generation or stress test tools, which simulate requests to a web server, are helpful for experimenting with tuning parameters
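The interaction between the thread limit and the connection queue can be modeled with a tiny Python function. This is a toy model rather than Tomcat's actual logic; the function, its parameter names, and the numbers in the example are assumptions made for illustration.

```python
def admit(active_threads, queued, max_threads, accept_count):
    """Decide what happens to a newly arrived TCP connection request."""
    if active_threads < max_threads:
        return "handled"        # a thread is free to serve the request
    if queued < accept_count:
        return "queued"         # all threads busy, but the wait queue has room
    return "refused"            # threads busy and queue full

# Illustrative values: 75 threads allowed, queue length 10
print(admit(active_threads=74, queued=0, max_threads=75, accept_count=10))
print(admit(active_threads=75, queued=9, max_threads=75, accept_count=10))
print(admit(active_threads=75, queued=10, max_threads=75, accept_count=10))
```

Raising either limit lets the server absorb larger bursts at the cost of more memory and longer waits, which is why such settings are usually adjusted experimentally under simulated load.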
Internal Catalina parameters affect functionality:
Which client machines may send HTTP requests to the server
Which virtual hosts are listening for TCP connections
What logging will be performed
How the Request-URI is mapped to the server's resources
Password protection of resources
Use of server-side caching
Install Tomcat 5.0 at the default port 8080
Open a browser and browse to the URL http://localhost:8080
Tomcat is included in JWSDP
Find the JWSDP Service entry in the list on the left side
Click on the icon to reveal the associated server components
The Service has five components: Connector, Host, Logger, Realm, and Valve
Jackson, Web Technologies: A Computer Science Perspective, 2007 Prentice-Hall, Inc. All rights reserved. 0-13-185603-0
The Connector is a Coyote component that handles HTTP communication
Clicking on the Connector produces a window containing drop-down menus of the possible actions that can be performed for this component
Connector Attributes
When you create or modify any type of Connector, the attributes shown in the following table may be set as needed.
Common Connector Attributes
Attribute: Description
Accept Count: Length of the TCP connection wait queue
Connection Timeout: The number of milliseconds this Connector will wait after accepting a connection. The default value is 60000 (i.e. 60 seconds).
IP Address: Specifies which address will be used for listening on the specified port, for servers with more than one IP address.
Port Number: Port number on which this Connector will listen for TCP connection requests
Minimum: The number of request processing threads that will be created when this Connector is first started. The default value is 5.
Maximum: The maximum number of request processing threads to be created by this Connector, which therefore determines the maximum number of simultaneous requests that can be handled. If not specified, this attribute is set to 75.
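Several of these attributes correspond to ordinary TCP socket settings. The sketch below uses Python's standard `socket` module to show roughly where an address, port, wait-queue length, and connection timeout would appear in a hand-written server. It is an analogy, not Tomcat code; the function names and default arguments are hypothetical.

```python
import socket


def open_listener(address="127.0.0.1", port=8080, accept_count=10):
    """Create a listening TCP socket (address, port, and queue length)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((address, port))   # IP Address and Port Number
    sock.listen(accept_count)    # Accept Count: length of the wait queue
    return sock


def accept_one(listener, timeout_ms=60000):
    """Accept one connection and apply a timeout to it."""
    conn, peer = listener.accept()
    # Connection Timeout: how long to wait on this accepted connection.
    conn.settimeout(timeout_ms / 1000.0)
    return conn, peer


if __name__ == "__main__":
    # Port 0 asks the operating system for any free port.
    listener = open_listener(port=0)
    print("listening on", listener.getsockname())
    listener.close()
```

The minimum and maximum thread counts have no single socket-level equivalent; in a hand-written server they would be properties of whatever thread pool handles the accepted connections.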
Configuring Host Elements
The Host element represents a virtual host, which is an association of a network name for a server (such as www.mycompany.com) with the particular server on which Tomcat is running.
Host Attributes
The attributes shown in the following table may be viewed, set, or modified for a Host.
Attribute: Description
Name: FQDN (fully qualified domain name) that clients will use to access the virtual host
Application Base: Directory containing web applications. This is the path name of a directory that may contain web applications to be deployed on this virtual host. You may specify an absolute path name for this directory, or a path name that is relative to the directory under which Tomcat is installed.
Deploy on Startup: Boolean value indicating whether or not web applications should be automatically initialized when the server starts
Web Applications
A collection of files and programs that work together to provide particular functions to web users
Absolute path name
Traces the path from the root directory (/). Absolute path names always begin with the slash (/) symbol.
Relative path name
Traces the path from the current directory through its parent or its subdirectories and files.
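The difference between the two kinds of path name can be made concrete with a short sketch of how an application base directory might be resolved. It uses Python's `posixpath` module so that the slash rules are the same on every platform; the function name and the directory names are hypothetical.

```python
import posixpath


def resolve_app_base(app_base, install_dir):
    """Resolve an application base directory to an absolute path.

    An absolute path name (starting with /) is used as given; a relative
    path name is interpreted relative to the installation directory.
    """
    if posixpath.isabs(app_base):
        return posixpath.normpath(app_base)
    return posixpath.normpath(posixpath.join(install_dir, app_base))


if __name__ == "__main__":
    home = "/opt/tomcat"  # hypothetical installation directory
    for base in ("webapps", "/srv/apps", "../shared/apps"):
        print(base, "->", resolve_app_base(base, home))
```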
1.7.5 Logging
Web server logs record information about server activity
An access log is a file that records information about every HTTP request processed by the server
Message logs record a variety of debugging and other information generated by the web server
Access logging is performed by adding a Valve component
The primary fields are given in the table:
Logger Attributes
Attribute: Description
Directory: Where the log file will be written
Pattern: Information to be written to the log
Prefix: The prefix added to the start of each log file's name
Suffix: The suffix added to the end of each log file's name
Further settings control whether all logged messages are date and time stamped (true or false) and whether the IP address or the host name is written to the log file.
Tomcat writes the log information to a log file in plain text format. In general, a log entry has the following format:
%h %l %t %r %s %b
%h - Remote host name
%l - Remote logical user name
%t - Date and time, in Common Log Format
%r - First line of the request (method and request URI)
%s - HTTP status code of the response
%b - Bytes sent in the body of the response, excluding HTTP headers
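A log entry in this format can be split back into its fields with a regular expression. The following is a minimal sketch, assuming the date and time appear in square brackets (as in Common Log Format) and that a dash may stand in for a zero byte count. The sample line is made up for illustration, not copied from a real log.

```python
import re

# One named group per field of the pattern %h %l %t %r %s %b.
LOG_LINE = re.compile(
    r"^(?P<host>\S+) (?P<logname>\S+) \[(?P<time>[^\]]+)\] "
    r"(?P<request>.+) (?P<status>\d{3}) (?P<bytes>\S+)$"
)


def parse_entry(line):
    """Return the fields of one access log line as a dict, or None."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # Assumption: "-" means that no body bytes were sent.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


# Made-up sample entry.
SAMPLE = "127.0.0.1 - [10/Oct/2005:13:55:36 -0700] GET /index.html HTTP/1.1 200 2326"

if __name__ == "__main__":
    print(parse_entry(SAMPLE))
```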
Since HTTP messages typically travel over a public network, private information (such as credit card numbers) should be encrypted to prevent eavesdropping
The https URL scheme tells the browser to use encryption
Common encryption standards:
Secure Sockets Layer (SSL)
Transport Layer Security (TLS)
Secure Servers
[Figure: HTTP requests and HTTP responses pass through a TLS/SSL layer on each side of the connection to the Web Server. Messages shown, in order: "I'd like to talk securely to you (over port 443)", "Here's an encrypted HTTP request", "Here's an encrypted HTTP response".]
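The role of the URL scheme can be sketched with Python's standard library: the scheme decides whether encryption is used and which port is contacted by default. The function names below are hypothetical, and the sketch opens no network connection.

```python
import ssl
from urllib.parse import urlsplit

# Well-known default ports for the two schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_plan(url):
    """Work out host, port, and whether encryption is needed for a URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return {
        "host": parts.hostname,
        "port": port,
        "encrypted": parts.scheme == "https",
    }


def make_tls_context():
    """Client-side TLS settings with certificate and host name checks on."""
    return ssl.create_default_context()


if __name__ == "__main__":
    print(connection_plan("https://www.example.org/"))
    print(connection_plan("http://localhost:8080/"))
    context = make_tls_context()
    print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

A client that wanted to send an encrypted request would wrap its TCP socket with such a context before writing the HTTP request.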
[Figure: Browser and the real www.example.org]