Chapter 1.
Web Techniques
HTTP Basics
The web runs on HTTP, the HyperText Transfer Protocol.
This protocol governs how web browsers request files from web servers and how the
servers send the files back.
When a web browser requests a web page, it sends an HTTP request message to a web
server.
The request message always includes some header information, and it sometimes also
includes a body.
The web server responds with a reply message, which always includes header
information and usually contains a body.
HTTP Request Header
The first line of an HTTP request looks like this:
GET /index.html HTTP/1.1
This line specifies an HTTP command, called a method, followed by the address of a
document and the version of HTTP being used.
In this case, the request is using the GET method to ask for the index.html document
using HTTP 1.1.
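As a rough illustration of the three parts just described, the request line can be split on spaces. This is a minimal sketch, not a full parser; the function name parse_request_line is hypothetical and chosen only for this example.

```python
def parse_request_line(line):
    """Split an HTTP request line into (method, target, version)."""
    parts = line.strip().split(" ")
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        raise ValueError("not an HTTP request line: %r" % line)
    method, target, version = parts
    return method, target, version


print(parse_request_line("GET /index.html HTTP/1.1"))
# ('GET', '/index.html', 'HTTP/1.1')
```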
After this initial line, the request can contain optional header information that gives the
server additional data about the request. For example:
User-Agent: Mozilla/5.0 (Windows 2000; U) Opera 6.0 [en]
Accept: image/gif, image/jpeg, text/*, */*
The User-Agent header provides information about the web browser, while the Accept
header specifies the MIME types that the browser accepts.
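To make the role of the Accept header concrete, the sketch below checks whether a given MIME type is covered by an Accept value. It is a simplification under stated assumptions: it treats each entry as a shell-style wildcard pattern and ignores any parameters after a semicolon. The names accepted_types and accepts are hypothetical.

```python
from fnmatch import fnmatchcase


def accepted_types(header_value):
    """Return the MIME patterns listed in an Accept header value."""
    # Drop anything after ';' (parameters), which this sketch ignores.
    return [item.split(";")[0].strip() for item in header_value.split(",")]


def accepts(header_value, mime_type):
    """True if mime_type matches any pattern in the Accept value."""
    return any(fnmatchcase(mime_type, pattern)
               for pattern in accepted_types(header_value))


print(accepted_types("image/gif, image/jpeg, text/*, */*"))
# ['image/gif', 'image/jpeg', 'text/*', '*/*']
print(accepts("image/gif, text/*", "text/html"))   # True
print(accepts("image/gif, text/*", "image/png"))   # False
```

Because the example header in the text ends with */*, that browser would accept any type; the shorter value used in the last two calls shows the wildcard matching more clearly.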
After any headers, the request contains a blank line, to indicate the end of the header
section.
The request can also contain additional data, if that is appropriate for the method being
used (e.g., with the POST method, as we'll discuss shortly).
If the request doesn't contain any data, it ends with that blank line.
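The layout described above (request line, headers, blank line, optional data) can be sketched as a small function that assembles the raw bytes of a request. This is an illustration only; build_request is a hypothetical helper, and the sketch assumes header lines are ASCII and are terminated with a carriage return and line feed, as HTTP/1.1 specifies.

```python
def build_request(method, target, headers, body=b""):
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    lines = ["%s %s HTTP/1.1" % (method, target)]
    lines += ["%s: %s" % (name, value) for name, value in headers]
    # The empty line after the headers marks the end of the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


raw = build_request("GET", "/index.html", [
    ("User-Agent", "Mozilla/5.0 (Windows 2000; U) Opera 6.0 [en]"),
    ("Accept", "image/gif, image/jpeg, text/*, */*"),
])
print(raw.decode("ascii"))
```

With no body, the message ends at the blank line; passing a non-empty body appends the data immediately after it.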
HTTP Response Header
The web server receives the request, processes it, and sends a response.
The first line of an HTTP response looks like this:
HTTP/1.1 200 OK
This line specifies the protocol version, a status code, and a description of that code. In
this case, the status code is "200", meaning that the request was successful (hence the
description "OK").
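The status line can be taken apart in the same way as the request line. The sketch below is a minimal example under the assumption that the line is well formed; parse_status_line and is_success are hypothetical names, and the 404 line is an extra example that does not come from the text.

```python
def parse_status_line(line):
    """Split an HTTP status line into (version, code, description)."""
    parts = line.strip().split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError("not an HTTP status line: %r" % line)
    version, code = parts[0], int(parts[1])
    # The description may contain spaces, or be missing entirely.
    reason = parts[2] if len(parts) == 3 else ""
    return version, code, reason


def is_success(code):
    """Status codes in the 200 range indicate a successful request."""
    return 200 <= code < 300


print(parse_status_line("HTTP/1.1 200 OK"))
# ('HTTP/1.1', 200, 'OK')
print(parse_status_line("HTTP/1.1 404 Not Found"))
# ('HTTP/1.1', 404, 'Not Found')
```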
After the status line, the response contains headers that give the client additional
information about the response. For example:
Date: Sat, 26 Jan 2002 20:25:12 GMT
Server: Apache 1.3.22 (Unix) mod_perl/1.26 PHP/4.1.0
Content-Type: text/html
Content-Length: 141
The Server header provides information about the web server software, while the
Content-Type header specifies the MIME type of the data included in the response.
After the headers, the response contains a blank line, followed by the requested data, if
the request was successful.
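Putting the pieces together, a response can be separated into its status line, headers, and data by splitting at the blank line. The following is a simplified sketch, assuming the whole response is already in memory as bytes; parse_response is a hypothetical helper, and the short HTML body and its Content-Length of 13 are made up for the example.

```python
def parse_response(raw):
    """Split raw response bytes into (status_line, headers, body)."""
    # The first blank line separates the header section from the data.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        # Split at the first colon only; values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


raw = (b"HTTP/1.1 200 OK\r\n"
       b"Date: Sat, 26 Jan 2002 20:25:12 GMT\r\n"
       b"Content-Type: text/html\r\n"
       b"Content-Length: 13\r\n"
       b"\r\n"
       b"<p>Hello</p>\n")

status_line, headers, body = parse_response(raw)
print(status_line)              # HTTP/1.1 200 OK
print(headers["content-type"])  # text/html
print(len(body))                # 13
```

Header names are lowercased here because they are case-insensitive in HTTP, which makes lookups predictable.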
The two most common HTTP methods are GET and POST.
The GET method is designed for retrieving information, such as a document, an image,
or the results of a database query, from the server. The GET method is what a web
browser uses when the user types in a URL or clicks on a link.
The POST method is meant for posting information, such as a credit-card number or
information that is to be stored in a database, to the server.
When the user submits a form, either the GET or POST method can be used, as specified
by the method attribute of the form tag.
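The practical difference between the two methods shows up in where the form data travels. The sketch below is an illustration rather than a complete client: with GET the encoded fields are appended to the address, while with POST they are sent as the data after the blank line. The function form_request, the /search path, and the field names are hypothetical; urlencode is the standard-library encoder.

```python
from urllib.parse import urlencode


def form_request(method, path, fields):
    """Build the text of a form submission using GET or POST."""
    data = urlencode(fields)
    if method == "GET":
        # GET: the encoded fields become part of the requested address.
        return "GET %s?%s HTTP/1.1\r\n\r\n" % (path, data)
    # POST: the encoded fields are sent as the request data.
    return ("POST %s HTTP/1.1\r\n"
            "Content-Type: application/x-www-form-urlencoded\r\n"
            "Content-Length: %d\r\n"
            "\r\n"
            "%s" % (path, len(data), data))


fields = {"q": "php", "page": "2"}
print(form_request("GET", "/search", fields))
print(form_request("POST", "/search", fields))
```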