The Hypertext Transfer Protocol
Core Idea: HTTP (Hypertext Transfer Protocol) is a request/response protocol for communication
between a client (like your web browser) and a server (where websites live). It works like a
question-and-answer system.
o The client (e.g., your browser) sends a request to the server. This request includes:
Request Method: What action you want to perform (e.g., GET to retrieve a
page, POST to submit data).
Possible Body Content: Data being sent to the server (e.g., when
submitting a form).
o The server processes the request and sends back a response. This response includes:
Status Line:
Success or Error Code: e.g., 200 OK for success, 404 Not Found for
an error. These codes tell the client whether the request succeeded.
Possible Entity-Body Content: The actual data being sent back (e.g.,
the HTML code for a web page, an image, etc.).
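The request/response exchange above can be sketched in a few lines of Python. This is a minimal illustration rather than a real HTTP client: the function names build_request and parse_response, the host example.com, and the sample response bytes are all assumptions made for the example, and no network traffic is sent.

```python
def build_request(method: str, path: str, host: str, body: bytes = b"") -> bytes:
    """Serialize a minimal HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        # A body is only present for some requests (e.g., a submitted form).
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_response(raw: bytes) -> tuple[int, str, bytes]:
    """Split a raw response into (status code, reason phrase, entity body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    _version, code, reason = status_line.split(" ", 2)
    return int(code), reason, body


# Hypothetical request and a hand-written response, for illustration only.
request = build_request("GET", "/index.html", "example.com")
response = b"HTTP/1.1 404 Not Found\r\nContent-Length: 9\r\n\r\nNot Found"
status, reason, body = parse_response(response)
```

Here request begins with the request line "GET /index.html HTTP/1.1", and parse_response separates the status line from the entity body so the client can check the code before using the data.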
There are three common forms of intermediary: proxy, gateway, and tunnel. A proxy is a forwarding
agent, receiving requests for a URI in its absolute form, rewriting all or part of the message, and
forwarding the reformatted request toward the server identified by the URI. A gateway is a receiving
agent, acting as a layer above some other server(s) and, if necessary, translating the requests to the
underlying server's protocol. A tunnel acts as a relay point between two connections without
changing the messages; tunnels are used when the communication needs to pass through an
intermediary (such as a firewall) even when the intermediary cannot understand the contents of the
messages.
URIs have been known by many names: WWW addresses, Universal Document Identifiers, Universal
Resource Identifiers [3], and finally the combination of Uniform Resource Locators (URL) [4] and
Names (URN). As far as HTTP is concerned, Uniform Resource Identifiers are simply formatted strings
which identify--via name, location, or any other characteristic--a resource.
The "http" scheme is used to locate network resources via the HTTP protocol.
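As a small illustration, Python's standard urllib.parse module can split an "http" URI into the parts a client needs in order to locate the resource. The URI below is a made-up example, and falling back to port 80 and path "/" when they are absent is an assumption based on the usual defaults for the http scheme.

```python
from urllib.parse import urlsplit

# Hypothetical URI used only for illustration.
parts = urlsplit("http://www.example.com:8080/docs/index.html?lang=en")

scheme = parts.scheme      # which protocol to speak
host = parts.hostname      # which server to contact
port = parts.port or 80    # assumed default port for http when none is given
path = parts.path or "/"   # assumed default path when none is given
query = parts.query        # optional query string
```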
URI Comparison
When comparing two URIs to decide if they match or not, a client SHOULD use a case-sensitive
octet-by-octet comparison of the entire URIs, with a few exceptions (for example, scheme and
host names are compared case-insensitively).
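The comparison rule can be sketched as follows. This is a rough approximation, not a complete implementation: the function name http_uris_match is hypothetical, and the exceptions it applies (case-insensitive scheme and host, an omitted port treated as 80, an empty path treated as "/") are taken from the HTTP/1.1 specification rather than spelled out in full above. Percent-encoding equivalence is deliberately ignored.

```python
from urllib.parse import urlsplit


def http_uris_match(a: str, b: str) -> bool:
    """Compare two http URIs: case-sensitive except for scheme and host."""

    def key(uri: str):
        p = urlsplit(uri)
        return (
            p.scheme.lower(),            # scheme names compare case-insensitively
            (p.hostname or "").lower(),  # host names compare case-insensitively
            p.port or 80,                # an omitted port means the default port
            p.path or "/",               # an empty path is equivalent to "/"
            p.query,                     # everything else is case-sensitive
        )

    return key(a) == key(b)


same = http_uris_match("http://abc.com:80/~smith/home.html",
                       "http://ABC.com/~smith/home.html")
different = http_uris_match("http://example.com/A", "http://example.com/a")
```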
o Multiple TCP connections: The server required multiple TCP connections per client –
one to send data to the client and a new one for each incoming message from the
client.
o High Overhead: Each message from the client to the server had a full HTTP header,
adding unnecessary overhead.
o Complex Client-Side Scripting: The client-side script had to manage the mapping
between outgoing and incoming connections to track replies, increasing complexity.
Single TCP Connection: WebSocket solves these problems by providing a
single, persistent TCP connection for bidirectional communication.
Protocol Overview
Two Parts: The protocol consists of two main phases: the handshake and the
data transfer.
Handshake:
o The client and server perform a handshake to establish the WebSocket
connection. Both sides of the handshake are ordinary HTTP messages
that carry specific headers.
o Key headers like Upgrade, Connection, Sec-WebSocket-Key, Origin, Sec-
WebSocket-Protocol, and Sec-WebSocket-Version are used in the client's
handshake.
o The server responds with a "101 Switching Protocols" status code and
includes headers like Upgrade, Connection, Sec-WebSocket-Accept,
and Sec-WebSocket-Protocol.
Data Transfer:
o Once the handshake is successful, a two-way communication channel
is established. Both client and server can send data independently.
o Data is transferred in "messages," which are composed of one or more
"frames."
o Frames have a type: text, binary data, or control frames.
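To make the message/frame relationship concrete, here is a minimal sketch of frame encoding and decoding. The byte layout (a FIN bit and 4-bit opcode in the first byte, a 7-bit length in the second) and the opcode values come from the WebSocket specification (RFC 6455), not from the notes above. The helper names are hypothetical, and the sketch only handles unmasked frames with payloads up to 125 bytes; real client-to-server frames must be masked.

```python
# Opcode values as defined in RFC 6455.
OPCODES = {0x0: "continuation", 0x1: "text", 0x2: "binary",
           0x8: "close", 0x9: "ping", 0xA: "pong"}


def encode_small_frame(opcode: int, payload: bytes, fin: bool = True) -> bytes:
    """Build an unmasked frame; FIN marks the last frame of a message."""
    if len(payload) > 125:
        raise ValueError("this sketch only handles payloads up to 125 bytes")
    first = (0x80 if fin else 0x00) | opcode
    return bytes([first, len(payload)]) + payload


def decode_small_frame(frame: bytes) -> tuple[bool, str, bytes]:
    """Return (fin, frame type name, payload) for a small unmasked frame."""
    fin = bool(frame[0] & 0x80)
    opcode = frame[0] & 0x0F
    masked = bool(frame[1] & 0x80)
    length = frame[1] & 0x7F
    if masked or length > 125:
        raise ValueError("masked or extended-length frames are not handled here")
    return fin, OPCODES[opcode], frame[2:2 + length]


# One message sent as a single text frame.
frame = encode_small_frame(0x1, "Hello".encode("utf-8"))

# The same message split across two frames: a text frame without FIN,
# followed by a continuation frame with FIN set.
parts = [encode_small_frame(0x1, b"Hel", fin=False),
         encode_small_frame(0x0, b"lo")]
message = b"".join(decode_small_frame(p)[2] for p in parts)
```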
Imagine two people want to start a private conversation. They need to make
sure they both know it's a private conversation, and not just regular talking.
That's what the opening handshake does.
1. Client's Request (the "Knock"): The client (like your web browser) sends a
special HTTP request to the server. It's disguised as a normal web request (so
regular web servers can understand it), but it contains some key things:
o Upgrade: websocket: "Hey, I want to upgrade this connection to a
WebSocket!"
o Connection: Upgrade: "Yes, really upgrade!"
o Sec-WebSocket-Key: A random, base64-encoded value (a nonce) generated
for each connection. It is not a secret; the server uses it to
compute Sec-WebSocket-Accept.
o Origin: Where the request is coming from (like your website's address).
Browsers send it so the server can reject connections opened by pages
on sites it does not trust.
o Sec-WebSocket-Protocol: What kind of conversation we want to have.
Like "chat" or "game."
o Sec-WebSocket-Version: Which version of the WebSocket rules we're
using.
2. Server's Response (the "Password"): The server receives the request and,
if it understands WebSockets, responds with a special HTTP response:
o HTTP/1.1 101 Switching Protocols: "Okay, I understand WebSockets,
let's switch!" (101 is a special code meaning "switching protocols").
o Upgrade: websocket: "Yes, upgrading to WebSocket!"
o Connection: Upgrade: "Yes, really upgrading!"
o Sec-WebSocket-Accept: A special code calculated from the client's Sec-
WebSocket-Key. This is how the server proves it understands the
WebSocket protocol and isn't just some random server.
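The opening handshake can be sketched with the standard library alone. The GUID constant and the SHA-1 plus base64 calculation follow the WebSocket specification (RFC 6455); the function names, host, path, origin, and subprotocol are hypothetical values chosen for illustration, and nothing here opens a real connection.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def make_key() -> str:
    """Client side: 16 random bytes, base64-encoded."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def compute_accept(key: str) -> str:
    """Server side: base64(SHA-1(key + GUID))."""
    digest = hashlib.sha1((key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def client_handshake(host: str, path: str, key: str, origin: str, protocol: str) -> str:
    """The client's request (the "knock")."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        f"Origin: {origin}\r\n"
        f"Sec-WebSocket-Protocol: {protocol}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    )


def server_handshake(key: str, protocol: str) -> str:
    """The server's response, accepting the upgrade."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {compute_accept(key)}\r\n"
        f"Sec-WebSocket-Protocol: {protocol}\r\n"
        "\r\n"
    )


# Sample key taken from RFC 6455; a real client would call make_key().
sample_key = "dGhlIHNhbXBsZSBub25jZQ=="
request = client_handshake("server.example.com", "/chat", sample_key,
                           "http://example.com", "chat")
response = server_handshake(sample_key, "chat")
accept = compute_accept(sample_key)
```

A client would check that the Sec-WebSocket-Accept value in the response equals compute_accept applied to the key it sent, and abandon the connection otherwise.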
When either side wants to end the WebSocket connection, they perform a
closing handshake. It's much simpler than the opening one:
1. Close Frame: One side (client or server) sends a special "close" message
(called a "Close frame") to the other.
2. Acknowledge: The other side responds with its own "Close frame."
3. Connection Closed: Once both sides have sent and received a "Close
frame," they close the TCP connection.
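The closing handshake can be sketched in the same style. The Close opcode (0x8), the optional 2-byte status code at the start of the payload, and the code 1000 for a normal closure come from the WebSocket specification (RFC 6455); the function names are hypothetical, and the frames are left unmasked for simplicity, which is only valid for frames sent by a server.

```python
import struct


def encode_close_frame(code: int = 1000, reason: str = "") -> bytes:
    """Build an unmasked Close frame: FIN + opcode 0x8, then code and reason."""
    payload = struct.pack("!H", code) + reason.encode("utf-8")
    if len(payload) > 125:
        raise ValueError("control frame payloads are limited to 125 bytes")
    return bytes([0x88, len(payload)]) + payload


def decode_close_payload(payload: bytes) -> tuple[int, str]:
    """Extract (status code, reason) from a Close frame payload."""
    (code,) = struct.unpack("!H", payload[:2])
    return code, payload[2:].decode("utf-8")


# Step 1: one side sends a Close frame (1000 = normal closure).
close_frame = encode_close_frame(1000)

# Step 2: the other side reads it and answers with its own Close frame,
# here echoing the status code it received.
code, reason = decode_close_payload(close_frame[2:])
reply_frame = encode_close_frame(code, reason)

# Step 3: with a Close frame sent and received on both sides, the TCP
# connection can be closed (not shown; no sockets are used in this sketch).
```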