6.5.1. Proxy URIs Differ from Server URIs

This document discusses some tricky aspects of proxy requests, including: 1) Proxy URIs contain the full URI while server URIs only contain a partial URI without the scheme, host, or port. 2) Proxies must handle both full URIs from explicit proxies and partial URIs with a Host header from server requests. 3) Intercepting proxies may receive partial URIs if clients are unaware they are talking to a proxy. 4) The Via header tracks the path a message takes through intermediate proxies by listing each proxy or gateway the message passes through.

Uploaded by

Trung Trần
Copyright © All Rights Reserved
6.5. Tricky Things About Proxy Requests

6.5.1. Proxy URIs Differ from Server URIs


• Web server and web proxy messages have the same syntax,
with one exception. The URI in an HTTP request message
differs when a client sends the request to a server instead of
a proxy.
• When a client sends a request to a web server, the request
line contains only a partial URI (without a scheme, host, or
port). For example:
GET /index.html HTTP/1.0

• When a client sends a request to a proxy, however, the
request line contains the full URI. For example:
GET http://www.oreilly.com/index.html HTTP/1.0

6.5.2. The Same Problem with Virtual Hosting
• The proxy “missing scheme/host/port” problem is the same problem faced by virtually hosted web servers.
Virtually hosted web servers share the same physical web server among many web sites. When a request comes
in for the partial URI /index.html, the virtually hosted web server needs to know the hostname of the intended
web site. Although the problems are similar, they were solved in different ways:
o Explicit proxies solve the problem by requiring a full URI in the request message.
o Virtually hosted web servers require a Host header to carry the host and port information.

6.5.3. Intercepting Proxies Get Partial URIs


• As long as clients properly implement HTTP, they will send full URIs in requests to explicitly configured proxies.
That solves part of the problem, but there’s a catch: a client will not always know it’s talking to a proxy, because some
proxies may be invisible to the client. Even if the client is not configured to use a proxy, the client’s traffic may still go
through a surrogate or intercepting proxy. In both of these cases, the client will think it’s talking to a web server and
won’t send the full URI.
6.5.4. Proxies Can Handle Both Proxy and Server Requests
• Because of the different ways that traffic can be redirected into proxy servers, general-purpose
proxy servers should support both full URIs and partial URIs in request messages. The proxy
should use the full URI if it is an explicit proxy request or use the partial URI and the virtual Host
header if it is a web server request.

• The rules for using full and partial URIs are:


o If a full URI is provided, the proxy should use it.
o If a partial URI is provided, and a Host header is present, the Host header should be used to determine
the origin server name and port number.
o If a partial URI is provided, and there is no Host header, the origin server needs to be determined in
some other way.
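
The three rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not a production parser: the function name resolve_origin and the dict-based headers argument are assumptions made for the example, and Host values containing IPv6 literals are not handled.

```python
from urllib.parse import urlsplit

# Default ports assumed when the URI or Host header does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def resolve_origin(request_target, headers):
    """Return (host, port, path) for the origin server, or None.

    request_target is the URI from the request line; headers is a plain
    dict of header names to values (a hypothetical representation).
    """
    parts = urlsplit(request_target)

    # Rule 1: a full URI was provided, so use it.
    if parts.scheme and parts.hostname:
        port = parts.port or DEFAULT_PORTS.get(parts.scheme, 80)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        return parts.hostname, port, path

    # Rule 2: partial URI plus Host header; the header names the origin.
    lowered = {name.lower(): value for name, value in headers.items()}
    host_header = lowered.get("host")
    if host_header:
        host, sep, port_text = host_header.strip().partition(":")
        port = int(port_text) if sep else 80
        return host, port, request_target

    # Rule 3: no Host header; the caller must find the origin some other way.
    return None
```

Returning None for the third case leaves the fallback policy to the caller, since the text does not prescribe how the origin server should be determined there.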
6.5.5. In-Flight URI Modification
• Proxy servers need to be very careful about changing the request URI as they forward messages. Slight changes in the URI,
even if they seem benign, may create interoperability problems with downstream servers.

• In general, proxy servers should strive to be as tolerant as possible. They should not aim to be “protocol policemen” looking
to enforce strict protocol compliance, because this could involve significant disruption of previously functional services.

6.5.6. URI Client Auto-Expansion and Hostname Resolution


• Browsers resolve request URIs differently, depending on whether or not a proxy is present. Without a proxy, the browser
takes the URI you type in and tries to find a corresponding IP address. If the hostname is found, the browser tries the
corresponding IP addresses until it gets a successful connection.

• But if the host isn’t found, many browsers attempt to provide some automatic “expansion” of hostnames, in case you typed
in a “shorthand” abbreviation of the hostname.
6.5.7. URI Resolution Without a Proxy
Figure 6-16 shows an example of browser hostname auto-
expansion without a proxy. In steps 2a–3c, the browser looks
up variations of the hostname until a valid hostname is found.
Here’s what’s going on in this figure:
• In Step 1, the user types “oreilly” into the browser’s URI
window. The browser uses “oreilly” as the hostname and
assumes a default scheme of “http://”, a default port of
“80”, and a default path of “/”.
• In Step 2a, the browser looks up host “oreilly.” This fails.
• In Step 3a, the browser auto-expands the hostname and asks
the DNS to resolve “www.oreilly.com.” This is successful.
The browser then successfully connects to www.oreilly.com.
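
The lookup sequence in these steps can be approximated as follows. This is a rough sketch under stated assumptions: real browsers use their own, varying expansion heuristics, and the lookup argument is a hypothetical resolver function (hostname in, list of IP addresses out, empty if unknown) supplied by the caller so that the example does not depend on live DNS.

```python
def expand_and_resolve(typed_host, lookup):
    """Try the hostname as typed, then a "www...com" expansion.

    Returns (hostname, addresses) for the first name that resolves,
    or None if no candidate resolves.
    """
    candidates = [typed_host]
    # Only expand bare names such as "oreilly"; a name that already
    # contains a dot is assumed to be complete.
    if "." not in typed_host:
        candidates.append("www." + typed_host + ".com")

    for name in candidates:
        addresses = lookup(name)
        if addresses:
            return name, addresses
    return None


# Usage with a stand-in resolver; the address is a documentation-range
# placeholder, not the real address of any host.
fake_dns = {"www.oreilly.com": ["192.0.2.10"]}
result = expand_and_resolve("oreilly", lambda host: fake_dns.get(host, []))
```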
6.5.8. URI Resolution with an Explicit Proxy
When you use an explicit proxy, the browser no longer performs any of these convenience
expansions, because the user’s URI is passed directly to the proxy.
• As shown in Figure 6-17, the browser does not auto-expand
the partial hostname when there is an explicit
proxy. As a result, when the user types “oreilly” into the
browser’s location window, the proxy is sent
“http://oreilly/” (the browser adds the default scheme
and path but leaves the hostname as entered).
• For this reason, some proxies attempt to mimic as many
of the browser’s convenience services as they can,
including “www...com” auto-expansion and
addition of local domain suffixes.
6.5.9. URI Resolution with an Intercepting Proxy
• In Step 1, the user types “oreilly” into the browser’s URI location
window.
• In Step 2a, the browser looks up the host “oreilly” via DNS, but the
DNS server fails and responds that the host is unknown, as shown in
Step 2b.
• In Step 3a, the browser does auto-expansion, converting “oreilly”
into “www.oreilly.com.” In Step 3b, the browser looks up the host
“www.oreilly.com” via DNS. This time, as shown in Step 3c, the
DNS server is successful and returns IP addresses back to the
browser.
• In Step 4a, the client already has successfully resolved the hostname
and has a list of IP addresses. Its connection attempt, however, is
terminated by the intercepting proxy rather than by the origin server.
• When the proxy finally is ready to interact with the real origin server
(Step 5b), the proxy may find that the IP address actually points to a
down server.
6.6. Tracing Messages

Today, it’s not uncommon for web requests to go through a chain of two or more proxies on their way from the
client to the server (Figure 6-19).

6.6.1. The Via Header
The Via header field lists information about each
intermediate node (proxy or gateway) through which a
message passes. Each time a message goes through another
node, the intermediate node must be added to the end of the
Via list.
The Via header field is used to track the forwarding of
messages, diagnose message loops, and identify the protocol
capabilities of all senders along the request/response chain
(Figure 6-20).
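
As a minimal sketch of how a node might append itself to the Via list and use the list to diagnose a loop, consider the following. The function name add_via and the dict-based headers are assumptions for illustration. The sketch also splits the header on commas, which would mishandle a waypoint comment that itself contains a comma.

```python
def add_via(headers, received_by, version="1.1"):
    """Return a copy of headers with this node appended to the Via list.

    Raises ValueError if received_by already appears in the list, which
    suggests the message has looped back to this node.
    """
    existing = headers.get("Via", "")
    waypoints = [w.strip() for w in existing.split(",") if w.strip()]

    # Loop check: the second field of each waypoint is the node name.
    for waypoint in waypoints:
        fields = waypoint.split()
        if len(fields) >= 2 and fields[1].lower() == received_by.lower():
            raise ValueError("message loop detected at " + received_by)

    # The new waypoint always goes at the end of the list.
    waypoints.append(version + " " + received_by)
    updated = dict(headers)
    updated["Via"] = ", ".join(waypoints)
    return updated
```

Working on a copy of the headers keeps the received message unchanged, which can be convenient when logging what arrived versus what was forwarded.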
6.6.1.1. Via syntax
• The Via header field contains a comma-separated list of waypoints. Each waypoint represents an
individual proxy server or gateway hop and contains information about the protocol and address of
that intermediate node. Here is an example of a Via header with two waypoints:
Via: 1.1 cache.joes-hardware.com, 1.1 proxy.irenes-isp.net

• The formal syntax for a Via header is shown here:


Via = "Via" ":" 1#( waypoint )
waypoint = ( received-protocol received-by [ comment ] )
received-protocol = [ protocol-name "/" ] protocol-version
received-by = ( host [ ":" port ] ) | pseudonym

• Note that each Via waypoint contains up to four components: an optional protocol name (defaults
to HTTP), a required protocol version, a required node name, and an optional descriptive comment.
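
To make the four components concrete, here is a small parsing sketch that follows the grammar above. The function names and the dict layout of the result are assumptions chosen for the example. Quoted pairs inside comments are not handled, so treat it as an illustration rather than a complete parser.

```python
def split_waypoints(value):
    """Split a Via header value on commas that lie outside comments."""
    items, current, depth = [], [], 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip())
    return [item for item in items if item]


def parse_waypoint(text):
    """Parse one waypoint into its four components."""
    text = text.strip()
    comment = None
    # An optional trailing comment is enclosed in parentheses.
    if text.endswith(")") and "(" in text:
        head, _, rest = text.partition("(")
        comment = rest[:-1].strip()
        text = head.strip()

    fields = text.split()
    if len(fields) != 2:
        raise ValueError("malformed Via waypoint: " + repr(text))
    received_protocol, received_by = fields

    # The protocol name is optional and defaults to HTTP.
    name, slash, version = received_protocol.rpartition("/")
    if not slash:
        name = "HTTP"
    return {
        "protocol": name,
        "version": version,
        "received_by": received_by,
        "comment": comment,
    }


via = "1.1 cache.joes-hardware.com, 1.1 proxy.irenes-isp.net"
waypoints = [parse_waypoint(w) for w in split_waypoints(via)]
```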
6.6.1.2. Via request and response paths
• Both request and response messages pass through proxies,
so both request and response messages have Via headers.
• Because requests and responses usually travel over the
same TCP connection, response messages travel backward
across the same path as the requests. If a request message
goes through proxies A, B, and C, the corresponding
response message travels through proxies C, B, then A. So,
the Via header for responses is almost always the reverse
of the Via header for requests (Figure 6-21).
6.6.1.3. Via and gateways

Some proxies provide gateway functionality to servers that speak non-HTTP protocols. The
Via header records these protocol conversions, so HTTP applications can be aware of
protocol capabilities and conversions along the proxy chain. Figure 6-22 shows an HTTP
client requesting an FTP URI through an HTTP/FTP gateway.
6.6.1.4. The Server and Via headers
• The Server response header field describes the software used by the origin server. Here are a few examples:
o Server: Apache/1.3.14 (Unix) PHP/4.0.4
o Server: Netscape-Enterprise/4.1
o Server: Microsoft-IIS/5.0

• If a response message is being forwarded through a proxy, make sure the proxy does not modify the Server header. The
Server header is meant for the origin server. Instead, the proxy should add a Via entry.

6.6.1.5. Privacy and security implications of Via

• There are some cases when we don’t want exact hostnames in the Via string. In general, unless this behavior is
explicitly enabled, when a proxy server is part of a network firewall it should not forward the names and ports of hosts
behind the firewall, because knowledge of network architecture behind a firewall might be of use to a malicious party.
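
One way a firewall proxy could hide internal names is to substitute a pseudonym, which the Via grammar permits in place of a host and port. The sketch below is only an assumption about how that might look: the function name, the suffix-matching rule, and the example hostnames are all hypothetical, and comments containing commas are not handled.

```python
def mask_internal_hosts(via_value, internal_suffix, pseudonym):
    """Replace hosts ending in internal_suffix with a pseudonym.

    Each waypoint is "<protocol> <received-by> [comment]"; only the
    received-by field is rewritten, and any port is dropped with it.
    """
    masked = []
    for waypoint in via_value.split(","):
        fields = waypoint.split()
        if not fields:
            continue
        if len(fields) >= 2:
            host = fields[1].split(":")[0].lower()
            if host.endswith(internal_suffix.lower()):
                fields[1] = pseudonym
        masked.append(" ".join(fields))
    return ", ".join(masked)
```

For instance, calling it with the value "1.1 cache.corp.internal:3128, 1.1 proxy.irenes-isp.net", the suffix ".corp.internal", and the pseudonym "fw-proxy" leaves the external waypoint intact and rewrites only the internal one.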
6.6.2. The TRACE Method
• Proxy servers can change messages as the messages are forwarded.
• HTTP/1.1’s TRACE method lets you trace a request message through a chain of proxies, observing what proxies the
message passes through and how each proxy modifies the request message. TRACE is very useful for debugging proxy
flows.

6.6.2.1. Max-Forwards
