High-Performance Web Sites

Uploaded by

Sunil Tadvi
Copyright
© Attribution Non-Commercial (BY-NC)

practice

doi:10.1145/1409360.1409374
36 Communications of the ACM | December 2008 | Vol. 51 | No. 12

High-Performance Web Sites

Want to make your Web site fly? Focus on frontend performance.

by Steve Souders

Google Maps, Yahoo! Mail, Facebook, MySpace, YouTube, and Amazon are examples of Web sites built to scale. They access petabytes of data, sending terabits per second to millions of users worldwide. The magnitude is awe-inspiring.

Users view these large-scale Web sites from a narrower perspective. The typical user has megabytes of data that they download at a few hundred kilobits per second. Users are less interested in the massive number of requests per second being served, caring more about their individual requests. As they use these Web applications they inevitably ask the same question: “Why is this site so slow?”

The answer hinges on where development teams focus their performance improvements. Performance for the sake of scalability is rightly focused on the backend. Database tuning, replicating architectures, customized data caching, and so on, allow Web servers to handle a greater number of requests. This gain in efficiency translates into reductions in hardware costs, data center rack space, and power consumption. But how much does the backend affect the user experience in terms of latency?

The Web applications listed here are some of the most highly tuned in the world, and yet they still take longer to load than we’d like. It almost seems as if the high-speed storage and optimized application code on the backend have little impact on the end user’s response time. Therefore, to account for these slowly loading pages we must focus on something other than the backend: we must focus on the frontend.

The Importance of Frontend Performance

Figure 1 illustrates the HTTP traffic sent when your browser visits iGoogle with an empty cache. Each HTTP request is represented by a horizontal bar whose position and size are based on when the request began and how long it took. The first HTTP request is for the HTML document (http://www.google.com/ig). As noted in Figure 1, the request for the HTML document took only 9% of the overall page load time. This includes the time for the request to be sent from the browser to the server, for the server to gather all the necessary information on the backend and stitch that together as HTML, and for that HTML to be sent back to the browser.

The other 91% is spent on the frontend, which includes everything that the HTML document commands the browser to do. A large part of this is fetching resources. For this iGoogle page there are 22 additional HTTP requests: two scripts, one stylesheet, one iframe, and 18 images. Gaps in the HTTP profile (places with no network traffic) are where the browser is parsing CSS, and parsing and executing JavaScript.

The primed cache situation for iGoogle is shown in Figure 2. Here there are only two HTTP requests: one for the HTML document and one for a dynamic script. The gap is even larger because it includes the time to read the cached resources from disk. Even in the primed cache situation, the HTML document accounts for only 17% of the



overall page load time.

Figure 1. iGoogle HTTP traffic with an empty cache. [Chart not reproduced: HTML document 9%, frontend 91% of page load time.]

Figure 2. iGoogle HTTP traffic with a primed cache. [Chart not reproduced: HTML document 17%, frontend 83% of page load time.]

This situation, in which a large percentage of page load time is spent on the frontend, applies to most Web sites. Table 1 shows that eight of the top 10 Web sites in the U.S. (as listed on Alexa.com) spend less than 20% of the end user’s response time fetching the HTML document from the backend. The two exceptions are Google Search and Live Search, which are highly tuned. These two sites download four or fewer resources in the empty cache situation, and only one request with a primed cache.

Table 1. Percentage of time spent on the backend.

Web Site                                        Empty Cache   Primed Cache
http://www.aol.com/                             3%            3%
http://www.ebay.com/                            5%            19%
http://www.facebook.com/                        5%            19%
http://www.google.com/search?q=flowers          53%           100%
http://search.live.com/results.aspx?q=flowers   33%           100%
http://www.msn.com/                             2%            6%
http://www.myspace.com/                         2%            2%
http://en.wikipedia.org/wiki/Flowers            6%            9%
http://www.yahoo.com/                           3%            4%
http://www.youtube.com/                         2%            3%

The time spent generating the HTML document affects overall latency, but for most Web sites this backend time is dwarfed by the amount of time spent on the frontend. If the goal is to make the user experience faster, the place to focus is on the frontend. Given this new focus, the next step is to identify best practices for improving frontend performance.

Illustration by Nik Schulz

Frontend Performance Best Practices

Through research and consulting with development teams, I’ve developed a set of performance improvements that have been proven to speed up Web pages. A big fan of Harvey Penick’s Little Red Book [1] with advice like “Take Dead Aim,” I set out to capture these best practices in a simple list that is easy to remember. The list has evolved to contain the following 14 prioritized rules:

1. Make fewer HTTP requests
2. Use a content delivery network
3. Add an Expires header
4. Gzip components
5. Put stylesheets at the top
6. Put scripts at the bottom
7. Avoid CSS expressions
8. Make JavaScript and CSS external
9. Reduce DNS lookups
10. Minify JavaScript
11. Avoid redirects
12. Remove duplicate scripts
13. Configure ETags
14. Make Ajax cacheable

A detailed explanation of each rule is the basis of my book, High Performance Web Sites [2]. What follows is a brief summary of each rule.

Rule 1: Make Fewer HTTP Requests

As the number of resources in the page grows, so does the overall page load time. This is exacerbated by the fact that most browsers only download two resources at a time from a given hostname, as suggested in the HTTP/1.1 specification (http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.1.4).[a] Several techniques exist for reducing the number of HTTP requests without reducing page content:

• Combine multiple scripts into a single script.
• Combine multiple stylesheets into a single stylesheet.
• Combine multiple CSS background images into a single image called a CSS sprite (see http://alistapart.com/articles/sprites).

[a] Newer browsers open more than two connections per hostname, including Internet Explorer 8 (six), Firefox 3 (six), Safari 3 (four), and Opera 9 (four).

Rule 2: Use a Content Delivery Network

A content delivery network (CDN) is a collection of distributed Web servers used to deliver content to users more efficiently. Examples include Akamai Technologies, Limelight Networks, SAVVIS, and Panther Express. The main performance advantage provided by a CDN is delivering static resources from a server that is geographically closer to the end user. Other benefits include backups, caching, and the ability to better absorb traffic spikes.

Rule 3: Add an Expires Header

When a user visits a Web page, the browser downloads and caches the page’s resources. The next time the user visits the page, the browser checks to see if any of the resources can be served from its cache, avoiding time-consuming HTTP requests. The browser bases its decision on the resource’s expiration date. If there is an expiration date, and that date is in the future, then the resource is read from disk. If there is no expiration date, or that date is in the past, the browser issues a costly HTTP request. Web developers can attain this performance gain by specifying an explicit expiration date in the future. This is done with the Expires HTTP response header, such as the following:

    Expires: Thu, 01 Jan 2015 20:00:00 GMT

Rule 4: Gzip Components

The amount of data transferred over the network affects response times, especially for users with slow network connections. For decades developers have used compression to reduce the size of files. This same technique can be used for reducing the size of data sent over the Internet. Many Web servers and Web hosting services enable compression of HTML documents by default, but compression shouldn’t stop there. Developers should also compress other types of text responses, such as scripts, stylesheets, XML, and JSON. Gzip is the most popular compression technique. It typically reduces data sizes by 70%.

Rule 5: Put Stylesheets at the Top

Stylesheets inform the browser how to format elements in the page. If stylesheets are included lower in the page, the question arises: What should the browser do with elements that it can render before the stylesheet has been downloaded?

One answer, used by Internet Explorer, is to delay rendering elements in the page until all stylesheets are downloaded. But this causes the page to appear blank for a longer period of time, giving users the impression that the page is slow. Another answer, used by Firefox, is to render page elements and redraw them later if the stylesheet changes the initial formatting. This causes elements in the page to “flash” when they’re redrawn, which is disruptive to the user. The best answer is to avoid including stylesheets lower in the page, and instead load them in the HEAD of the document.

Rule 6: Put Scripts at the Bottom

External scripts (typically, “.js” files) have a bigger impact on performance than other resources for two reasons. First, once a browser starts downloading a script it won’t start any other parallel downloads. Second, the browser won’t render any elements below a script until the script has finished downloading. Both of these impacts are felt when scripts are placed near the top of the page, such as in the HEAD section. Other resources in the page (such as images) are delayed from being downloaded and elements in the page that already exist (such as the HTML text in the document itself) aren’t displayed until the earlier scripts are done. Moving scripts lower in the page avoids these problems.

Rule 7: Avoid CSS Expressions

CSS expressions are a way to set CSS properties dynamically in Internet Explorer. They enable setting a style’s property based on the result of executing JavaScript code embedded within the style declaration. The issue with CSS expressions is that they are evaluated more frequently than one might expect, potentially thousands of times during a single page load. If the JavaScript code is inefficient it can cause the page to load more slowly.

Rule 8: Make JavaScript and CSS External

JavaScript can be added to a page as an inline script:

    <script type="text/javascript">
    var foo="bar";
    </script>

or as an external script:

    <script src="foo.js" type="text/javascript"></script>

Similarly, CSS is included as either an inline style block or an external stylesheet. Which is better from a performance perspective?

HTML documents typically are not cached because their content is constantly changing. JavaScript and CSS are less dynamic, often not changing for weeks or months. Inlining JavaScript and CSS results in the same bytes (that haven’t changed) being downloaded on every page view. This has a negative impact on response times and increases the bandwidth used from your data center. For most Web sites, it’s better to serve JavaScript and CSS via external files, while making them cacheable with a far future Expires header as explained in Rule 3.

Rule 9: Reduce DNS Lookups

The Domain Name System (DNS) is like a phone book: it maps a hostname to an IP address. Hostnames are easier for humans to understand, but the IP address is what browsers need to establish a connection to the Web server. Every hostname that’s used in a Web page must be resolved using DNS. These DNS lookups carry a cost; they can take 20–100 milliseconds each. Therefore, it’s best to reduce the number of unique hostnames used in a Web page.

Rule 10: Minify JavaScript

As described in Rule 4, compression is the best way to reduce the size of text files transferred over the Internet. The size of JavaScript can be further reduced by minifying the code. Minification is the process of stripping unneeded characters (comments, tabs, new lines, extra white space, and so on) from the code. Minification typically reduces the size of JavaScript by 20%. External scripts should be minified, but inline scripts also benefit from this size reduction.

Rule 11: Avoid Redirects

Redirects are used to map users from one URL to another. They’re easy to implement and useful when the true URL is too long or complicated for users to remember, or if a URL has changed. The downside is that redirects insert an extra HTTP roundtrip between the user and her content. In many cases, redirects can be avoided with some additional work. If a redirect is truly necessary, make sure to issue it with a far future Expires header (see Rule 3), so that on future visits the user can avoid this delay.[b]

[b] Caching redirects is not supported in some browsers.

Rule 12: Remove Duplicate Scripts

If an external script is included multiple times in a page, the browser has to parse and execute the same code multiple times. In some cases the browser will request the file multiple times. This is inefficient and causes the page to load more slowly. This obvious mistake would seem uncommon, but in a review of U.S. Web sites it could be found in two of the top 10 sites. Web sites that have a large number of scripts and a large number of developers are most likely to suffer from this problem.

Rule 13: Configure ETags

Entity tags (ETags) are a mechanism used by Web clients and servers to verify that a cached resource is valid. In other words, does the resource (image, script, stylesheet, among others) in the browser’s cache match the one on the server? If so, rather than transmitting the entire file (again), the server simply returns a 304 Not Modified status telling the browser to use its locally cached copy. In HTTP/1.0, validity checks were based on a resource’s Last-Modified date: if the date of the cached file matched the file on the server, then the validation succeeded. ETags were introduced in HTTP/1.1 to allow for validation schemes based on other information, such as version number and checksum.

ETags don’t come without a cost. They add extra headers to HTTP requests and responses. The default ETag syntax used in Apache and IIS makes it likely that the validation will erroneously fail if the Web site is hosted on multiple servers. These costs impact performance, making pages slower and increasing the load on Web servers. This is an unnecessary loss of performance, because most Web sites don’t take advantage of the advanced features of ETags, relying instead on the Last-Modified date as the means of validation. By default, ETags are enabled in popular Web servers (including Apache and IIS). If your Web site doesn’t utilize ETags, it’s best to turn them off in your Web server. In Apache, this is done by simply adding “FileETag none” to your configuration file.

Rule 14: Make Ajax Cacheable

Many popular Web sites are moving to Web 2.0 and have begun incorporating Ajax. Ajax requests involve fetching data that is often dynamic, personalized, or both. In the Web 1.0 world, this data is served by the user going to a specified URL and getting back an HTML document. Because the HTML document’s URL is fixed (bookmarked, linked to, and so on), it’s necessary to ensure the response is not cached by the browser.

This is not the case for Ajax responses. The URL of the Ajax request is included inside the HTML document; it’s not bookmarked or linked to. Developers have the freedom to change the Ajax request’s URL when they generate the page. This allows developers to make Ajax responses cacheable. If an updated version of the Ajax data is available, the cached version is avoided by adding a dynamic variable to the Ajax URL. For example, an Ajax request for the user’s address book could include the time it was last edited as a parameter in the URL, “&edit=1218050433.” As long as the user hasn’t edited their address book, the previously cached Ajax response can continue to be used, making for a faster page.

Performance Analysis with YSlow

Evangelizing these performance best practices is a challenge. I was able to share this information through conferences, training classes, consulting, and documentation. Even with the knowledge in hand, it would still take hours of loading pages in a packet sniffer and reading HTML to identify the appropriate set of performance improvements. A better alternative would be to codify this expertise in a tool that anyone could run, reducing the learning curve and increasing adoption of these performance best practices. This was the inspiration for YSlow.

YSlow (http://developer.yahoo.com/yslow/) is a performance analysis tool that answers the question posed in the introduction: “Why is this site so slow?” I created YSlow so that any Web developer could quickly and easily apply the performance rules to their site, and find out specifically what needed to be improved. It runs inside Firefox as an extension to Firebug (http://getfirebug.com/), the tool of choice for many Web developers.

The screenshot in Figure 3 shows Firefox with iGoogle loaded. Firebug is open in the lower portion of the window, with tabs for Console, HTML, CSS, Script, DOM, and Net. When YSlow is installed, the YSlow tab is added. Clicking YSlow’s Performance button initiates an analysis of the page against the set of rules, resulting in a weighted score for the page.

As shown in Figure 3, YSlow explains each rule’s results with details about what to fix. Each rule in the YSlow screen is a link to the companion Web site, where additional information about the rule is available.

Figure 3. YSlow. [Screenshot not reproduced.]

The Next Performance Challenge: JavaScript

Web 2.0 promises a future where developers can build Web applications that provide an experience similar to desktop apps. Web 2.0 apps are built using JavaScript, which presents significant performance challenges because JavaScript blocks downloads and rendering in the browser. To build faster Web 2.0 apps, developers should address these performance issues using the following guidelines:

• Split the initial payload
• Load scripts without blocking
• Don’t scatter inline scripts

Table 2. Percentage of unused JavaScript functions.

Web Site                                        JavaScript Size   Unused Functions
http://www.aol.com/                             115K              70%
http://www.ebay.com/                            183K              56%
http://www.facebook.com/                        1088K             81%
http://www.google.com/search?q=flowers          15K               55%
http://search.live.com/results.aspx?q=flowers   17K               76%
http://www.msn.com/                             131K              69%
http://www.myspace.com/                         297K              82%
http://en.wikipedia.org/wiki/Flowers            114K              68%
http://www.yahoo.com/                           321K              87%
http://www.youtube.com/                         240K              82%
Average                                         252K              74%

Split the Initial Payload

Web 2.0 apps involve just a single


page load. Instead of loading more pages for each action or piece of information requested by the user, as was done in Web 1.0, Web 2.0 apps use Ajax to make HTTP requests behind the scenes and update the user interface appropriately. This means that some of the JavaScript that is downloaded is not used immediately, but instead is there to provide functionality that the user might need in the future. The problem is that this subset of JavaScript blocks other content that is used immediately, delaying immediate content for the sake of future functionality that may never be used.

Table 2 shows that for the top 10 U.S. Web sites, an average of 74% of the functionality downloaded is not used immediately. To take advantage of this opportunity, Web developers should split their JavaScript payload into two scripts: the code that’s used immediately (~26%) and the code for additional functionality (~74%). The first script should be downloaded just as it is today, but given its reduced size the initial page will load more quickly. The second script should be lazy-loaded, which means that after the initial page is completely rendered this second script is downloaded dynamically, using one of the techniques listed in the next section.

Load Scripts without Blocking

As described in “Rule 6: Put Scripts at the Bottom,” external scripts block the download and rendering of other content in the page. This is true when the script is loaded in the typical way:

    <script src="foo.js" type="text/javascript"></script>

But there are several techniques for downloading scripts that avoid this blocking behavior:

• XHR eval
• XHR injection
• script in iframe
• script DOM element
• script defer
• document.write script tag

You can see these techniques illustrated in Cuzillion (http://stevesouders.com/cuzillion/), but as an example let’s look at the script DOM element approach:

    <script type="text/javascript">
    // Create the script element dynamically instead of using a static script tag.
    var se = document.createElement('script');
    se.src = 'http://anydomain.com/foo.js';
    // Appending the element to the HEAD triggers the download.
    document.getElementsByTagName('head')[0].appendChild(se);
    </script>

A new DOM element is created that is a script. The src attribute is set to the URL of the script. Appending it to the head of the document causes the script to be downloaded, parsed, and executed. When scripts are loaded this way, they don’t block the downloading and rendering of other content in the page.

Don’t Scatter Inline Scripts

These first two best practices about JavaScript performance have to do with external scripts. Inline scripts also impact performance, occasionally in significant and unexpected ways. The most important guideline with regard to inline scripts is to avoid a stylesheet followed by an inline script.

Figure 4 shows some of the HTTP traffic for http://www.msn.com/. We see that four stylesheet requests are downloaded in parallel, then there is a white gap, after which four images are downloaded, also in parallel with each other. But why aren’t all eight downloaded in parallel?[c]

Figure 4. Stylesheet followed by inline script in www.msn.com/. [Chart not reproduced: four stylesheet requests in parallel, a gap, then four image requests in parallel.]

This page contains an inline script after the fourth stylesheet. Moving this inline script to either above the stylesheets or after the images would result in all eight requests taking place in parallel, cutting the overall download time in half. Instead, the images are blocked from downloading until the inline script is executed, and the inline script is blocked from executing until the stylesheets finish downloading. This seems like it would be a rare problem, but it afflicts four of the top 10 sites in the U.S.: eBay, MSN, MySpace, and Wikipedia.

[c] Note that these requests are made on different hostnames and thus are not constrained by the two-connections-per-server restriction of some browsers, as described in “Rule 1: Make Fewer HTTP Requests.”

Life’s Too Short, Write Fast Code

At this point, I hope you’re hooked on building high-performance Web sites. I’ve explained why fast sites are important, where to focus your performance efforts, specific best practices to follow for making your site faster, and a tool you can use to find out what to fix. But what happens tomorrow, when you’re back at work facing a long task list and being pushed to add more features instead of improving performance? It’s important to take a step back and see how performance fits into the bigger picture.

Speed is a factor that can be used for competitive advantage. Better features and a more appealing user interface are also distinguishing factors. It doesn’t have to be one or the other. The point of sharing these performance best practices is so we can all build Web sites to be as fast as they can possibly be, whether they’re barebones or feature rich.

I tell developers “Life’s too short, write fast code!” This can be interpreted two ways. Writing code that executes quickly saves time for our users. For large-scale Web sites, the savings add up to lifetimes of user activity. The other interpretation appeals to the sense of pride we have in our work. Fast code is a badge of honor for developers.

Performance must be a consideration intrinsic to Web development. The performance best practices described here are proven to work. If you want to make your Web site faster, focus on the frontend, run YSlow, and apply these rules. Who knows, fast might become your site’s most popular feature.

References
1. Penick, H. Harvey Penick’s Little Red Book: Lessons and Teachings From A Lifetime In Golf. Simon and Schuster, 1992.
2. Souders, S. High Performance Web Sites: Essential Knowledge for Front-End Engineers. O’Reilly, 2006.

Steve Souders (http://stevesouders.com) works at Google on Web performance and open source initiatives. He is the author of High Performance Web Sites and the creator of YSlow, Cuzillion, and Hammerhead. He teaches at Stanford and is the co-founder of the Firebug Working Group.

© 2008 ACM 0001-0782/08/1200 $5.00

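
The validation step described in Rule 13 (Configure ETags) can be sketched as a small function that decides between 304 Not Modified and a full 200 response. This is a simplified model under stated assumptions: the `validate` function, the shape of the `resource` object, and the tag values are all made up for illustration, and real servers handle more cases (weak tags, multiple tags, and so on).

```javascript
// Hypothetical sketch of cache validation; names and values are illustrative.
// `resource` is { body, etag, lastModified } as one server sees the file;
// `requestHeaders` carries the validators the browser kept in its cache.
function validate(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders['if-none-match'];
  const ifModifiedSince = requestHeaders['if-modified-since'];

  let stillValid = false;
  if (ifNoneMatch !== undefined) {
    // HTTP/1.1 style: compare entity tags.
    stillValid = ifNoneMatch === resource.etag;
  } else if (ifModifiedSince !== undefined) {
    // HTTP/1.0 style: compare Last-Modified dates.
    stillValid = Date.parse(ifModifiedSince) >= Date.parse(resource.lastModified);
  }
  return stillValid
    ? { status: 304, body: '' }              // browser reuses its cached copy
    : { status: 200, body: resource.body };  // full file is sent again
}

// The same file on two servers, with made-up tags that differ per server.
const onServerA = { body: 'var foo="bar";', etag: '"a-1"', lastModified: 'Thu, 01 Jan 2009 00:00:00 GMT' };
const onServerB = { body: 'var foo="bar";', etag: '"b-1"', lastModified: 'Thu, 01 Jan 2009 00:00:00 GMT' };

console.log(validate({ 'if-none-match': '"a-1"' }, onServerA).status);
console.log(validate({ 'if-none-match': '"a-1"' }, onServerB).status);
console.log(validate({ 'if-modified-since': 'Thu, 01 Jan 2009 00:00:00 GMT' }, onServerB).status);
```

The second call models the multi-server problem the rule warns about: the file is unchanged, but a tag issued by one server does not match the other server's tag, so the whole file is transmitted again. The third call shows Last-Modified validation succeeding for the same file.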

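
The URL scheme described in Rule 14 (Make Ajax Cacheable) might look like the following when the page is generated. The helper name `addressBookUrl`, the `example.com` endpoint, and the `user` parameter are assumptions for illustration; only the `edit` parameter and its sample value come from the article.

```javascript
// Hypothetical helper: the endpoint and function name are illustrative.
// The last-edited timestamp becomes part of the Ajax URL, so the URL
// (and therefore the cache entry) changes only when the data changes.
function addressBookUrl(baseUrl, lastEditedSeconds) {
  const url = new URL(baseUrl);
  url.searchParams.set('edit', String(lastEditedSeconds));
  return url.toString();
}

const unchanged = addressBookUrl('http://example.com/addressbook?user=42', 1218050433);
const afterEdit = addressBookUrl('http://example.com/addressbook?user=42', 1218059999);
console.log(unchanged);
console.log(afterEdit);
console.log(unchanged === afterEdit);
```

While the address book is untouched, every generated page requests the same URL and the browser can answer from its cache. An edit produces a new URL, which bypasses the stale cached response.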