Inside Yahoo's Identity Crisis
Front-Loading Web Performance
In seven years at Yahoo, Steve
Souders had focused most of his
efforts on back-end engineering
tasks—squeezing more performance
out of a database and optimizing
memory usage in C++ programs running
on a server, for instance. So
three years ago,
when he was named
chief performance
yahoo and charged
with improving the
user experience for
visitors to Yahoo Web
sites, he expected it
would mean doing
more of the same.
But it has not
worked out that
way. In the course of investigating the
elements of user experience, Souders
found that the biggest impact came
from the communications between
the Web site and the user's browser.
His findings form the core of the lessons
he imparts in a new book, High
Performance Web Sites (O'Reilly,
September 2007).
Souders changed his mind after
measuring each step in the process of
downloading a home page, with the
data charted over time. From the start
of a new visitor's request for a page,
he found, only five percent of the time
is spent downloading the HTML (the
HyperText Markup Language) document
containing the text of the page.
The other 95 percent of the time is
spent downloading and parsing images,
stylesheets and JavaScript.
And at a site like Yahoo, back-end
engineering is primarily about how
quickly the site can assemble HTML
documents—personalizing pages,
retrieving information from databases,
merging news feeds and feeding the
results into a Web page template.
This brought Souders to the
sobering conclusion that he'd spent
most of his career at Yahoo worrying
about five percent of the performance
problem. "It turns out that 80 percent
to 90 percent of end user response
time is spent on the front end," he
says. "So the greatest potential for
improvement is on the front end."
With this understanding, he and a
small team of engineers set to work
figuring out how to improve the speed
at which Yahoo delivers pages.
There's nothing new in recognizing
the importance of response
time to Web development—it's long
been known that users forced to wait
more than a few seconds are likely
to become frustrated and depart for
some other, faster site. The conventional
advice includes limiting the
number and file size of images on a
page to make it load more quickly.
But Souders' performance team
found some more subtle strategies,
which they have codified into 14
rules (see box below). Most of
the rules revolve around making sure
the browser doesn't have to work too
hard to load and display Web pages,
particularly by making maximum use
of the browser cache. That's where
your browser stores bits and
pieces of Web pages you've viewed.
For example, if a visitor has been
spending time at yahoo.com, the
cache on his or her machine will contain
a copy of the Yahoo logo. If the
visitor leaves the Web site and comes
back, the browser can display the
cached copy of the logo very quickly,
avoiding the need to retrieve a new
copy from the Web server.
The No. 1 rule on Yahoo's list is to
minimize HTTP requests. Every time
a browser needs to request another
image or another JavaScript file to be
included in the page, it must send a
message to the server using the Web's
Hypertext Transfer Protocol (HTTP)
and wait for a response. Every request
takes time and runs some risk that the
connection will be interrupted and the
download will have to start over.
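As a rough illustration of why the request count matters, the sketch below models page load as a fixed overhead per request plus the time to transfer the bytes. The round-trip and bandwidth figures are invented for the example, not Yahoo measurements, and real browsers fetch several files in parallel, so the gap it shows is an upper bound rather than a prediction.

```python
def load_time_seconds(file_sizes_kb, round_trip_s=0.1, bandwidth_kb_per_s=500.0):
    """Estimate load time when files are fetched one after another.

    round_trip_s and bandwidth_kb_per_s are assumed values for illustration.
    """
    overhead = len(file_sizes_kb) * round_trip_s       # one round trip per HTTP request
    transfer = sum(file_sizes_kb) / bandwidth_kb_per_s  # time to move the bytes
    return overhead + transfer


# Hypothetical page: 20 small components of 5 KB each.
separate = load_time_seconds([5] * 20)
# The same 100 KB delivered as one combined file.
combined = load_time_seconds([100])

print(f"20 requests: {separate:.1f}s, 1 request: {combined:.1f}s")
```

The bytes transferred are identical in both cases; only the per-request overhead differs.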
So it would be simpler in some
ways if every Web page were delivered
as one big file rather than dozens
of smaller files. On the other hand, if
you can identify the files included in
a Web page that only change infrequently
and get the browser to cache
them, you can cut down on the total
amount of data transmitted.
For that reason, Yahoo advises that
you break out Cascading Style Sheets
(CSS) and JavaScript code into separate
files rather than embedding them
into the HTML of a Web page. This
way, if the same CSS or JavaScript is
going to be used by multiple pages
within a Web site, the browser can
download it once and cache it.
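The trade-off can be sketched with a toy model of a browser cache. The file sizes and the page count here are assumptions chosen for the example, and the cache is simplified to a set that never expires.

```python
def kilobytes_transferred(pages, shared_kb, html_kb, external):
    """Total KB downloaded over `pages` page views of one site.

    shared_kb: CSS/JavaScript used by every page (assumed size).
    html_kb:   each page's own HTML (assumed size).
    external:  True if the CSS/JavaScript live in separate, cacheable files.
    """
    cache = set()
    total = 0
    for _ in range(pages):
        total += html_kb                 # the HTML is fetched on every view
        if external:
            if "shared-assets" not in cache:
                total += shared_kb       # downloaded on the first view only
                cache.add("shared-assets")
        else:
            total += shared_kb           # inlined code rides along with every page
    return total


inline_total = kilobytes_transferred(pages=10, shared_kb=60, html_kb=20, external=False)
external_total = kilobytes_transferred(pages=10, shared_kb=60, html_kb=20, external=True)
print(inline_total, external_total)
```

With a single page view the two approaches move the same number of bytes, so the external version only pays off once the visitor loads more pages.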
However, Yahoo breaks this rule
on its own home page and the
opening pages of many sections of its
sites because pulling in all the needed
HTML, CSS and JavaScript code with
one HTTP transaction still tends to
make a better first impression.
Controlling Cache
Another trick is to configure your
Web server to transmit image, CSS
and JavaScript files with an expiration
header set far into the future. This
tells the browser to retain the cached
files indefinitely instead of just for a
few hours or days. You're effectively
telling the browser these files will
never change and it doesn't have to
keep checking for fresh versions.
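How the header is set depends on the Web server, which the article does not specify. As a server-neutral sketch, the Python below builds the header values a server might attach to a static file. The ten-year lifetime and the function name are assumptions for illustration, and the Cache-Control max-age line is the HTTP/1.1 counterpart to Expires, added here rather than taken from Yahoo's rule.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def far_future_headers(now=None, years=10):
    """Return response headers asking browsers to cache a file for `years` years.

    `years` is an assumed lifetime; a year is approximated as 365 days.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    lifetime = timedelta(days=365 * years)
    return {
        # HTTP dates are written in GMT, e.g. "... 2017 00:00:00 GMT"
        "Expires": format_datetime(now + lifetime, usegmt=True),
        # Same lifetime expressed in seconds for HTTP/1.1 caches
        "Cache-Control": f"max-age={int(lifetime.total_seconds())}",
    }


headers = far_future_headers(now=datetime(2007, 9, 1, tzinfo=timezone.utc))
for name, value in headers.items():
    print(f"{name}: {value}")
```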
Eventually, you'll probably change
your logo, CSS font stylings and
JavaScript. But because the browser
identifies these components by Web
address, you can force it to load a
new version by simply changing the
filename or directory.
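One common way to change the filename automatically is to embed a short fingerprint of the file's contents in it, so the Web address changes whenever the file does. This is a sketch of that general technique, not necessarily what Yahoo does; the function name and the sample paths are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename, content):
    """Insert the first 8 hex digits of the content's SHA-256 hash before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


# Editing the stylesheet produces a different address, so browsers holding
# the old copy will request the new file instead of reusing the cache.
old_url = versioned_name("css/site.css", b"body { font-family: Arial; }")
new_url = versioned_name("css/site.css", b"body { font-family: Verdana; }")
print(old_url)
print(new_url)
```

Pages that reference the file must be updated to use the new name, which is usually handled by the same build step that generates it.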
By applying a few of these simple
rules, Souders' team was able to
make a big impact on a key section
of the Yahoo site—its search results
page. "Within a year, we were able to
improve response time by 40 to 50
percent," Souders says.
(To grade your Web site against
Yahoo's rules, download YSlow, a
Yahoo-developed open source add-on
to Firebug, itself a Firefox browser
extension. Together, YSlow and
Firebug provide a variety of tools for
profiling and debugging Web sites and
applications.) —D.F.C.
Yahoo's 14 Rules for Exceptional Performance
1 Make Fewer HTTP Requests
2 Use a Content Delivery Network
3 Add an Expires Header
4 Gzip (Compress) Components
5 Put Stylesheets at the Top
6 Put Scripts at the Bottom
7 Avoid CSS Expressions
8 Make JavaScript and CSS External
9 Reduce DNS Lookups
10 Minify JavaScript
11 Avoid Redirects
12 Remove Duplicate Scripts
13 Configure ETags
14 Make Ajax Cacheable
(Details at developer.yahoo.com/performance/)