Yahoo’s Web Performance Guru: 14 Tenets for Speeding Up Sites

In seven years at Yahoo, Steve Souders had focused most of his effort on back-end engineering tasks like squeezing more performance out of a database, or optimizing the memory usage in C++ programs running on a server. Three years ago, when he was named Chief Performance Yahoo and charged with improving the user experience for visitors to Yahoo Web sites, he expected it would mean doing more of the same.

But it has not worked out that way. In the course of investigating the elements of user experience, Souders found that the biggest impact came from the communications between the Web site and the user’s browser. His findings form the core of the lessons he imparts in a new book, "High Performance Web Sites" (O’Reilly, Sept. 2007).

Souders changed his mind about what matters in Web site performance after using packet sniffer software to measure each step in the process of a new visitor downloading a home page, with the data charted over time. From the start of a request for a new page, he found, only 5% of the time was spent downloading HTML, the HyperText Markup Language document containing the text of the page. The other 95% was spent downloading and parsing images, stylesheets and JavaScript files.

And at a site like Yahoo, back-end engineering is primarily about how quickly the Web site can assemble HTML documents: personalizing pages, retrieving information from databases, merging news feeds, and feeding the results into a Web page template.

This brought Souders to the sobering conclusion that he had spent most of his career at Yahoo worrying about 5% of the performance problem. "It turns out that 80% to 90% of end user response time is spent on the front end," he says. "So the greatest potential for improvement is on the front end."

With this understanding, he and a small team of engineers set themselves the task of figuring out what they could do to improve the speed at which Yahoo delivers pages.

There’s nothing new about recognizing the importance of response time to Web development, since users who are forced to wait more than a few seconds are likely to become frustrated and depart for some other, faster site. The traditional advice includes things like limiting the number and file size of images on a page to make it load more quickly.

But Souders’ performance team found some more subtle strategies, which they have codified into 14 rules (see list, below).

Yahoo’s 14 Rules for Exceptional Performance

  1. Make Fewer HTTP Requests
  2. Use a Content Delivery Network
  3. Add an Expires Header
  4. Gzip Components
  5. Put Stylesheets at the Top
  6. Put Scripts at the Bottom
  7. Avoid CSS Expressions
  8. Make JavaScript and CSS External
  9. Reduce DNS Lookups
  10. Minify JavaScript
  11. Avoid Redirects
  12. Remove Duplicate Scripts
  13. Configure ETags
  14. Make Ajax Cacheable

Most of the rules revolve around making sure the browser doesn’t have to work too hard to load and display your Web pages, particularly by making maximum use of the browser cache. The cache is where a browser stores copies of the components of Web pages it has already downloaded.

For example, if a visitor has been spending time at yahoo.com, his cache will contain a copy of the Yahoo logo. If he leaves the Web site and comes back, his browser can display the cached copy of the logo very quickly, avoiding the need to request a new copy and get it back from the Web server.
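The logo example can be sketched as a toy model of a browser cache. This is only an illustration of the idea, not how any real browser is implemented: the `BrowserCache` class, the `fake_server` function and the example.com URL are all hypothetical names invented for this sketch, and expiry and revalidation are ignored.

```python
from typing import Callable, Dict


class BrowserCache:
    """Toy model of a browser cache keyed by URL (illustrative only)."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch            # stands in for a real network request
        self._store: Dict[str, bytes] = {}
        self.network_requests = 0      # how many times the "server" was contacted

    def get(self, url: str) -> bytes:
        # Serve from the cache when a copy is already stored.
        if url in self._store:
            return self._store[url]
        # Otherwise go to the network and remember the response.
        self.network_requests += 1
        body = self._fetch(url)
        self._store[url] = body
        return body


def fake_server(url: str) -> bytes:
    # Hypothetical stand-in for a Web server response.
    return ("contents of " + url).encode("utf-8")


cache = BrowserCache(fake_server)
logo_url = "http://example.com/logo.png"   # placeholder URL, not Yahoo's
first = cache.get(logo_url)     # first visit: goes to the network
second = cache.get(logo_url)    # return visit: served from the cache
print(cache.network_requests)   # only the first visit reached the server
```

Both calls return the same bytes, but only the first one costs a trip to the server.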

The No. 1 rule on Yahoo’s list is to minimize HTTP requests. Every time a browser needs to request another image, or another JavaScript file to be included in the page, it must send a message to the server using the Web’s Hypertext Transfer Protocol and wait for a response. Every one of those requests takes time and runs some risk that the connection will be interrupted and the download will have to start over.
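A rough back-of-the-envelope model shows why the number of requests matters even when the total amount of data stays the same. The function below is a deliberately simplified sketch: the name `estimated_load_time`, the limit of two parallel connections and all of the numbers are assumptions made for illustration, not measurements from Yahoo, and the model ignores connection setup, retries and server processing time.

```python
import math


def estimated_load_time(num_requests: int, total_bytes: int,
                        round_trip_s: float, bytes_per_s: float,
                        parallel_connections: int = 2) -> float:
    """Rough page load estimate in seconds (toy model, not a measurement)."""
    # Requests go out in batches limited by the number of parallel connections;
    # each batch is charged one round trip of waiting.
    batches = math.ceil(num_requests / parallel_connections)
    waiting = batches * round_trip_s
    # Transfer time depends only on the total payload, not on how it is split.
    transfer = total_bytes / bytes_per_s
    return waiting + transfer


# Illustrative numbers only: the same 300 KB payload, split two different ways.
many_files = estimated_load_time(40, 300_000, round_trip_s=0.1, bytes_per_s=500_000)
few_files = estimated_load_time(10, 300_000, round_trip_s=0.1, bytes_per_s=500_000)
print(f"40 requests: {many_files:.1f}s, 10 requests: {few_files:.1f}s")
```

With these made-up inputs the model gives about 2.6 seconds for 40 requests and about 1.1 seconds for 10, even though the same number of bytes crosses the wire in both cases.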

So it would be simpler in some ways if every Web page were delivered as one big file, rather than dozens of smaller files. On the other hand, if you can identify the files included in a Web page that change only infrequently and get the browser to cache them, you can cut down on the total amount of data transmitted.

For that reason, Yahoo advises that Cascading Style Sheets (CSS) and JavaScript code be broken out into separate files rather than embedded into the HTML of a Web page. That way, if the same CSS or JavaScript is going to be used by multiple pages within a Web site, the browser can download it once and cache it.

However, Yahoo breaks this rule on its own home page and the opening pages of many sections of its Web sites because pulling in all the needed HTML, CSS, and JavaScript code with one HTTP transaction still tends to make a better first impression.
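The trade-off between external and inline CSS and JavaScript can be put into numbers with a small sketch. Everything here is assumed for illustration: the `session_cost` function, the file sizes and the count of two external files are hypothetical, and the model assumes that external files stay cached for the rest of the visit with no revalidation requests.

```python
from typing import NamedTuple


class Cost(NamedTuple):
    requests: int
    bytes_sent: int


def session_cost(page_views: int, html_bytes: int, shared_bytes: int,
                 external_files: int) -> Cost:
    """Cost of a visit of one or more page views (toy model).

    external_files=0 means the CSS and JavaScript are inlined in the HTML.
    """
    if external_files == 0:
        # Inline: every page carries its own copy of the CSS and JavaScript.
        return Cost(page_views, page_views * (html_bytes + shared_bytes))
    # External: shared files are fetched on the first view, then assumed cached.
    return Cost(page_views + external_files,
                page_views * html_bytes + shared_bytes)


# Made-up sizes: 20 KB of HTML per page, 60 KB of shared CSS and JavaScript.
single_inline = session_cost(1, 20_000, 60_000, external_files=0)
single_external = session_cost(1, 20_000, 60_000, external_files=2)
five_inline = session_cost(5, 20_000, 60_000, external_files=0)
five_external = session_cost(5, 20_000, 60_000, external_files=2)
print(single_inline, single_external)
print(five_inline, five_external)
```

Under these assumptions a single page view moves the same 80,000 bytes either way, but the inline version needs one request instead of three, which matches the reasoning for a home page. Over five page views the external version needs two extra requests but transfers 160,000 bytes instead of 400,000.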
