Front-Loading Web Performance

By David F. Carr  |  Posted 2007-10-26


In seven years at Yahoo, Steve Souders had focused most of his efforts on back-end engineering tasks: squeezing more performance out of a database and optimizing memory usage in C++ programs running on a server, for instance. So three years ago, when he was named Chief Performance Yahoo and charged with improving the user experience for visitors to Yahoo Web sites, he expected it would mean doing more of the same.

But it has not worked out that way. In the course of investigating the elements of user experience, Souders found that the biggest impact came from the communications between the Web site and the user's browser. His findings form the core of the lessons he imparts in a new book, High Performance Web Sites (O'Reilly, September 2007).

Souders changed his mind after measuring each step in the process of downloading a home page, with the data charted over time. From the start of a new visitor's request for a page, he found, only five percent of the time is spent downloading the HTML (HyperText Markup Language) document containing the text of the page. The other 95 percent is spent downloading and parsing images, stylesheets and JavaScript.

And at a site like Yahoo, back-end engineering is primarily about how quickly the site can assemble HTML documents—personalizing pages, retrieving information from databases, merging news feeds and feeding the results into a Web page template. This brought Souders to the sobering conclusion that he'd spent most of his career at Yahoo worrying about five percent of the performance problem. "It turns out that 80 percent to 90 percent of end user response time is spent on the front end," he says. "So the greatest potential for improvement is on the front end."

With this understanding, he and a small team of engineers set to work figuring out how to improve the speed at which Yahoo delivers pages.

There's nothing new in recognizing the importance of response time to Web development—it's long been known that users forced to wait more than a few seconds are likely to become frustrated and depart for some other, faster site. The conventional advice includes limiting the number and file size of images on a page to make it load more quickly.

But Souders' performance team found some more subtle strategies, which they have codified into 14 rules (see box below). Most of the rules revolve around making sure the browser doesn't have to work too hard to load and display Web pages, particularly by making maximum use of the browser cache. That's where the browser stores bits and pieces of Web pages you've viewed so they can be reused later.

For example, if a visitor has been spending time at yahoo.com, the cache on his or her machine will contain a copy of the Yahoo logo. If the visitor leaves the Web site and comes back, the browser can display the cached copy of the logo very quickly, avoiding the need to retrieve a new copy from the Web server.
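
As a rough illustration of what the cache buys you, the sketch below models it as a simple in-memory map keyed by URL: the first request downloads the bytes, and any later request for the same address is answered locally with no network round trip. This is a toy Node/TypeScript model, not how Yahoo or any browser actually implements caching, and real caches also honor the HTTP freshness headers discussed below.

```typescript
// Toy model of a browser cache: a map from URL to previously downloaded bytes.
// The first call fetches over the network; the second is answered from memory.
const cache = new Map<string, Buffer>();

async function fetchWithCache(url: string): Promise<Buffer> {
  const cached = cache.get(url);
  if (cached) {
    console.log(`cache hit for ${url}: ${cached.length} bytes, no network trip`);
    return cached;
  }
  const response = await fetch(url); // global fetch, available in Node 18+
  const body = Buffer.from(await response.arrayBuffer());
  cache.set(url, body);
  console.log(`downloaded ${url}: ${body.length} bytes`);
  return body;
}

async function main() {
  await fetchWithCache("https://www.yahoo.com/favicon.ico"); // goes to the network
  await fetchWithCache("https://www.yahoo.com/favicon.ico"); // served from the cache
}

main().catch(console.error);
```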

The No. 1 rule on Yahoo's list is to minimize HTTP requests. Every time a browser needs to request another image or another JavaScript file to be included in the page, it must send a message to the server using the Web's Hypertext Transfer Protocol (HTTP) and wait for a response. Every request takes time and runs some risk that the connection will be interrupted and the download will have to start over.
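
One straightforward way to act on that rule is to combine files before they are published, so a page that would otherwise reference three scripts references one. The sketch below is a hypothetical Node/TypeScript build step with made-up file names, not Yahoo's build system:

```typescript
// Hypothetical build step: concatenate several scripts into a single bundle
// so the page makes one HTTP request for JavaScript instead of three.
import { readFileSync, writeFileSync } from "node:fs";

const sources = ["menu.js", "ads.js", "search-box.js"]; // illustrative names

const bundle = sources
  .map((file) => `/* ${file} */\n${readFileSync(file, "utf8")}`)
  .join("\n;\n"); // the extra semicolon guards against files missing their own

writeFileSync("bundle.js", bundle);
console.log(`bundle.js written from ${sources.length} source files`);
```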

So it would be simpler in some ways if every Web page were delivered as one big file rather than dozens of smaller files. On the other hand, if you can identify the files included in a Web page that only change infrequently and get the browser to cache them, you can cut down on the total amount of data transmitted.

For that reason, Yahoo advises that you break out Cascading Style Sheets (CSS) and JavaScript code into separate files rather than embedding them in the HTML of a Web page. This way, if the same CSS or JavaScript is going to be used by multiple pages within a Web site, the browser can download it once and cache it.

However, Yahoo breaks this rule on its own home page and the opening pages of many sections of its sites because pulling in all the needed HTML, CSS and JavaScript code with one HTTP transaction still tends to make a better first impression.

Controlling Cache

Another trick is to configure your Web server to transmit image, CSS and JavaScript files with an expiration header set far into the future. This tells the browser to retain the cached files indefinitely instead of just for a few hours or days. You're effectively telling the browser these files will never change, so it doesn't have to keep checking for fresh versions.
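
In practice that means attaching Expires and Cache-Control headers to static responses. Here is a minimal sketch, assuming a small Node/TypeScript server, an illustrative ten-year lifetime and made-up file paths rather than anything Yahoo actually runs:

```typescript
// Serve a static image with a far-future expiration so the browser caches it
// indefinitely instead of re-checking on every visit. Paths are illustrative.
import { createServer } from "node:http";
import { readFile } from "node:fs/promises";

const TEN_YEARS_SECONDS = 10 * 365 * 24 * 60 * 60;

createServer(async (req, res) => {
  if (req.url === "/static/logo.gif") {
    const body = await readFile("logo.gif");
    res.writeHead(200, {
      "Content-Type": "image/gif",
      "Content-Length": body.length,
      // Both headers tell the browser the cached copy stays fresh for years.
      "Cache-Control": `public, max-age=${TEN_YEARS_SECONDS}`,
      "Expires": new Date(Date.now() + TEN_YEARS_SECONDS * 1000).toUTCString(),
    });
    res.end(body);
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(8080);
```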

Eventually, you'll probably change your logo, CSS font stylings and JavaScript. But because the browser identifies these components by Web address, you can force it to load a new version by simply changing the filename or directory.
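
A common way to automate that renaming is to stamp a short hash of the file's contents into its name, so any edit produces a new URL that sidesteps the old, far-future-cached copy. A sketch under the same Node/TypeScript assumptions, again with hypothetical file names:

```typescript
// Rename a file to include a short content hash, e.g. site.css -> site.3f2a9c1b.css.
// When the contents change, the hash (and therefore the URL) changes, so browsers
// holding an old cached copy will request the new version.
import { createHash } from "node:crypto";
import { copyFileSync, readFileSync } from "node:fs";

function fingerprint(path: string): string {
  const hash = createHash("md5").update(readFileSync(path)).digest("hex").slice(0, 8);
  const dot = path.lastIndexOf(".");
  const versioned = `${path.slice(0, dot)}.${hash}${path.slice(dot)}`;
  copyFileSync(path, versioned); // publish the versioned copy alongside the original
  return versioned;              // reference this name from the HTML
}

console.log(fingerprint("site.css"));   // hypothetical stylesheet
console.log(fingerprint("widgets.js")); // hypothetical script
```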

By applying a few of these simple rules, Souders' team was able to make a big impact on a key section of the Yahoo site—its search results page. "Within a year, we were able to improve response time by 40 to 50 percent," Souders says.

(To grade your Web site against Yahoo's rules, download YSlow, a Yahoo-developed open source add-on to Firebug, which is itself a Firefox browser extension. Together, YSlow and Firebug provide a variety of tools for profiling and debugging Web sites and applications.) —D.F.C.

Yahoo's 14 Rules for Exceptional Performance

1. Make Fewer HTTP Requests
2. Use a Content Delivery Network
3. Add an Expires Header
4. Gzip (Compress) Components
5. Put Stylesheets at the Top
6. Put Scripts at the Bottom
7. Avoid CSS Expressions
8. Make JavaScript and CSS External
9. Reduce DNS Lookups
10. Minify JavaScript
11. Avoid Redirects
12. Remove Duplicate Scripts
13. Configure ETags
14. Make Ajax Cacheable

(Details at developer.yahoo.com/performance/)
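
By way of illustration, rule 4 (compressing components with gzip) comes down to checking whether the browser advertises gzip support and compressing the response body if it does. The sketch below is a minimal Node/TypeScript version of the idea, not Yahoo's implementation:

```typescript
// Rule 4 in miniature: gzip-compress a text response when the browser says,
// via the Accept-Encoding header, that it can handle compressed content.
import { createServer } from "node:http";
import { gzipSync } from "node:zlib";

createServer((req, res) => {
  const html = "<html><body>" + "Hello from a long page. ".repeat(500) + "</body></html>";
  const acceptsGzip = /\bgzip\b/.test(req.headers["accept-encoding"] ?? "");
  const body = acceptsGzip ? gzipSync(html) : Buffer.from(html);
  res.writeHead(200, {
    "Content-Type": "text/html",
    "Content-Length": body.length,
    ...(acceptsGzip ? { "Content-Encoding": "gzip" } : {}),
  });
  res.end(body);
}).listen(8080);
```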



 
 
 
 
 
 
