Monday, May 26, 2008

Performance Research, Part 2: Browser Cache Usage - Exposed!

This is the second in a series of articles describing experiments conducted to learn more about optimizing web page performance. You may be wondering why you’re reading a performance article on http://web-performance-research.blogspot.com. It turns out that most of web page performance is determined by front-end engineering, that is, the user interface design and development.

In an earlier post, I described What the 80/20 Rule Tells Us about Reducing HTTP Requests. Since browsers spend 80% of the time fetching external components including scripts, stylesheets and images, reducing the number of HTTP requests has the biggest impact on reducing response time. But shouldn’t everything be saved in the browser’s cache anyway?

Why does cache matter?

It’s important to differentiate between end user experiences for an empty versus a full cache page view. An “empty cache” means the browser bypasses the disk cache and has to request all the components to load the page. A “full cache” means all (or at least most) of the components are found in the disk cache and the corresponding HTTP requests are avoided.

The main reason for an empty cache page view is that the user is visiting the page for the first time, so the browser has to download all the components to load the page. Other reasons include:

  • The user visited the page previously but cleared the browser cache.
  • The browser cache was automatically cleared, based on the browser’s settings.
  • The user reloaded the page in a way that caused the cache to be bypassed. For example, the browser will bypass the cache if you hold down Ctrl+Shift while clicking the Refresh button in Internet Explorer.

Strategies such as combining scripts, stylesheets, or images reduce the number of HTTP requests for both an empty and a full cache page view. Configuring components to have an Expires header with a date in the future reduces the number of HTTP requests for only the full cache page view.
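To make the Expires idea concrete, here is a minimal sketch in Python of how a far-future Expires value might be generated and how a cached copy could be checked against it. The function names `far_future_expires` and `is_fresh` and the ten-year default are my own assumptions for illustration, not something taken from a particular server or browser.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def far_future_expires(now, days=3650):
    """Return an HTTP-date string `days` after `now`.

    `now` must be a timezone-aware UTC datetime. The ten-year default
    is an arbitrary choice for this example.
    """
    return format_datetime(now + timedelta(days=days), usegmt=True)


def is_fresh(expires_header, now):
    """True if a cached component with this Expires value has not yet expired."""
    return parsedate_to_datetime(expires_header) > now


now = datetime(2008, 5, 26, 12, 0, 0, tzinfo=timezone.utc)
header = far_future_expires(now, days=365)
print("Expires:", header)     # Expires: Tue, 26 May 2009 12:00:00 GMT
print(is_fresh(header, now))  # True
```

While `is_fresh` returns True, the component can be served from the cache with no HTTP request, which is why the header only helps the full cache page view.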

Previously, we observed where the time is spent when a user requests www.example.com with an empty cache. When a user loads the page, the browser downloads approximately 30 components (see Figure 1). Figure 2 is a graphical view of where the time is spent loading http://www.example.com with a full cache. Each bar represents a specific component requested by the browser. Since components are already in the cache on a full cache page view, and the Expires header has a date in the future, the browser only has to download three components, including the HTML document.
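The difference between the two page views can be sketched as a simple lookup. This is a toy model under stated assumptions: the component URLs are hypothetical, the cache is just a mapping from URL to expiry time, and it ignores details such as conditional requests that real browsers may make.

```python
def requests_needed(components, cache, now):
    """Return the URLs that must be fetched over HTTP.

    `cache` maps URL -> expiry time (seconds since the epoch). A component
    is reused only if it is cached and its expiry is still in the future.
    """
    return [url for url in components
            if url not in cache or cache[url] <= now]


# Hypothetical page: the HTML document plus 29 external components.
components = ["/index.html"] + [f"/static/c{i}.png" for i in range(29)]
now = 1_211_760_000  # arbitrary "current time" for the example

empty_cache = {}
# Full cache: everything except the first three components has a far-future expiry.
full_cache = {url: now + 86_400 * 365 for url in components[3:]}

print(len(requests_needed(components, empty_cache, now)))  # 30
print(len(requests_needed(components, full_cache, now)))   # 3
```

The counts mirror the roughly 30 requests and three requests described above; which three components are left uncached here is arbitrary.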

Figure 1. Loading http://www.example.com with an empty cache

Figure 2. Loading http://www.example.com with a full cache

Table 1 shows a summary of the total size and number of requests for each type of component to load http://www.example.com. How much does a full cache benefit the user? Loading the page over my cable modem at home, it took 2.4 seconds with an empty cache and only 0.9 seconds with a full cache. The full cache page view had 90% fewer HTTP requests and 83% fewer bytes to download than the empty cache page view.
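As a quick check on these numbers, the percentage reductions can be recomputed from the figures quoted in the text. The helper `percent_reduction` is a name I made up for this sketch, and the request counts (30 and 3) are the approximate values mentioned earlier. The 83% byte figure is not recomputed because the byte totals appear only in Table 1.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is smaller than `before`, rounded to one decimal."""
    return round((before - after) / before * 100, 1)


print(percent_reduction(30, 3))     # 90.0  (HTTP requests: about 30 down to 3)
print(percent_reduction(2.4, 0.9))  # 62.5  (load time: 2.4 s down to 0.9 s)
```

So on this connection the full cache page view loaded in a little over a third of the empty cache time.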

Table 1. Empty and Full Cache Summary to load http://www.example.com

* Times were measured over a cable modem (~2.5 Mbps).
