Many people have a rough idea that latency has something to do with the delay in moving content from the host server to the user, but when pressed, they struggle to explain the real-world impact of latency on application performance. In this post, I’m going to explain what latency is, how it affects page load, and how we can fight back.
In web performance circles, latency is the delay between the browser requesting a page object and the first byte of the response arriving. The amount of latency depends largely on how far away the user is from the server, because every request has to physically travel that distance and back.
Putting latency in real-world terms
When you also consider that a page can have upwards of 300 or 400 objects, and that latency can reach a full second for some mobile users, you can easily see where latency becomes a major problem. If your goal is to have your entire page load in less than 3 seconds (and if that’s not your goal, it should be), then latency can kill you right out of the gate.
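To see how this adds up, here’s a back-of-the-envelope model. This is a sketch, not a measurement: the object count, round-trip time, and six-connection limit are illustrative assumptions, and it ignores bandwidth, server think time, and connection reuse.

```python
import math

def estimated_load_time(num_objects, rtt_ms, connections=6):
    """Rough lower bound on network time: each batch of parallel
    requests costs at least one round trip."""
    round_trips = math.ceil(num_objects / connections)
    return round_trips * rtt_ms

# 300 objects over 6 connections at a 100 ms round trip:
# 50 round trips * 100 ms = 5000 ms of latency alone
print(estimated_load_time(300, 100))  # 5000
```

Even under these generous assumptions, a 300-object page blows past a 3-second budget on latency alone.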
Latency and third-party content
A couple of years ago, Julia Lee, senior director of engineering for Yahoo! Mail, shared findings that 73% of their total latency was due to third-party ads. Not only did ad latency make up almost three-quarters of overall latency, but the amount of latency had increased by 500% over the course of several years. In earlier days, before redirects, the average ad experienced about 464 milliseconds of latency (which is already pretty poor). Over time, that number grew to a staggering 2.7 seconds.
So, how do you fight back?
For obvious reasons, tackling latency is a top priority for the performance industry. There are several ways to do this:
- Allow more requests to happen concurrently.
- Shorten the server round trips by bringing content closer to users.
- Reduce the number of round trips.
- Improve the browser cache, so that it can (1) store files and serve them where relevant on subsequent pages in a visit and (2) store and serve files for repeat visits.
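The first of those ideas, concurrency, can be illustrated with a minimal Python sketch. The latency here is simulated with a sleep rather than real network fetches, and the six workers are chosen to mirror the browser’s six connections per domain:

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY_S = 0.05  # simulated 50 ms round trip per object

def fetch(obj):
    """Stand-in for a network request: one round trip of latency."""
    time.sleep(LATENCY_S)
    return obj

objects = [f"asset-{i}.js" for i in range(12)]

# Serial: pays the round trip once per object.
start = time.monotonic()
serial = [fetch(o) for o in objects]
serial_time = time.monotonic() - start

# Concurrent: six workers fetching in parallel.
start = time.monotonic()
with ThreadPoolExecutor(max_workers=6) as pool:
    concurrent = list(pool.map(fetch, objects))
concurrent_time = time.monotonic() - start

print(f"serial: {serial_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

With twelve objects and six workers, the concurrent version pays roughly two round trips instead of twelve.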
Browser vendors work around latency by using multiple connections, which allows the browser to make simultaneous requests to the host server. Since 2008, most browsers have finally moved from two connections per domain to six. Vendors also focus on improving the browser cache.
Google’s SPDY protocol extends what the browser can do by adding a session layer atop SSL that allows for multiple concurrent streams over a single connection.
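The latency effect of multiplexing can be captured in an idealized timing model. This is only a sketch of the latency math, not SPDY itself (which is a binary framing protocol), and it ignores bandwidth and server processing time:

```python
def one_connection_serial(n_requests, rtt_ms):
    """Classic HTTP/1.x on a single connection:
    one request in flight at a time, one round trip each."""
    return n_requests * rtt_ms

def multiplexed(n_requests, rtt_ms):
    """SPDY-style multiplexing: all streams share the connection
    and their round trips overlap, so latency is paid roughly once."""
    return rtt_ms

print(one_connection_serial(30, 100))  # 3000 ms
print(multiplexed(30, 100))            # 100 ms
```

In this idealized view, thirty requests cost one round trip of latency instead of thirty.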
Content delivery networks (CDNs) cache content in distributed servers across a region or worldwide, thereby bringing content closer to users and reducing round trip time (RTT). Important to note: While CDNs help with desktop performance, it’s more difficult to measure their impact on mobile latency.
Front-end optimization (FEO) — either performed manually by developers or implemented via an automated solution like FastView — alleviates latency in several ways, such as:
- Consolidating page objects into bundles. Fewer bundles mean fewer trips to the server, so the total latency hit is greatly reduced. For example, a page that starts with 63 objects could see those objects consolidated into 9 resource requests.
- Leveraging the browser cache, allowing it to do a better job of storing files and serving them again where relevant, so that the browser doesn’t have to make repeat calls to the server.
- Minifying code. A page’s source code can contain a lot of unnecessary characters (spaces, new line characters, and comments) and these can consume bandwidth and cause additional latency. Minification, which is usually applied only to scripts and stylesheets, eliminates these characters, typically reducing filesize by about 20 percent.
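To make the minification step concrete, here’s a deliberately naive CSS minifier. It’s an illustration of the idea only; production minifiers handle many edge cases (strings containing `/*`, vendor hacks, and so on) that this sketch does not:

```python
import re

def minify_css(css):
    """Naive CSS minifier: strips comments, collapses whitespace,
    and tightens spacing around punctuation. Illustration only."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # strip comments
    css = re.sub(r"\s+", " ", css)                   # collapse whitespace
    css = re.sub(r"\s*([{}:;,])\s*", r"\1", css)     # tighten punctuation
    return css.strip()

source = """
/* primary button */
.button {
    color: #fff;
    margin: 0 auto;
}
"""
print(minify_css(source))  # .button{color:#fff;margin:0 auto;}
```

Even this crude pass shrinks the sample noticeably; across a real stylesheet, removing comments and whitespace is where the roughly 20 percent savings comes from.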
Learn more: Radware FastView