With site loading times so critical for converting visitors into purchasers, and used by Google as a signal in its search rankings, understanding exactly how your site performs for users, and why, can be very important not just to the popularity of your site but also to the revenue of your business.
If your business involves online retail or eCommerce, you are probably used to investing in areas such as A/B testing to determine which changes improve your conversion rates, and you may be aware of the effect of performance on users in general. From experience, though, the number of organisations that keep a tight focus on performance when building their site is pretty small.
There are many tools that are available to test the performance of your site for users:
- Pingdom full page test
- WebPageTest
- Your Browser’s Developer panel – such as Google Chrome’s Developer tools (as in the post header image)
All of the above tools are very valuable for determining how your site is currently performing but, as with any metric, knowing the number is of very little use if you don’t spend the time to understand what it means and how to improve on it. Below we shall walk through each of the common areas of the HAR and waterfall graph data, explaining what each point means, why it commonly occurs and how to improve upon it.
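As a starting point, the timing data behind a waterfall graph can be summed per phase to see where the time goes overall. The sketch below assumes a HAR 1.2 style structure (a `log.entries` list whose items carry a `timings` object in milliseconds, with -1 meaning "not applicable"); the `phase_totals` helper and the trimmed two-request sample are made up for illustration and are not part of any tool.

```python
# HAR 1.2 timing phases, in the order they occur for a request.
# "ssl" is left out on purpose: where present it is already counted in "connect".
PHASES = ("blocked", "dns", "connect", "send", "wait", "receive")


def phase_totals(har):
    """Sum the time (ms) spent in each phase across all entries of a HAR log."""
    totals = {phase: 0.0 for phase in PHASES}
    for entry in har["log"]["entries"]:
        timings = entry["timings"]
        for phase in PHASES:
            value = timings.get(phase, -1)
            # -1 (or a missing field) means the phase did not apply.
            if value is not None and value > 0:
                totals[phase] += value
    return totals


# Hypothetical, heavily trimmed sample; a real file comes from a browser export.
SAMPLE_HAR = {
    "log": {
        "entries": [
            {"timings": {"blocked": 12, "dns": 40, "connect": 90, "ssl": 60,
                         "send": 1, "wait": 200, "receive": 30}},
            # Second request reuses the connection, so no DNS or connect time.
            {"timings": {"blocked": 150, "dns": -1, "connect": -1, "ssl": -1,
                         "send": 1, "wait": 80, "receive": 20}},
        ]
    }
}

if __name__ == "__main__":
    ranked = sorted(phase_totals(SAMPLE_HAR).items(), key=lambda kv: kv[1], reverse=True)
    for phase, ms in ranked:
        print(f"{phase:8s} {ms:8.1f} ms")
```

Ranking the phases this way gives a rough idea of which of the sections below deserves attention first.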
1) Stalled/Blocking
What: This state indicates the time the browser has spent waiting to send a request. It may also include time spent on Proxy negotiation on the path to the server (see towards the end of this article for more specific information).
Why: The most common reason for this state is that the browser has exhausted the number of concurrent connections it is permitted to establish. In most browsers, this is set to 6 connections per hostname, however this does vary. With modern web pages requiring around 100 resources it is easy to see how these limits are reached quickly.
How to resolve: One of the most commonly used workarounds for this is a concept called Domain Sharding; however, with newer protocols such as SPDY and its replacement HTTP/2, this technique is actually likely to degrade performance. With connection multiplexing available in HTTP/2, the recommendation is to avoid domain sharding. A simple recommendation to improve the performance of your site would be to reduce the number of resources that need to be loaded. This can be done either by removing content from the user view or by working on the application side to consolidate CSS and JS files into one resource (for HTTP/1.1, as the benefits of this are negated in HTTP/2), consolidating images into CSS Image Sprites or replacing them with CSS3 designs.
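To get a feel for how quickly the per-hostname limit turns into stalled time, the queueing can be modelled in a few lines. This is a simplified sketch, not how any particular browser schedules requests: it assumes every request to one hostname is queued at time zero, is served in order, and holds a connection for its whole duration. The `stalled_times` name is invented for the example.

```python
import heapq


def stalled_times(durations_ms, max_connections=6):
    """Return how long each request waits for a free connection to one hostname."""
    # Min-heap of the times at which each connection next becomes free.
    free_at = [0.0] * max_connections
    heapq.heapify(free_at)
    waits = []
    for duration in durations_ms:
        start = heapq.heappop(free_at)   # earliest available connection
        waits.append(start)              # queued at t=0, so the wait equals the start time
        heapq.heappush(free_at, start + duration)
    return waits


if __name__ == "__main__":
    waits = stalled_times([50] * 100)    # 100 resources of 50 ms each
    print("longest stall:", max(waits), "ms")
```

Under these assumptions, 100 resources of 50 ms each over six connections leave the last request stalled for 800 ms before it is even sent, which is why reducing the number of resources helps so directly.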
2) DNS
What: This is the time the browser and/or OS spent on the DNS lookup for a hostname.
Why: Each new connection requires a DNS resolution process to be completed. If the user has not recently made a request to the hostname, then this lookup will need to be completed externally, by querying the user’s DNS resolver. That resolver may in turn need to query further, all the way to the hostname’s authoritative DNS nameserver. This process can take hundreds of milliseconds to complete.
How to resolve: If the user, or any of the resolvers on the DNS lookup route, has already recently completed resolution (within the hostname’s TTL period), then the lookup is answered from cache and completes much faster. This gives two immediate ways to improve the situation:
- reduce the number of hostnames required (i.e. do not use domain sharding as discussed previously) and/or
- increase the TTL value of your hostnames, thus reducing the frequency at which resolution has to occur.
Additional improvements could be made by changing your DNS name servers to a better performing provider. Comparisons show that there can be a large range of over 100ms between the top 25 providers.
One more recent suggestion to reduce the impact of DNS lookups is to ask browsers to complete a DNS lookup whilst the user is still browsing the page, before they click any link to the external hostname. This is done via browser prefetch hints. Although this may result in additional bandwidth and connection usage, these requests are made whilst the user is reading the page or viewing media and are therefore out of the critical path for page loads, thus reducing the page load time of any future requests where this is enabled.
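A prefetch hint is a `<link rel="dns-prefetch">` tag in the page head. As a rough sketch, the tags could be generated from the list of URLs a page references; the `dns_prefetch_hints` helper and the example hostnames below are hypothetical, and the output would normally go through your own templating.

```python
from html import escape
from urllib.parse import urlsplit


def dns_prefetch_hints(resource_urls, page_host):
    """Build <link rel="dns-prefetch"> tags for each external hostname used."""
    hosts = []
    for url in resource_urls:
        host = urlsplit(url).hostname
        # Skip relative URLs, the page's own host, and duplicates.
        if host and host != page_host and host not in hosts:
            hosts.append(host)
    return [f'<link rel="dns-prefetch" href="//{escape(host)}">' for host in hosts]


if __name__ == "__main__":
    urls = [
        "https://cdn.example.net/a.css",      # example hostnames only
        "/local.js",
        "https://fonts.example.org/f.woff2",
    ]
    print("\n".join(dns_prefetch_hints(urls, "www.example.com")))
```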
3) Connection
What: This is the amount of time that the browser spends establishing a connection to the host server, including the TCP three-way handshake required. This may also include the time spent on establishing an SSL/TLS connection if applicable.
Why: Common reasons for a long time spent on an individual connection would be high latency to the server or a busy server and/or network. If your individual connections are not overly slow, but you have a lot of them required for your page load, then this may be caused by having domain sharding or loading multiple resources from external sources.
How to resolve: The most efficient way to reduce the amount of time spent on connections is simply to avoid the need to make the connection to begin with. This could be by removing the requirement to load a resource, for example by consolidating CSS files into one main file (again, exercise caution, as HTTP/2 connections negate the benefits), by having content cached in the user’s browser, or via other mitigations such as enabling HTTP Persistent Connections (often called Keep-Alives).
By using HTTP Persistent connections, multiple resource requests can be made over the same connection one after another and, where HTTP Pipelining is supported, without waiting for each response before sending the next request.
This avoids the need for multiple handshakes to be completed and is especially beneficial when using HTTPS connections due to the additional overheads of using TLS connections.
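The saving is easy to estimate with some round-trip arithmetic. The sketch below assumes one round trip for the TCP handshake plus two for a full TLS 1.2 handshake (one for TLS 1.3, none for plain HTTP), and ignores session resumption and server processing time; `setup_cost_ms` is an illustrative helper, not a measurement tool.

```python
def setup_cost_ms(rtt_ms, requests, tls_round_trips=2, persistent=True):
    """Estimate time spent on connection setup for sequential requests to one host."""
    # 1 round trip for TCP, plus the assumed number of TLS round trips.
    per_connection = rtt_ms * (1 + tls_round_trips)
    # A persistent connection pays the setup cost once; otherwise once per request.
    connections = 1 if persistent else requests
    return per_connection * connections


if __name__ == "__main__":
    print(setup_cost_ms(50, 10, persistent=False))  # new connection per request
    print(setup_cost_ms(50, 10, persistent=True))   # one reused connection
```

With a 50 ms round trip and ten HTTPS requests, that is 1500 ms of handshakes without persistent connections against 150 ms with them, under the assumptions above.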
HTTP Persistent connections have also been developed further in newer protocols such as SPDY and HTTP/2, the IETF standard (RFC 7540) that was based on SPDY. These protocols support request multiplexing, so many requests and responses can be in flight at once over a single connection. HTTP/2 also defines a concept termed “Server Push”, which allows the server to return additional resources alongside the response to one request if it knows that the browser will need more than the one resource it has requested to form the page.
In newer browsers, you can also use pre-connect hints to instruct a browser to open a connection to a host before it knows exactly which resources it will request from it. This reduces the bottleneck of establishing the connection; however, the hints should be used wisely to avoid creating too many connections.
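A pre-connect hint is a `<link rel="preconnect">` tag pointing at an origin. One way to keep the number of hints small is to emit them only for the most heavily used third-party origins; the sketch below does that from a list of resource URLs. The `preconnect_hints` helper, its default limit of three and the example origins are assumptions made for illustration.

```python
from collections import Counter
from html import escape
from urllib.parse import urlsplit


def preconnect_hints(resource_urls, page_origin, limit=3):
    """Emit preconnect tags for the most-used third-party origins only."""
    counts = Counter()
    for url in resource_urls:
        parts = urlsplit(url)
        if parts.scheme in ("http", "https") and parts.netloc:
            origin = f"{parts.scheme}://{parts.netloc}"
            if origin != page_origin:
                counts[origin] += 1
    # most_common() sorts by request count, highest first.
    return [
        f'<link rel="preconnect" href="{escape(origin)}">'
        for origin, _ in counts.most_common(limit)
    ]


if __name__ == "__main__":
    urls = ["https://a.example/1", "https://a.example/2", "https://b.example/1"]
    print("\n".join(preconnect_hints(urls, "https://www.shop.example", limit=1)))
```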
All of the above suggestions provide mitigations to avoid the inherent impact of latency for most internet users, however by reducing the root cause of the issue (latency) you can also improve your user experience. This can be done by improving your user connection times to your servers and is most commonly achieved by adding a Content Distribution Network to your website.
3.1) SSL
What: This is the time spent by the browser to complete the SSL handshake required for the establishment of a secure connection. This time is often included within the connection time, so it is not additional, but with the push from industry experts and search engine ranking signals meaning more and more sites are adopting HTTPS, poorly optimised SSL connections are becoming more prevalent as a cause of significant impact to your site’s performance.
Why: “TLS has exactly one performance problem: it is not used widely enough” is the bold claim made by Ilya Grigorik, Web performance engineer at Google and co-chair of the W3C Web Performance Working Group. In short, TLS itself does add an overhead with the additional handshakes required; however, if the configuration is properly optimised and used in conjunction with new protocols such as SPDY and HTTP/2, it can actually lead to an overall performance increase.
How to resolve:
Make sure your TLS setup is properly configured! A good place to start is to review the TLS High Performance Checklist, which outlines areas such as Session Caching, early termination of connections and Certificate chain optimisations that can dramatically improve your site performance. It is also highly recommended to check your configuration against tools such as Qualys’ SSL Labs.
In addition to the above tips, strongly consider the use of a CDN to reduce user latency, achieve easy optimisation of your ciphers and also very easily add in support for Next Generation protocols such as HTTP/2, all of which will be beneficial to your site’s performance.
4) Send
What: This is the time spent for the user to send the request and is generally a very short period of time.
Why: In order for the browser to request content, it needs to make a network request to the server.
How to resolve: It’s pretty hard to optimise this stage and anyway, for most sites the impact from this stage is going to be pretty minimal. The only real easy win would be to avoid the need to make the request in the first place.
5) Wait/TTFB
What: This is the “Time To First Byte” (TTFB) and encompasses the time it takes for the request to reach the server, the server to produce the response and the first byte of that response to make it back to the user. It is very frequently used as a performance metric (even though there is widespread thought that TTFB should be dropped as a key performance metric and replaced by others, such as using page speed as your key performance indicator). However, this stage does allow a large amount of room for performance improvements.
Why: The time taken for this stage can be affected by many factors, with the prime considerations being latency, server response times and application response times.
How to resolve:
With network latency becoming the major bottleneck in performance on modern networks, and forming the major part of this stage, reducing it will be greatly beneficial. The easiest way here is to introduce network caching via a CDN or, if your site is only accessed by users in one small geography, to optimise your web hosting for that area.
The other major component in this stage is the performance of your application. Unlike the majority of the other stages covered in this article, the raw performance of your backend application in handling the incoming request and deciding what to return is critical here. For applications that rely on database backends (most commonly seen with CMS systems such as WordPress or eCommerce applications such as Magento), server-side caching can greatly improve performance: cache database queries and other resource-intensive activities on the server side wherever the results do not need to be personalised and do not contain private information. This can be used in addition to other caching, such as on the network or the user’s device.
Additional optimisation of the application so that it uses as few resources as possible will also allow your server to handle more requests before it becomes resource constrained and starts to impact performance or cause downtime due to critical failures of the application stack.
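The server-side caching idea can be sketched as a small time-limited cache wrapped around an expensive query. This is a single-process, in-memory illustration only (not thread-safe, and not how WordPress or Magento implement their caches); `ttl_cache` and `product_listing` are hypothetical names, and the "query" is a stand-in rather than a real database call.

```python
import time
from functools import wraps


def ttl_cache(seconds, clock=time.monotonic):
    """Cache a function's results per argument tuple for `seconds`."""
    def decorator(func):
        store = {}  # args -> (expiry_time, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]            # still fresh: skip the expensive call
            value = func(*args)
            store[args] = (now + seconds, value)
            return value

        return wrapper
    return decorator


CALLS = []  # records how often the "database" is actually hit


@ttl_cache(seconds=60)
def product_listing(category):
    """Stand-in for an expensive, non-personalised database query."""
    CALLS.append(category)
    return [f"{category}-item-{n}" for n in range(3)]


if __name__ == "__main__":
    product_listing("shoes")
    product_listing("shoes")             # served from the cache
    print("database hits:", len(CALLS))
```

Only results that are identical for every visitor belong in a cache like this; anything personalised or private should bypass it, as noted above.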
6) Receive
What: This stage covers the time it takes for the receipt of all remaining data required for the page load from the time of the first byte.
Why: The time in this stage can be considerable depending on the quantity and type of resources to be downloaded. With average page sizes now over 2MB and rich media such as HD video becoming more and more common on sites, this can be a prime area to address.
How to resolve: Reducing or optimising the amount of content needed for initial page loads can greatly reduce the amount of time spent loading a page. Key areas to focus on would be the optimisation of media, such as serving reduced-quality thumbnails, lazy loading images that are below the fold or hidden, and buffering only a short period of video (or even better, none at all) until the user has interacted with the content. This is especially beneficial for users on mobile networks with low bandwidth.
Additionally, caching content on the client side, such as CSS and JavaScript that is required on multiple pages, can substantially improve performance on subsequent page loads.
Lastly, optimising the network connection between the user and the content to be served (most commonly via a CDN) can further reduce the time spent at this stage.
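Client-side caching is controlled largely by the `Cache-Control` response header. As a hedged sketch of one common policy, the function below picks a header value from the asset path; it assumes a hypothetical build convention where versioned files carry a hex content hash in their name (such as `app.3f9a1c2b.js`), and the lifetimes chosen are illustrative rather than recommendations from this article.

```python
import re

# Assumed convention: a hex content hash of 8+ characters before the extension,
# so changed content always gets a new URL.
FINGERPRINT = re.compile(r"\.[0-9a-f]{8,}\.(?:css|js|png|jpg|svg|woff2)$")


def cache_control_for(path):
    """Choose a Cache-Control header value for a static asset path."""
    if FINGERPRINT.search(path):
        # Safe to cache for a year: new content means a new filename.
        return "public, max-age=31536000, immutable"
    if path.endswith((".css", ".js")):
        # Unversioned assets: cache for an hour so updates still reach users.
        return "public, max-age=3600"
    # HTML and everything else: store, but revalidate with the server each time.
    return "no-cache"


if __name__ == "__main__":
    for path in ("/static/app.3f9a1c2b.js", "/static/site.css", "/index.html"):
        print(path, "->", cache_control_for(path))
```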
7) Other
7.1) Proxy Negotiation
What: This stage covers the time that a user’s browser spends negotiating a connection with a network proxy server.
Why: These are commonly found in corporate environments, to allow organisations to enforce security standards on workplace connected devices.
How to resolve: Other than the suggestions above, there isn’t really anything you can do to speed up this stage as the device and proxy server will be fully outside your control. In most cases this stage would not cause significant impact to your site performance.