Content Delivery Networks (CDNs) are often considered to be a solution to multiple problems, but it is very important to consider the problem(s) that you are looking to solve for when determining whether, when and where to use one.
CDN companies run servers in multiple geographic locations with plentiful, high-quality network connections and compute resources. Depending on the company and its business model, this can run into the hundreds of thousands of servers, in hundreds of locations, with multiple terabits per second of connectivity (1 Tbps is 1,000 Gbps). CDNs typically aim to provide value to their customers in several ways which, depending on their feature focus, may include:
- Connection optimisation – Latency/distance to servers
- Bandwidth speeds for content to users
- Improved uptime of a website
- Security enhancements
- Infrastructure reduction
CDN implementations are typically achieved in one of two ways as detailed below.
By using a DNS CNAME record, users attempting to browse a website are directed to the CDN’s servers by DNS lookups. An example of this would be browsing to “www.redhat.com”, which requires the user’s device to complete a DNS lookup for “www.redhat.com”. Red Hat’s DNS name servers respond with a CNAME of “wildcard.redhat.com.edgekey.net”. The user’s device must then make an additional DNS lookup for that name and, as this hostname is controlled by the CDN (in this case Akamai), the request goes to Akamai’s DNS servers to identify the server(s) to connect to. At this point, the CDN returns the IP address that should provide the best service for the end user.
By using this method, websites can maintain control of their DNS records on their own name servers at the high level, which can allow the use of multiple CDNs for different purposes. On the downside, using CNAMEs necessitates an additional DNS lookup each time, which can degrade the optimisations gained from using a CDN. This impact can be offset through the use of long TTL (Time to Live) values on the CNAME records that point to the CDN service.
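The extra lookup in the CNAME approach can be seen in a minimal sketch. The record table below is an in-memory stand-in for real DNS: the hostnames come from the example above, while the IP address is made up (taken from a range reserved for documentation), and real resolvers add caching and recursion that are left out here.

```python
from typing import Dict, List, Tuple

# Hypothetical zone data for illustration only. The hostnames are from the
# example in the text; the IP address is invented (203.0.113.0/24 is a
# range reserved for documentation).
RECORDS: Dict[str, Tuple[str, str]] = {
    "www.redhat.com": ("CNAME", "wildcard.redhat.com.edgekey.net"),
    "wildcard.redhat.com.edgekey.net": ("A", "203.0.113.10"),
}


def resolve(
    name: str, records: Dict[str, Tuple[str, str]], max_hops: int = 8
) -> Tuple[str, List[str]]:
    """Follow CNAME records until an A record is found.

    Returns the final IP address and the list of names that were looked up,
    so the additional lookups caused by CNAMEs are visible.
    """
    lookups: List[str] = []
    current = name
    for _ in range(max_hops):
        lookups.append(current)
        record_type, value = records[current]
        if record_type == "A":
            return value, lookups
        current = value  # CNAME: the target name must be looked up next
    raise RuntimeError("CNAME chain too long")


ip, lookups = resolve("www.redhat.com", RECORDS)
print(ip, "found after", len(lookups), "lookups")
```

Each entry in `lookups` stands for one DNS query that is not already cached, which is why a long TTL on the first CNAME record helps.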
An alternative implementation option is to use a CDN’s nameservers for all DNS responses for the website domain, by setting the domain’s nameserver (NS) records to those provided by the CDN. An example of this would be browsing to “thehackernews.com”, which requires the user’s device to complete a DNS lookup for “thehackernews.com”. As the nameservers for the domain have been set to “pat.ns.cloudflare.com” and “todd.ns.cloudflare.com”, DNS queries are sent directly to the CDN provider, which determines the IP address to return in order to give the end user the best service.
A network routing method widely used by CDN providers is Anycast. This allows providers to announce the same IP ranges from multiple locations around the globe. By doing so, the routing of end users to the closest data center is typically controlled by BGP routing, which removes some of the overhead of CDN providers identifying the correct location to serve users from.
An additional benefit of using Anycast routing is the ability to mitigate DDoS attacks. On the modern wild west internet, networks and datacenters are often targeted by DDoS attacks which can cause service degradation. By using Anycast routing, CDN providers can absorb attack traffic across multiple data centers instead of just a single location, thus increasing the amount of DDoS traffic that can be mitigated before it causes issues for their customers. Additionally, if a DDoS attack or other incident causes a particular location to have service issues, the provider can easily withdraw the route announcement for that location, which means traffic will automatically re-route to the next closest location without any need to update users.
Note that not all CDN networks make use of this technology, and others may use it whilst retaining other decision metrics as detailed below.
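The re-routing behaviour of Anycast can be illustrated with a small model. This is only a sketch: it assumes the network simply prefers the announcement with the fewest hops, whereas real BGP path selection weighs several other attributes, and the location names and hop counts are invented.

```python
from typing import Dict, Optional

# Hypothetical view from one user's network: the number of network hops to
# each location announcing the same anycast IP range. All values are made up.
announcements: Dict[str, int] = {"london": 2, "frankfurt": 3, "new-york": 5}


def pick_location(routes: Dict[str, int]) -> Optional[str]:
    """Return the location with the shortest path, or None if nothing is announced."""
    if not routes:
        return None
    return min(routes, key=lambda location: routes[location])


def withdraw(routes: Dict[str, int], location: str) -> Dict[str, int]:
    """Return a copy of the routes with one location's announcement withdrawn."""
    return {loc: hops for loc, hops in routes.items() if loc != location}


print(pick_location(announcements))
# After the closest location is withdrawn, traffic moves to the next closest
# one without any change on the user's side.
print(pick_location(withdraw(announcements, "london")))
```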
Server choice via either method
Using either of the above methods, the CDN provider decides which server(s) within its network the end user should connect to. Typically several factors feed into this decision:
- Network proximity – although geographic proximity to a CDN provider’s datacenter is of importance at a high level, at a lower level the network proximity to the provider is often a more critical factor in the decision. This is because the length of time it takes to connect to two different data centers of equal distance from a user may be significantly different depending on the network route taken. An example of this may be a user based in central USA who is equidistant from CDN provider locations on the East and West coast. Although the distance to each is the same, and therefore at a high level connections to either would appear equal, the user’s network may by design route all traffic to the ISP’s connectivity centre on the East coast. Given this, it would be optimal to direct the user’s requests to the East coast CDN location. Decisions on these metrics are typically made using a combination of Geo-IP identification and network proximity analysis on a continuous basis, however some CDN providers have started adding support for edns-client-subnet as an extra indication (which provides the user’s IP subnet to the CDN provider, allowing optimisation based on the end user rather than their DNS server).
- CDN provider resource usage – in order to ensure the best service provision for end users, providers may choose to direct users to lower utilised locations within similar network proximity.
- Cost reasons – some customers may choose to only use a specific subset of the provider’s edge locations, and thus providers may route end users for specific customers to a location dependent on the price of each location.
When making these decisions, CDNs will typically use low TTL values on their records in order to achieve maximum performance and agility, with values often ranging from 5 minutes down to as low as 20 seconds.
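The three factors above can be combined in a simple selection routine. This is a hedged sketch rather than any provider’s actual algorithm: the `EdgeLocation` fields, the tolerance and utilisation thresholds, and the location names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class EdgeLocation:
    name: str
    rtt_ms: float       # measured network round-trip time to the user's network
    utilisation: float  # 0.0 (idle) to 1.0 (full)


def choose_edge(
    locations: Iterable[EdgeLocation],
    allowed: Set[str],
    rtt_tolerance_ms: float = 10.0,
    max_utilisation: float = 0.9,
) -> Optional[EdgeLocation]:
    """Pick an edge location for a user.

    1. Cost: only locations included in the customer's plan are considered.
    2. Network proximity: keep locations within a tolerance of the best RTT.
    3. Resource usage: among those, prefer the least utilised location.
    """
    candidates = [
        loc for loc in locations
        if loc.name in allowed and loc.utilisation < max_utilisation
    ]
    if not candidates:
        return None
    best_rtt = min(loc.rtt_ms for loc in candidates)
    nearby = [loc for loc in candidates if loc.rtt_ms <= best_rtt + rtt_tolerance_ms]
    return min(nearby, key=lambda loc: loc.utilisation)


# Hypothetical measurements for one user's network.
edges = [
    EdgeLocation("us-east", rtt_ms=25.0, utilisation=0.85),
    EdgeLocation("us-west", rtt_ms=30.0, utilisation=0.40),
    EdgeLocation("us-central", rtt_ms=12.0, utilisation=0.20),
]
# "us-central" is excluded because it is not part of this customer's plan.
chosen = choose_edge(edges, allowed={"us-east", "us-west"})
print(chosen.name if chosen else "no location available")
```

Because the result can change as measurements and utilisation change, the answer is only useful for a short time, which is one reason for the low TTL values mentioned above.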
Optimisation vs. Offloading
When implementing a CDN it is important to understand why it is being implemented, in order to understand what criteria need to be met for success. One of the important questions is whether the desire is to optimise the user experience or to offload content from the current website infrastructure.
When optimising the user experience, a CDN provider whose network is well provisioned for your users must be selected. Depending on the website’s purpose, it is important to understand whether the priority is latency or bandwidth optimisation.
Offloading, conversely, is often a reactive design decision to remove the burden of serving content to users from a provider’s web environment. This is typically seen during spikes in traffic due to publicity or malicious behaviour (such as DDoS), however it can also be used as a pro-active design feature and can consider optimisation at the same time.
If a website typically completes small transactions and/or response times are critical (such as in the gaming, VoIP and video conferencing areas), then optimising connection times from users to the website is of more importance when selecting and implementing a CDN.
For some websites which only have customers in one geographic region, it may be possible to find a CDN provider that is very well provisioned in the matching locations (this is particularly relevant in regions such as APAC, and especially China). Alternatively, if there is a truly global audience for the site, then a global CDN with more locations would be the preference in order to optimise this connection.
Additionally if latency is a concern for your application it is important to understand the areas where using a CDN can also add to the latency of the website:
- Additional DNS lookups – when using CNAME implementations, there is an additional DNS round trip that must be made to confirm the IP address for the provider DNS record. A simple way to reduce the impact of this would be to make use of Name Server implementations where possible, although some providers do not allow this. Additionally high TTL values should be used on the initial CNAME from the website domain to the provider record.
- Additional latency from reverse proxying – without a CDN, users connect directly to the web site server over the internet. When using a CDN however, due to the reverse proxying that is required, there are then two connection rounds that need to be completed. This becomes more burdensome if the website is making use of HTTPS connections or if the CDN cached content is regularly expiring with infrequent visits as each and every request to the CDN will require all content to be reloaded from the web site servers again. The impact of regularly expiring content can be partially mitigated by configuring the expiry time of content in the CDN appropriately, however be aware that CDN networks will often clear their caches as they need to, which means content may still be removed from cache earlier if it is not being used frequently.
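The trade-off described in the second point can be estimated with a rough model. This sketch assumes one round trip per request on each leg and ignores DNS, handshakes and server processing time, so it only illustrates how the cache hit ratio shifts the average; the round-trip times used are invented.

```python
def expected_latency_ms(edge_rtt_ms: float, origin_rtt_ms: float, hit_ratio: float) -> float:
    """Average request latency through a CDN under a simplified model.

    Every request pays the user-to-edge round trip. Requests that miss the
    cache additionally pay the edge-to-origin round trip.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return edge_rtt_ms + (1.0 - hit_ratio) * origin_rtt_ms


# Hypothetical numbers: 20 ms from user to edge, 80 ms from edge to origin,
# and 90 ms when the user connects to the origin directly.
direct_ms = 90.0
for ratio in (0.0, 0.5, 0.9):
    via_cdn = expected_latency_ms(20.0, 80.0, ratio)
    verdict = "faster" if via_cdn < direct_ms else "slower"
    print(f"hit ratio {ratio:.0%}: CDN is {verdict} than direct")
```

With content that is rarely requested and expires quickly, the hit ratio tends towards zero and the CDN path becomes the slower one in this model.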
For some use cases, ensuring plentiful high quality bandwidth is available to serve your requests is of the highest importance for your visitors. Examples may be for serving static assets for high traffic sites (such as Flash, Video or High Resolution imagery), for advertising media or for Video streaming.
Although some providers may claim to have large amounts of available bandwidth it is important to ensure that it is of high enough quality. Network providers typically make use of two different kinds of bandwidth: Transit and Peering.
- Transit bandwidth is paid-for network connectivity (typically charged by the Megabit per second), which is bought by website operators, web hosts and/or CDNs to allow network access. The company that transit is bought from can dramatically impact an end user’s experience. The main reasons for the variety in performance can be bandwidth contention causing high utilisation, long distance routing causing longer packet round trips and/or a transit provider whose geographic location is not congruent with your users. Typically transit bandwidth allows access to all users globally, and performance may vary between providers and user locations.
- Peering bandwidth, which is also commonly referred to as settlement-free bandwidth, allows two networks to connect to each other without any payments on either side (aside from the costs of the connection and required equipment). As there is no per GB or Mbps charge for connections, these are often utilised to reduce the cost of serving traffic to users. Peering connections typically only allow access to a small subset of global users and typically allow higher quality, predictable connectivity to a known network (however there are increasing examples of when this is not the case).
A suitable mix of both of the above would be typical for most CDN providers, as it allows the company to offload high volumes of traffic via their multiple cheaper peering connections, whilst still retaining paid-for connectivity for other users. It is important, however, to complete testing and due diligence on the network to ensure that performance is as promised.
HD video and the impact of latency
The effects of latency on HD video can be considerable depending on the network connectivity between users and content servers, and can be a problem in addition to the availability of bandwidth. As the basic underlying protocols of the internet (at least TCP) require that data sent must be acknowledged by the user before the server can send more, the further away a client is from the server, the longer this process takes. As this time increases, the amount of data that can be sent between the server and the client becomes limited not by the available bandwidth, but by the latency of the connection for acknowledgements. In addition to this, there is typically an increase in packet loss as connections traverse a longer distance (due to more hops and additional network providers en route), which can also mean more data has to be resent to users. The outcome of this is that the quality of video streams may become limited by the distance from users to content, rather than by any limit on the application server or the user’s bandwidth.
Considering the above, making use of a CDN for media distribution may not only significantly reduce the networking infrastructure required within your environment; the user experience may also be improved (or brought to an acceptable level, depending on the scenario).
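The latency limit can be put into numbers with the usual window-over-round-trip-time bound: a sender can have at most one window of unacknowledged data in flight per round trip. The sketch below assumes a fixed 64 KiB window and no packet loss, both simplifications, since real TCP stacks scale the window and back off when packets are lost.

```python
def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on throughput when one window is sent per round trip."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    bits_per_round_trip = window_bytes * 8
    round_trips_per_second = 1000.0 / rtt_ms
    return bits_per_round_trip * round_trips_per_second / 1_000_000


WINDOW_BYTES = 64 * 1024  # assumed fixed window, for illustration only

# A nearby edge server versus a distant origin (hypothetical round-trip times).
for rtt in (10.0, 100.0):
    print(f"{rtt:.0f} ms RTT: at most {max_throughput_mbps(WINDOW_BYTES, rtt):.1f} Mbit/s")
```

Under these assumptions the same connection is capped at roughly 52 Mbit/s with a 10 ms round trip but only around 5 Mbit/s at 100 ms, regardless of how much bandwidth is available at either end.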
One of the benefits of using CDNs is that (relatively) resource intensive and latency sensitive SSL offloading can typically be completed on the edge servers, offloading the resource requirements from your application servers and also improving the user experience by handling the SSL handshake process over lower latency connections. Additionally, CDN servers are typically optimised for the fastest SSL connections for users and are also regularly updated to match latest recommendations for security, such as removing out of date RC4 ciphers and reacting to new vulnerabilities.
One factor to consider during this process is that decrypting at the edge could leave your content unsecured between the CDN edge and your application servers, putting your users’ information at risk and raising potential compliance issues. As most CDNs and sites will not have the luxury of secured network connections from all edge locations to their application servers, secure connections should be enabled from the edge to the application servers. In order to reduce the impact of SSL handshakes, long-lived connections with connection re-use should be used to allow multiple edge requests to share the same connection back to the application servers. Additionally, some providers can further protect origin application servers by providing authentication and verification of requests through to the origin. This functionality typically makes use of client certificate based or username/password authentication from edge servers to origin servers, and allows applications to verify that requests have come from their CDN partner’s servers and therefore have not bypassed any edge security functionality.
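As an illustration of the username/password variant, the origin could check an HTTP Basic `Authorization` header that the edge servers attach to every request. This is a minimal sketch under that assumption: the function name and credentials are hypothetical, how the header is configured depends on the provider, and it is only meaningful over an encrypted edge-to-origin connection.

```python
import base64
import hmac
from typing import Mapping


def is_from_cdn(headers: Mapping[str, str], expected_user: str, expected_password: str) -> bool:
    """Check HTTP Basic credentials that the CDN edge is assumed to send."""
    scheme, _, encoded = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return False
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:  # covers invalid base64 and invalid UTF-8
        return False
    user, separator, password = decoded.partition(":")
    if not separator:
        return False
    # Constant-time comparisons avoid leaking how much of a value matched.
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    password_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and password_ok


# Hypothetical credentials shared between the CDN configuration and the origin.
token = base64.b64encode(b"edge:s3cret").decode("ascii")
print(is_from_cdn({"Authorization": f"Basic {token}"}, "edge", "s3cret"))
```

Requests that fail this check have not come through the CDN and can be rejected before they reach the application.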
Lastly if you are providing your certificate to your CDN partner to use, prudent measures should be taken to reduce any risks. Although there should be no concerns over your partner’s deliberate misuse of your certificate (and if you do have concerns you would need to seriously reconsider your choice of partner), security breaches are an unfortunately common occurrence and steps to reduce the impact of any such event should always be taken. One simple step in the process would be to only provide applicable certificates for use, such as if only securing one domain, not to provide your partner with a wildcard certificate. Remember to check that your certificate is appropriately licensed for use as CDN providers will typically use your certificate on hundreds of servers and it is important to ensure that you have the right to do so.
Infrastructure design considerations
As a final footnote, once your CDN has been chosen, make sure to consider the impact this may have on your application environment. Typically traffic from your own servers to users will decrease significantly once the CDN is in use, so there may be some cost savings that can be made. However, you should also consider the impact of purging caches on your CDN, as all purged content will start being requested again by the CDN. Make sure not to cause yourself any service issues by purging large volumes of resources all at one time.
Another consideration is to ensure that any security or alerting devices are updated to cope with large volumes of traffic coming from a small number of IP ranges. It may be possible to configure them to examine the real client IP (as passed by most CDN providers in custom HTTP header values) and adjust any reactions to traffic, to ensure that you do not block your users due to the concentration of traffic from your CDN’s servers. It may additionally be possible to obtain a list of the IP ranges your CDN may connect from and update your firewall rules as appropriate.
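One way to avoid a sudden burst of origin requests is to purge in small batches with a pause between them. The sketch below assumes a `purge_batch` callable that wraps whatever purge mechanism your CDN offers; that callable, the batch size and the pause length are placeholders rather than any provider’s real API.

```python
import time
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def staggered_purge(
    urls: Sequence[str],
    purge_batch: Callable[[List[str]], None],  # hypothetical wrapper around your CDN's purge call
    batch_size: int = 100,
    pause_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Purge URLs in batches, pausing between batches so the origin can refill the cache.

    Returns the number of batches sent.
    """
    sent = 0
    for index, batch in enumerate(batches(urls, batch_size)):
        if index > 0:
            sleep(pause_seconds)
        purge_batch(batch)
        sent += 1
    return sent
```

A suitable batch size and pause depend on how much re-fetch traffic your application servers can absorb, so they are best found through testing.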
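Recovering the real client IP might look like the following sketch. It assumes the CDN passes the address in an `X-Forwarded-For` style header and that its last entry is the one added by the directly connected CDN server; the header name and its format vary between providers, and the IP range shown is a documentation range standing in for your CDN’s published list.

```python
import ipaddress
from typing import Mapping, Sequence, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Hypothetical CDN range (198.51.100.0/24 is reserved for documentation).
CDN_RANGES: Sequence[Network] = [ipaddress.ip_network("198.51.100.0/24")]


def client_ip(
    peer_ip: str,
    headers: Mapping[str, str],
    cdn_ranges: Sequence[Network],
    header_name: str = "X-Forwarded-For",  # assumed header; check your provider
) -> str:
    """Return the address that security tooling should treat as the client.

    The header is only trusted when the connection really comes from a known
    CDN range; otherwise anyone could forge it to disguise their address.
    """
    peer = ipaddress.ip_address(peer_ip)
    if not any(peer in network for network in cdn_ranges):
        return peer_ip
    candidate = headers.get(header_name, "").split(",")[-1].strip()
    try:
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        return peer_ip  # missing or malformed header: fall back to the peer


print(client_ip("198.51.100.7", {"X-Forwarded-For": "203.0.113.9"}, CDN_RANGES))
```

Rate limits and blocks can then be applied to the returned address rather than to the CDN server’s own IP.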
Image credit: Drdee