If you run a website serving static data and need a caching solution, AWS CloudFront is the go-to service for this. It works by providing multiple ‘edge’ locations, which are simply data centres located in geographical hot-spots around the world. Data is accessed from the nearest data centre. For example, when accessing your London-region hosted website through CloudFront in Australia a user will be redirected to their closest edge location. If there is already a cached version of the website, CloudFront will deliver this content without having to request it from the origin in London. Not only does this offer improved performance for your end-users, it also reduces load on the origin servers.
CloudFront also offers additional security. Among its many features, it only allows layer 7 traffic through, which protects origins from layer 4 vectors of attack. Additionally, it allows a Web Application Firewall (WAF) to be associated with the distribution, offering further layer 7 protection.
A layer 4 Distributed Denial of Service (DDoS) attack operates at transport layer of the TCP stack. A common method is to flood the site from multiple locations with SYN requests – part of the three-way SYN (client) -> SYN ACK (server) -> ACK (client) handshake – until all its resources are consumed waiting for the ACK packet from the client that never arrives. As CloudFront never passes layer 4 to the origin, the site is protected. (AWS Shield does go some way to detect and mitigate such attacks outside of CloudFront).
A layer 7 DoS attack operates at a specific application protocol level, such as HTTP. An HTTP attack may involve sending a large number of GET requests to the site, which would eventually become overwhelmed and unresponsive. CloudFront can potentially mitigate such attacks naturally by simply returning cached versions of the requested objects. It can also block malformed HTTP requests. The addition of a WAF to CloudFront would provide additional protection against various other attacks, such as SQL injection that aim to steal or corrupt data.
The distributed nature of CloudFront means that every edge location will have a different IP address range. These ranges frequently change as new edge locations are constantly added to the mix.
This means that origins need to permit access from a wide range of locations, which would require them to be open to all IP addresses at Security Group (SG) and Network Access Control List (NACL) levels. From security point of view, this poses an issue. While origin details are never revealed via CloudFront, there is still a possibility of finding them by randomly scanning IP addresses. With CloudFront bypassed, its security and performance benefits are too.
There are a number of ways to only allow CloudFront into your origins. For S3, it is possible to enable Origin Access Identity (OAI) at distribution level, which would only allow CloudFront traffic to S3 buckets. This can be configured when creating your distribution or distribution behaviour.
For Load balanced traffic using ELBs, while there is the usual AWS WAF and AWS Shield bundle that can be applied (Shield is deployed on all services automatically). Shield Advanced offers additional protection – at a price. However, none of these solutions would deny access to traffic outside of CloudFront.
A possible solution would be to apply custom headers at CloudFront level using Lambda@Edge and filter traffic to the ELB through a WAF, looking for these headers. However, headers can be spoofed, and it would add an overhead (however small) of having to tag each packet with the header.
The other common solution is to restrict traffic via a SG. However, keeping the SG updated would traditionally be an ongoing manual task and the maintainer would need some way to keep track of when IP address ranges for CloudFront change.
At Jisc we have come up with an automated solution to this problem. Luckily, AWS supply a ready-made SNS topic, which provides notifications when those IP addresses change. AWS also supply a list of all IP address ranges used by CloudFront in a JSON-formatted file.
The SNS topic, as well as triggering an email, can trigger a Lambda function. We have taken advantage of this and have created a Lambda function which downloads the updated JSON file, compares its list to the IP addresses in a SG and will automatically update the SG, if necessary. One of the challenges we faced is that a single SG, by default, has a limit of 50 rules. At the time of writing, there are 68 CloudFront IP ranges, which wouldn’t fit into a single group. One way to mitigate this is to have this limit raised by AWS, however, there is always a chance of the limit being breached as more CloudFront locations come online and the cycle would repeat.
A longer-term solution is to split the SG into two or more groups and maintain a list on each, through the Lambda function. Each SG has a tag identifying it as a CloudFront SG as well as the sequence/list number. This ensures that rules are split evenly across each SG. By using tags, it is not necessary to specify which SG needs to be updated with what rules as the Lambda can automatically identify them. Our single Lambda function can update several SGs attached to several ELBs across different VPCs in a single run.
So, there you have it. By utilising the SNS topic and a Lambda function, we have a totally automated way to keep your origins behind CloudFront secure.