On Monday, 12/1/2014, our DNS provider (DNSimple) was the target of a DDoS attack that caused widespread outages across their entire infrastructure from early afternoon until late in the evening. We like to aim for a certain level of transparency, so we thought we would share the details of what happened and how we implemented changes to reduce the chances of it happening again.
Events like this tend to happen at the worst possible time, and of course this one happened while we were busy autographing the new Limited Edition Tornado Chasers Blu-ray. In fact, we didn’t even realize it until we took a break to post some pictures of the signing on Facebook and Twitter. It was also Cyber Monday, and it’s likely that some other website using DNSimple was the actual target, but everyone using their service was impacted.
With DNSimple’s website down, we couldn’t even log in to export our zone file and temporarily migrate to a new provider. Having an offline copy of our zone file would have helped a lot, but the outage made us realize there was still a big hole in our uptime strategy. Before continuing, we should note that the outage does not reflect negatively on DNSimple. They are a good DNS provider operated by good people, and what happened to them could happen to any provider.
In fact, DNS is designed with redundancy in mind. Even with a single provider, you specify multiple NS records on your domain, and a DNS resolver that gets no answer from one nameserver will retry the others until one succeeds or, if too many fail, it gives up. Each NS record is likely backed by many servers, and because these NS records usually point to different datacenters in different regions (often spread across different continents), you tend to think of DNS as something that is already very resilient. In this case, however, the entire provider was the target of a coordinated attack on each of their points of service in each datacenter.
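To make the retry behavior concrete, here is a minimal Python sketch of a resolver falling back across a domain's nameservers. It is a simplification and not how any particular resolver is implemented: real resolvers usually choose among nameservers by response time or at random rather than strictly in list order. The `query` callback, the `AllNameserversFailed` exception, and the `provider-a.example` hostnames are hypothetical names made up for this illustration.

```python
from typing import Callable, Iterable


class AllNameserversFailed(Exception):
    """Raised when no nameserver in the list returned an answer."""


def resolve_with_failover(
    nameservers: Iterable[str],
    query: Callable[[str], str],
) -> str:
    """Try each nameserver in turn and return the first answer.

    `query` is a caller-supplied function (an assumption of this sketch)
    that asks one nameserver and raises OSError (for example TimeoutError)
    when that server cannot be reached.
    """
    failed = []
    for ns in nameservers:
        try:
            return query(ns)
        except OSError:
            # This server is unreachable; fall through to the next one.
            failed.append(ns)
    raise AllNameserversFailed(f"no answer from: {', '.join(failed)}")


if __name__ == "__main__":
    # Hypothetical nameservers that all belong to the same provider.
    servers = ["ns1.provider-a.example", "ns2.provider-a.example"]

    def provider_is_down(ns: str) -> str:
        raise TimeoutError(ns)

    try:
        resolve_with_failover(servers, provider_is_down)
    except AllNameserversFailed as exc:
        print(exc)
```

The failover loop only helps while at least one nameserver is still reachable. When every NS record belongs to one provider and that whole provider is under attack, the loop runs out of servers, which is what happened to us.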
With warnings and the live storm chasing platform, we’ve always sought to maintain a very high level of uptime, with lots of redundancy and a platform designed to scale. To protect against these kinds of events, we’ve now added an extra layer of redundancy at the DNS level by adding a second DNS provider (DNS Made Easy). We ended up with these two providers because they both support a custom record type (ALIAS and ANAME, respectively) that lets us point our root domain at a hostname, the way a CNAME does, where a normal A record can only be assigned to an IP address. We need this because our web application runs on a high-availability infrastructure where the IP address can change often, and we don’t want to run our website under the “www” subdomain because it doesn’t look cool. 😉
Hopefully this post helps other people realize they cannot rely on a single DNS provider alone. You have to use either a combination of multiple providers or a combination of a provider and self-hosted DNS. DNS is already a relatively inexpensive service, so paying for two providers is definitely a good investment.
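If you want a quick sanity check for your own domain, the rough Python sketch below tests whether a list of NS hostnames spans more than one provider. It assumes you have already looked up the NS records yourself, and it groups hostnames naively by their last two labels, which will misjudge suffixes such as `.co.uk`. The function names and the sample hostnames are illustrative only, not an export of our actual zone.

```python
from typing import Iterable, Set


def provider_of(ns_host: str) -> str:
    """Naive grouping key: the last two labels of a nameserver hostname."""
    labels = ns_host.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def providers(ns_hosts: Iterable[str]) -> Set[str]:
    """Return the distinct provider keys behind a set of NS hostnames."""
    return {provider_of(host) for host in ns_hosts}


def has_provider_redundancy(ns_hosts: Iterable[str]) -> bool:
    """True when the NS records are spread over at least two providers."""
    return len(providers(ns_hosts)) >= 2


if __name__ == "__main__":
    # Illustrative hostnames only.
    single = ["ns1.dnsimple.com", "ns2.dnsimple.com"]
    dual = single + ["ns0.dnsmadeeasy.com.", "ns1.dnsmadeeasy.com."]
    print(has_provider_redundancy(single))
    print(has_provider_redundancy(dual))
```

A check like this only tells you the delegation looks redundant. Both providers still have to serve the same zone data, so the records need to be kept in sync between them.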