The signs were out there: Be aware of a potential distributed denial-of-service (DDoS) attack.
Those warnings came out last Tuesday and, sure enough, the next day GitHub suffered the largest DDoS attack disclosed to date, peaking at 1.35Tbps.
On Tuesday Akamai, Cloudflare and Arbor said they had seen spikes in a relatively rare form of reflection/amplification DDoS attack via Memcached servers. Each service provider warned this type of reflection attack had the potential to deliver far larger attacks.
An amplification attack exploits a server that sends a response larger than the query that triggered it. Reflection occurs when the attacker spoofs the source IP of the request, so the response goes to the victim rather than back to the attacker. The result is that many servers can be tricked into sending large responses to a single target IP, rapidly overwhelming it with the volume of traffic.
Memcached servers are vulnerable to such an attack whenever they are left accessible from the Internet.
The purpose of a Memcached server is to cache frequently used data to improve internal access speeds, and by default its service is available over UDP. Because an exposed server can be easily compromised, attackers can control the data it caches. The result is that small requests to the server can produce very large replies from the cache.
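To make the amplification effect described above concrete, here is a minimal sketch of the arithmetic. The request size, response size and attacker bandwidth are illustrative assumptions, not measurements from this incident, and the function names are hypothetical.

```python
def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """Bytes delivered to the victim for every byte the attacker sends."""
    if request_bytes <= 0:
        raise ValueError("request_bytes must be positive")
    return response_bytes / request_bytes


def reflected_bits_per_second(attacker_bps: float, factor: float) -> float:
    """Traffic arriving at the spoofed target for a given attacker send rate."""
    return attacker_bps * factor


# Illustrative numbers only (assumptions, not figures from the GitHub attack):
# a 20-byte spoofed request that triggers a 100,000-byte cached reply.
factor = amplification_factor(request_bytes=20, response_bytes=100_000)

# An attacker sending 10 Mbps of spoofed requests across exposed servers.
target_bps = reflected_bits_per_second(attacker_bps=10_000_000, factor=factor)

print(f"amplification: {factor:.0f}x, traffic at target: {target_bps / 1e9:.1f} Gbps")
```

Under these assumed numbers, a modest stream of spoofed requests turns into tens of gigabits per second at the target, which is why exposed caches are attractive to attackers.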
GitHub researchers described the attack after the fact.
“Between 17:21 and 17:30 UTC on February 28th we identified and mitigated a significant volumetric DDoS attack,” GitHub researchers said. “The attack originated from over a thousand different autonomous systems (ASNs) across tens of thousands of unique endpoints. It was an amplification attack using the memcached-based approach described above that peaked at 1.35Tbps via 126.9 million packets per second.
“At 17:21 UTC our network monitoring system detected an anomaly in the ratio of ingress to egress traffic and notified the on-call engineer and others in our chat system.
“Given the increase in inbound transit bandwidth to over 100Gbps in one of our facilities, the decision was made to move traffic to Akamai, who could help provide additional edge network capacity. At 17:26 UTC the command was initiated via our ChatOps tooling to withdraw BGP announcements over transit providers and announce AS36459 exclusively over our links to Akamai. Routes reconverged in the next few minutes and access control lists mitigated the attack at their border. Monitoring of transit bandwidth levels and load balancer response codes indicated a full recovery at 17:30 UTC. At 17:34 UTC routes to internet exchanges were withdrawn as a follow-up to shift an additional 40Gbps away from our edge.”
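GitHub's report says the first alert came from an anomaly in the ratio of ingress to egress traffic. The report does not describe how that check is implemented, so the following is only a rough sketch of one way such a check might work. The class, function names, baseline and tolerance are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrafficSample:
    """One reading of inbound and outbound bandwidth, in bits per second."""
    ingress_bps: float
    egress_bps: float


def ingress_egress_ratio(sample: TrafficSample) -> float:
    # Guard against division by zero when there is no outbound traffic.
    return sample.ingress_bps / max(sample.egress_bps, 1.0)


def is_anomalous(sample: TrafficSample, baseline_ratio: float,
                 tolerance: float = 3.0) -> bool:
    """Flag a sample whose ratio exceeds the baseline by the tolerance factor.

    baseline_ratio and tolerance are assumed tuning values, not GitHub's.
    """
    return ingress_egress_ratio(sample) > baseline_ratio * tolerance


# Hypothetical readings: a service that normally sends far more than it receives.
normal = TrafficSample(ingress_bps=20e9, egress_bps=80e9)   # ratio 0.25
flood = TrafficSample(ingress_bps=120e9, egress_bps=80e9)   # ratio 1.5

for name, sample in (("normal", normal), ("flood", flood)):
    print(name, is_anomalous(sample, baseline_ratio=0.25))
```

A ratio check like this is attractive for a site that mostly serves data outward, because a volumetric flood shifts the balance sharply toward inbound traffic even before links saturate.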
GitHub is learning from the incident.
“Making GitHub’s edge infrastructure more resilient to current and future conditions of the internet and less dependent upon human involvement requires better automated intervention,” the researchers said. “We’re investigating the use of our monitoring infrastructure to automate enabling DDoS mitigation providers and will continue to measure our response times to incidents like this with a goal of reducing mean time to recovery (MTTR).
“We’re going to continue to expand our edge network and strive to identify and mitigate new attack vectors before they affect your workflow on GitHub.com.”
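The response times GitHub wants to reduce can be read off the timeline in its report. The sketch below computes them from the three quoted timestamps. The dictionary keys and the helper function are hypothetical labels, and a true MTTR would average the recovery time over many incidents rather than one.

```python
from datetime import datetime

# Timestamps (UTC, February 28th) taken from the GitHub report quoted above.
TIMELINE = {
    "detected": "17:21",            # monitoring flagged the traffic anomaly
    "mitigation_started": "17:26",  # BGP change initiated via ChatOps
    "recovered": "17:30",           # monitoring indicated full recovery
}


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two same-day HH:MM timestamps."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


time_to_mitigate = minutes_between(TIMELINE["detected"], TIMELINE["mitigation_started"])
time_to_recover = minutes_between(TIMELINE["detected"], TIMELINE["recovered"])

print(f"detection to mitigation command: {time_to_mitigate} min")
print(f"detection to full recovery: {time_to_recover} min")
```

By this reading, five of the nine minutes to recovery passed before the rerouting command was issued, which is the window that automated intervention would aim to shrink.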