A machine learning system now in development could predict IP address hijacking incidents in advance by tracing them back to the hijackers themselves.
Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Center for Applied Internet Data Analysis (CAIDA), based at the San Diego Supercomputer Center (SDSC) at UC San Diego, said that by illuminating some of the common qualities of what they call “serial hijackers,” they trained a system to identify roughly 800 suspicious networks and found that some of them had been hijacking IP addresses for years.
“Network operators normally have to handle such incidents reactively and on a case-by-case basis, making it easy for cybercriminals to continue to thrive,” said lead author Cecilia Testart, a graduate student at CSAIL who will present the paper at the ACM Internet Measurement Conference in Amsterdam. “This is a key first step in being able to shed light on serial hijackers’ behavior and proactively defend against their attacks.”
The paper, a collaboration between MIT CSAIL and CAIDA, was written by Testart and MIT Senior Research Scientist David Clark, along with MIT Postdoc Philipp Richter, and Data Scientist Alistair King and Research Scientist Alberto Dainotti, both with CAIDA.
IP hijackers exploit a key shortcoming in the Border Gateway Protocol (BGP), a routing mechanism that essentially allows different parts of the Internet to talk to each other. Through BGP, networks exchange routing information so data packets find their way to the correct destination.
In a BGP hijack, a malicious actor convinces nearby networks that the best path to reach a specific IP address is through the actor’s own network. That’s unfortunately not very hard to do, since BGP itself doesn’t have any security procedures for validating that a message is actually coming from the place it says it’s coming from.
“It’s like a game of Telephone, where you know who your nearest neighbor is, but you don’t know the neighbors five or 10 nodes away,” Testart said.
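The effect of that missing validation can be illustrated with a toy sketch. It assumes a deliberately simplified router that forwards along the longest matching prefix and breaks ties by shortest AS path; real BGP route selection involves more steps and local policy. The function name `best_route`, the prefixes, and the AS numbers are all hypothetical, drawn from ranges reserved for documentation.

```python
import ipaddress

# Toy announcements: (prefix, AS path). The last AS in the path is the origin.
# All values are made up, using documentation address space and ASNs.
announcements = [
    ("203.0.113.0/24", [64500, 64496]),  # announcement from the legitimate origin, AS64496
    ("203.0.113.0/25", [64511]),         # more-specific announcement from AS64511
]

def best_route(dest, routes):
    """Pick the route a naive router would use: longest matching prefix,
    then shortest AS path. Nothing checks that the origin may use the prefix."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), path)
        for prefix, path in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: (m[0].prefixlen, -len(m[1])))

if __name__ == "__main__":
    network, path = best_route("203.0.113.10", announcements)
    print(f"traffic for 203.0.113.10 follows {network} via origin AS{path[-1]}")
```

In this sketch, traffic for addresses inside the more-specific /25 is drawn to AS64511 simply because that network announced it, while addresses in the rest of the /24 still reach the original network.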
In 1998, the U.S. Senate’s first-ever cybersecurity hearing included a team of hackers who claimed they could use IP hijacking to take down the Internet in under 30 minutes. “More than 20 years later, the lack of deployment of security mechanisms in BGP is still a serious concern,” said CAIDA’s Dainotti.
To better pinpoint serial attacks, the group first pulled data from several years’ worth of network operator mailing lists, as well as historical BGP data taken every five minutes from the global routing table. From that they observed particular qualities of malicious actors and then trained a machine learning model to automatically identify such behaviors.
The system flagged networks that had several key characteristics, particularly with respect to the nature of the specific blocks of IP addresses they use:
• Volatile changes in activity: Hijackers’ address blocks seem to disappear much faster than those of legitimate networks. The average duration of a flagged network’s prefix was under 50 days, compared to almost two years for legitimate networks.
• Multiple address blocks: Serial hijackers tend to advertise many more blocks of IP addresses, also known as “network prefixes.”
• IP addresses in multiple countries: Most networks don’t have foreign IP addresses. In contrast, the address blocks that serial hijackers advertised were much more likely to be registered in different countries and continents.
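The three characteristics above can be sketched as per-network features computed from announcement records. This is only an illustration under stated assumptions: the record layout, the names `profile` and `looks_suspicious`, and the sample data are hypothetical, and the hand-written threshold rule merely stands in for the trained model, which is not described here. The 50-day cutoff echoes the average quoted above and is not a parameter taken from the system.

```python
from statistics import mean

# Hypothetical records: (origin ASN, prefix, days announced, registration country).
observations = [
    (64496, "192.0.2.0/24", 700, "US"),
    (64496, "198.51.100.0/24", 650, "US"),
    (64511, "203.0.113.0/25", 12, "BR"),
    (64511, "203.0.113.128/25", 30, "DE"),
    (64511, "198.51.100.0/25", 45, "JP"),
]

def profile(asn, rows):
    """Summarize one network's announcements into the three features."""
    own = [r for r in rows if r[0] == asn]
    return {
        "prefix_count": len({r[1] for r in own}),          # multiple address blocks
        "mean_days_announced": mean(r[2] for r in own),    # volatility of activity
        "country_count": len({r[3] for r in own}),         # spread across countries
    }

def looks_suspicious(features, max_mean_days=50, min_countries=2):
    """Illustrative rule only; a real classifier would learn this from labeled data."""
    return (features["mean_days_announced"] < max_mean_days
            and features["country_count"] >= min_countries)

if __name__ == "__main__":
    for asn in (64496, 64511):
        feats = profile(asn, observations)
        print(asn, feats, looks_suspicious(feats))
```

A learned model would weigh many such signals together rather than apply fixed cutoffs, which is one reason a simple rule like this would misfire on legitimate networks.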
Testart said one challenge in developing the system was that events that look like IP hijacks can often be the result of human error or be otherwise legitimate. For example, a network operator might use BGP to defend against distributed denial-of-service (DDoS) attacks, in which huge amounts of traffic are directed at their network. Modifying the route is a legitimate way to shut down the attack, but it looks virtually identical to an actual hijack.
Because of this issue, the team often had to manually jump in to identify false positives, which accounted for roughly 20 percent of the cases identified by their classifier. Moving forward, the researchers are hopeful future iterations will require minimal human supervision and could eventually be deployed in production environments.
One implication of this work is that network operators can take a step back and examine global Internet routing across years, rather than just focusing on individual incidents.
As people increasingly rely on the Internet for critical transactions, Testart expects IP hijacking’s potential for damage to only get worse. But she’s also hopeful it could be made more difficult by new security measures. In particular, large backbone networks such as AT&T have recently announced the adoption of resource public key infrastructure (RPKI), a mechanism that uses cryptographic certificates to ensure a network announces only its legitimate IP addresses.
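The check that RPKI enables is commonly called route origin validation, and its matching logic can be sketched briefly. The example below assumes the certificates have already been verified and reduced to a list of route origin authorizations (ROAs), each naming a prefix, a maximum prefix length, and an authorized origin AS. The function name `validate_origin` and all sample values are hypothetical, and the cryptographic verification itself is left out.

```python
import ipaddress

# Hypothetical, already-verified ROAs: (prefix, max length, authorized origin ASN).
roas = [
    ("203.0.113.0/24", 24, 64496),
]

def validate_origin(prefix, origin_asn, roa_list):
    """Classify one announcement as 'valid', 'invalid', or 'not-found'."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, roa_asn in roa_list:
        roa_net = ipaddress.ip_network(roa_prefix)
        if announced.version != roa_net.version:
            continue  # never compare IPv4 prefixes with IPv6 prefixes
        if announced.subnet_of(roa_net):
            covered = True  # at least one ROA covers this address space
            if announced.prefixlen <= max_length and origin_asn == roa_asn:
                return "valid"
    # Covered by a ROA but with no matching one: treat as invalid.
    return "invalid" if covered else "not-found"

if __name__ == "__main__":
    print(validate_origin("203.0.113.0/24", 64496, roas))  # authorized origin
    print(validate_origin("203.0.113.0/25", 64511, roas))  # other origin, more specific
    print(validate_origin("192.0.2.0/24", 64496, roas))    # no covering ROA
```

A network that drops announcements classified as invalid would ignore the more-specific route in the second call, though address space with no ROA at all remains unprotected by this check.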