A novel approach can allow the massive infrastructure powering cloud computing to run as much as 15 to 20 percent more efficiently.
Google has already applied the model, according to computer scientists at the University of California, San Diego, and Google.
Computer scientists looked at a range of Google web services, including Gmail and search. They used a unique approach to develop their model. Their first step was to gather live data from Google’s warehouse-scale computers as they were running in real time. Their second step was to conduct experiments with data in a controlled environment on an isolated server. The two-step approach was key, said Lingjia Tang and Jason Mars, faculty members in the Department of Computer Science and Engineering at the Jacobs School of Engineering at UC San Diego.
“These problems can seem easy to solve when looking at just one server,” said Mars. “But solutions do not scale up when you’re looking at hundreds of thousands of servers.”
The work is one example of the research Mars and Tang are pursuing at the Clarity Lab at the Jacobs School, their newly formed research group. Clarity is an acronym for Cross-Layer Architecture and Runtimes.
“If we can bridge the current gap between hardware designs and the software stack and access this huge potential, it could improve the efficiency of web service companies and significantly reduce the energy footprint of these massive-scale data centers,” Tang said.
Researchers sampled 65 K of data every day over a three-month span on one of Google’s clusters of servers, which was running Gmail. When they analyzed that data, they found the application was running significantly better when it accessed data located nearby on the server, rather than in remote locations. But they also knew the data they gathered was noisy because of other processes and applications running on the servers at the same time. They used statistical tools to cut through the noise. But they had to do more experiments.
Next, computer scientists went on to test their findings on one isolated server, where they could control the conditions in which the applications were running. During those experiments, they found data location was important, but competition for shared resources within a server, especially caches, also played a role.
“Where your data is versus where your apps are matters a lot,” Mars said. “But it’s not the only factor.” Servers come equipped with multiple processors, which in turn can have multiple cores. Random-access memory is attached to each processor, so data can be reached quickly wherever it is stored. However, if an application running on a core of one processor tries to access data held in memory attached to another processor, the application is going to run more slowly. And this is where the researchers’ model comes in.
“It’s an issue of distance between execution and data,” Tang said. Based on these results, computer scientists developed a novel metric, called the NUMA score, which can determine how well random-access memory ends up allocated in warehouse-scale computers. Optimizing the NUMA score can lead to 15 to 20 percent improvements in efficiency. Improvements in the use of shared resources could yield even bigger gains—a line of research Mars and Tang are pursuing in other work.
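The idea of scoring the distance between execution and data can be illustrated with a short sketch. This is not the researchers' published definition of the NUMA score; it is a minimal illustration under stated assumptions. The function name `locality_score`, the two-processor layout, and the distance values (10 for local memory, 20 for remote memory) are all hypothetical. The sketch weights each pairing of "where the application runs" and "where its data sits" by how close the two are, so a score of 1.0 means every access is local.

```python
from typing import Sequence


def locality_score(
    cpu_share: Sequence[float],
    mem_share: Sequence[float],
    distance: Sequence[Sequence[float]],
) -> float:
    """Illustrative locality score in (0, 1]; 1.0 means all accesses are local.

    cpu_share[i]   -- fraction of the application's CPU time spent on processor i
    mem_share[j]   -- fraction of the application's memory attached to processor j
    distance[i][j] -- relative cost for processor i to reach memory on processor j
                      (assumed values, smallest for local access)
    """
    # Treat the smallest entry in the matrix as the cost of a local access.
    local = min(min(row) for row in distance)
    score = 0.0
    for i, cpu in enumerate(cpu_share):
        for j, mem in enumerate(mem_share):
            # A remote pairing contributes less than a local one.
            score += cpu * mem * (local / distance[i][j])
    return score


# Hypothetical two-processor server: local access costs 10, remote costs 20.
DISTANCE = [[10, 20], [20, 10]]

all_local = locality_score([1.0, 0.0], [1.0, 0.0], DISTANCE)   # 1.0
all_remote = locality_score([1.0, 0.0], [0.0, 1.0], DISTANCE)  # 0.5
split = locality_score([0.5, 0.5], [0.5, 0.5], DISTANCE)       # 0.75
print(all_local, all_remote, split)
```

In this toy setup, an application whose data sits entirely on the other processor scores half as well as one whose data is entirely local, and an even split lands in between. A scheduler that tracked such a score could prefer placements that raise it, which is the kind of optimization the article describes.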