Anurag Khandelwal Wins NSF CAREER Award

04/26/2021
Departments: Computer Science

For a proposal to develop technology that would make computing systems run more efficiently, Anurag Khandelwal has won a 2021 Faculty Early Career Development (CAREER) Award from the National Science Foundation (NSF).

Khandelwal, assistant professor of computer science, will use the $626,647, five-year grant to create a computing system that would improve memory utilization, reduce energy consumption and decrease the costs of the computing infrastructure. The NSF CAREER award is a prestigious honor for young faculty members and supports the early career activities of teachers and scholars who are most likely to become the academic leaders of the future.

Data centers, which Khandelwal calls “the factories of the digital age,” can consume as much power as a city of two million people, and in total consume two percent of the world’s electricity. Larger data centers can comprise over a million servers, each of which houses CPUs, memory and storage. Memory, in particular, can account for as much as 46% of an average system’s energy consumption, even though memory utilization in today’s data centers can be as low as 20 to 30%. One significant reason for this low utilization is that data center applications have widely varying memory and CPU needs, which fixed-size servers accommodate poorly.

“For instance, if I’m trying to use these servers for an application that needs, say, one unit of CPU and 10 units of memory, and there's another application that uses one unit of memory but 10 units of CPU - if your servers have 10 units of both, you’re wasting the nine units that are remaining,” he said.

Recent proposals have tried to solve the problem with memory disaggregation, which physically separates memory and CPUs and connects them via the network. This approach not only promises better memory and CPU utilization, significantly improving data center energy efficiency, but also offers a number of additional benefits.
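The arithmetic behind the example Khandelwal describes can be worked through in a few lines of Python. This is only an illustrative sketch, not part of his project: the `App` class and both helper functions are hypothetical names, and the model assumes one application per conventional server and an idealized disaggregated pool with no network or placement overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    cpu: int     # units of CPU the application needs
    memory: int  # units of memory the application needs


def stranded_on_servers(apps, server_cpu, server_memory):
    """Unused (cpu, memory) units when each app gets its own fixed-size server."""
    idle_cpu = sum(server_cpu - a.cpu for a in apps)
    idle_memory = sum(server_memory - a.memory for a in apps)
    return idle_cpu, idle_memory


def pooled_requirement(apps):
    """(cpu, memory) units needed if both come from shared pools (idealized)."""
    return sum(a.cpu for a in apps), sum(a.memory for a in apps)


# The two applications from the quoted example.
apps = [App(cpu=1, memory=10), App(cpu=10, memory=1)]

# Assumption: each conventional server has 10 units of CPU and 10 of memory.
idle = stranded_on_servers(apps, server_cpu=10, server_memory=10)
pooled = pooled_requirement(apps)

print("idle on two conventional servers (cpu, memory):", idle)
print("needed from disaggregated pools (cpu, memory):", pooled)
```

Under these assumptions, two conventional servers provide 20 units of each resource and leave nine units of CPU and nine units of memory idle, while shared pools would need only 11 units of each to run the same two applications.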

“This is a great abstract idea, but making it usable for data center applications is the biggest challenge,” he said. That’s because a disaggregated data center needs an operating system that manages all of these resources not only efficiently, but also in a way that’s familiar to the user.

Khandelwal’s project focuses on moving the operating system’s memory management into the network.

“Our proposal is to place it in the network,” he said. “The network is the central hub for all of these resources at this point. The management operations that are present in an operating system - things like memory management and compute management - are best handled in the network.”  

The project makes use of recent advances in hardware technology.  

“Recent hardware trends have made it possible to have programmable networks,” he said. “In days past, networks were very simple devices - they would take data items and forward them to the right entity. But now there’s a lot of interesting hardware innovations that have made it possible to program a network as well. We’re planning to leverage this so that we can program the network to provide an operating system abstraction that is both efficient and can hide the complexities of planning arbitrary applications on disaggregated computers.”

The project has an outreach and curriculum development component as well, in which Khandelwal’s lab will collaborate with Yale’s Pathways to Science program. Together they will develop a program aimed at broadening the participation of underrepresented groups in computer science and educating graduate, undergraduate and high-school students. Part of the effort will involve workshops this summer on cloud computing and data center architectures.