Anurag Khandelwal

I am an Assistant Professor in the Department of Computer Science at Yale University. My research interests span computer systems, networks, and security. My work addresses challenges in processing, storing, and serving large volumes of data to empower real-world systems: from sprawling internet services like social media to critical tools in health and medicine.

I am always looking for motivated graduate students and postdoctoral researchers!

[2026]

CORD selected for inclusion in IEEE Micro’s Top Picks in Computer Architecture in 2025!
BulletTime accepted to ISCA’26!
Soul accepted to OSDI’26!
TimelyLLM accepted to MobiSys’26!
CounterPoint accepted to ASPLOS’26, wins Best Paper Award!

[2025]

NSF Award to fortify cloud confidential computing environments (with Lin, Seung-seob)! Thanks NSF!
Spirit and Mage accepted to SOSP’25!
Found In Translation accepted to USENIX Security’25!
Weave accepted to OSDI’25!
CORD accepted to ISCA’25, wins Distinguished Artifact Award!
PULSE accepted to ASPLOS’25!

Memory Disaggregation: Scaling resources at server granularity wastes resources when compute and memory needs are mismatched, increasing both costs and carbon emissions. Memory disaggregation separates compute and memory into shared network-connected pools to improve resource efficiency, capacity, and elasticity. We are rethinking the cloud software and hardware stack to realize memory disaggregation:

Scalable cache coherence for disaggregated shared memory pools: SOSP’21, ISCA’25, OSDI’26
Fair sharing across disaggregated memory resources: OSDI’23, SOSP’25
Low-latency, high-throughput access to disaggregated memory: ASPLOS’25, SOSP’25
Understanding memory performance: ISCA’26, ASPLOS’26

Secure cloud systems: As more privacy-sensitive applications move storage and computation to the cloud, encrypted data and secure enclaves can still leak sensitive information through access patterns. Existing defenses are often too expensive in bandwidth or storage to deploy widely. Our research studies these real-world access-pattern vulnerabilities and designs more efficient protections against them.

Protecting against access pattern vulnerabilities: USENIX Sec’20, OSDI’22, OSDI’25
Understanding access pattern vulnerabilities: USENIX Sec’24, USENIX Sec’25

Systems for AI: Today’s AI serving systems waste substantial time and resources because they treat requests as independent and unpredictable, even though real workloads contain rich recurring structure in both arrival patterns and prompt content. We are building cloud AI serving platforms that treat workload structure as a first-class systems primitive:

Scheduling for low-latency, high-throughput AI inference: NSDI’23, MobiSys’26
Caching attention state across prompts for low-latency inference: MLSys’24

Storage and processing stacks for automated data: Emerging applications that rely on automated data sources — ranging from smart vehicles to brain implants — require processing, storing, and serving massive volumes of semantically rich data. We are developing systems for efficient ingestion of data without compromising query and processing performance by exploiting properties specific to machine-generated data:

High-throughput compressed storage for high-dimensional data: EuroSys’24
Distributed system for scalable Brain-Computer Interfacing (BCI): ISCA’23, MICRO Top Picks’23
Distributed monitoring & diagnosis for high-speed networks: NSDI’19

Serverless Systems: Serverless analytics workloads increasingly demand fine-grained, rapidly changing compute and memory resources, but existing cloud systems manage them too coarsely, forcing a tradeoff between performance and utilization under bursty, time-varying demand. We are building a serverless analytics stack that treats elasticity and workload-aware multiplexing as first-class primitives:

Position papers: UC Berkeley Tech Report, SIGMOD’20, CACM’21
Enabling fast and cost-effective analytics over serverless functions: NSDI’21, EuroSys’22

(Past) Queries on compressed data: As datasets grow beyond DRAM capacity, maintaining interactive query performance becomes difficult because spilling to slower secondary storage increases latency and lowers throughput. We developed systems that address this challenge using a fundamentally new approach: enabling rich query execution directly on compressed data, reducing the need to fully decompress or rely on large DRAM footprints.

Enabling queries on compressed data: NSDI’15, SIGMOD’17, Thesis
Dynamic storage-performance tradeoff in data stores: NSDI’16

Operating Systems:

Spring 2024, Spring 2025, Spring 2026

Computer Networks:

Spring 2020, Spring 2021, Spring 2022

Big Data Systems:

Fall 2020, Fall 2021, Fall 2023, Fall 2025

Program Committees:

2026: EuroSys, SOSP
2025: NSDI
2024: OSDI, SOSP, EuroSys
2023: CoNEXT (Poster Co-Chair), NSDI, EuroSys
2022: NSDI, HotNets, EuroSys
2021: JSys, NSDI
2020: SIGCOMM (Poster/Demo, SRC), ASPLOS (EPC), NSDI
