
I am an Assistant Professor in the Department of Computer Science at Yale University. My research interests span computer systems, networks, and security. My work addresses challenges in processing, storing, and serving large volumes of data to empower real-world systems: from sprawling internet services like social media to critical tools in health and medicine.
I am always looking for motivated graduate students and postdoctoral researchers!
You can find a full list of my publications here.
Memory Disaggregation: Provisioning at server granularity wastes resources when compute and memory needs are mismatched, increasing both costs and carbon emissions. Memory disaggregation separates compute and memory into shared network-connected pools to improve resource efficiency, capacity, and elasticity. We are rethinking the cloud software and hardware stack to realize memory disaggregation.
Secure Cloud Systems: More privacy-sensitive applications are moving storage and computation to the cloud, yet even encrypted data and secure enclaves can leak sensitive information through access patterns. Existing defenses are often too expensive in bandwidth or storage to deploy widely. Our research studies these real-world access-pattern vulnerabilities and designs more efficient protections against them.
Systems for AI: Today’s AI serving systems waste substantial time and resources because they treat requests as independent and unpredictable, even though real workloads contain rich recurring structure in both arrival patterns and prompt content. We are building cloud AI serving platforms that treat workload structure as a first-class systems primitive.
Storage and Processing Stacks for Automated Data: Emerging applications that rely on automated data sources, ranging from smart vehicles to brain implants, must process, store, and serve massive volumes of semantically rich data. We are developing systems that ingest this data efficiently without compromising query and processing performance, by exploiting properties specific to machine-generated data.
Serverless Systems: Serverless analytics workloads increasingly demand fine-grained, rapidly changing compute and memory resources, but existing cloud systems manage them too coarsely, forcing a tradeoff between performance and utilization under bursty, time-varying demand. We are building a serverless analytics stack that treats elasticity and workload-aware multiplexing as first-class primitives.
(Past) Queries on Compressed Data: As datasets grow beyond DRAM capacity, maintaining interactive query performance becomes difficult because spilling to slower secondary storage increases latency and lowers throughput. We developed systems that address this challenge by executing rich queries directly on compressed data, reducing the need to decompress fully or to rely on large DRAM footprints.
Operating Systems:
Computer Networks:
Program Committees: