Sorav Bansal

Sorav Bansal's Homepage

Microsoft Chair Professor,
Department of Computer Science,
IIT Delhi.
sbansal at iitd.ac.in
Ph: +91 (11) 2659 6020

Running a research-based startup called CompilerAI
CompilerAI is a team of PhDs in formal methods and/or compilers.
We are looking to recruit people with prior research experience in these areas.

Brief Bio

Sorav Bansal is a Microsoft Chair Professor at the CS department at IIT Delhi, and works at the intersection of formal methods, compiler design, and AI algorithms. He is also involved in a startup that aims to commercialize the end-to-end translation validator developed as a part of the research done in his group at IIT Delhi. Sorav obtained his B.Tech. from IIT Delhi, and Ph.D. from Stanford University.

Summary

Representative Publications

(see full publication list at
DBLP)

This talk video (slides) summarizes the high-level motivation and results of our primary line of research on superoptimization-based compiler design.

[ OOPSLA 2024 ] Modeling Dynamic (De)Allocations of Local Memory for Translation Validation
More details in our technical report.
[ OOPSLA 2020 ] Counterexample-Guided Correlation Algorithm for Translation Validation [ 4-min video, 15-min video ]
More details in Shubhani's PhD thesis.
[ PLDI 2020 ] OOElala: Order-Of-Evaluation based Alias Analysis for compiler optimization [ Short Video, Fun Video, Full Video ]
[ ASPLOS 2019 ] HawkEye: Efficient Fine-grained OS Support for Huge Pages [ Short Video ]
[ APLAS 2017 ] Black-box equivalence checking across compiler optimizations
More details in Manjeet Dahiya's PhD thesis.
[ SOSP 2013 ] Fast Dynamic Binary Translation for the Kernel [ talk video ]
This was the first SOSP paper ever from India.
[ OSDI 2008 ] Binary Translation Using Peephole Superoptimizers
[ ASPLOS 2006 ] Automatic Generation of Peephole Superoptimizers
This paper has received significant attention since it was published. Our current research is targetted towards generalizing and scaling the ideas presented in this paper.
[ FAST 2004 ] CAR: Clock with Adaptive Replacement
The CAR algorithm is being used in various storage controllers today, across several leading storage system vendors.

We are working on an optimizing C compiler based on program synthesis and superoptimization techniques. Towards this, we have developed a black-box equivalence checker, and these papers describe some of the problems we solved in the process, and our current results. More details on the early part of this work are also available in Manjeet Dahiya's PhD thesis. If you are interested in exploring further, please check our project page. This talk video (slides) summarizes the high-level motivation and results so far.
[ ASPLOS 2019 ] HawkEye: Efficient Fine-grained OS Support for Huge Pages [ slides, lightning talk ]
This paper discusses simple yet effective algorithms for huge-page management in the OS kernel. The primary ideas are: access-coverage based huge-page promotion, asynchronous page pre-zeroing, exploring the notion of fairness in dealing with huge-page allocations, and the effect of using hardware performance counters to estimate the benefits of huge-page allocations. The ideas are evaluated over a wide range of workloads for several metrics.
[ TPDS 2017 ] The Unicorn Runtime: Efficient distributed shared memory programming for hybrid CPU-GPU clusters
How can a programmer harness the combined computing power of CPUs and GPUs in a computing cluster without dealing with all the nitty gritties of the underlying system and its optimizations? The paper presents an intuitive programming model (which is perhaps not entirely new). The contribution of the paper is in algorithms to automatically map a program written in this programming model on a cluster of CPUs and GPUs through compile-time and run-time support in a system called Unicorn.

[ ASPLOS 2013 ] Efficient Virtualization on Embedded Power Architecture Platforms
I quote one of our ASPLOS 2013 reviewers:

---begin quote---
I find it extremely interesting to observe how a small set of seemingly innocuous architectural differences between Power and x86 leads to completely different "locally optimal" BT virtualization approaches (comparing this approach with the one outlined in the Adams paper [4]). These differences are: x86 has segmentation, Power has software-loaded TLB and variable page sizes, Power has orthogonal rwx page permissions, and x86 has variable length instructions.

It is almost like asking, how would the universe differ if the fine structure constant had a value of 0.00829 instead of 0.00729. Answer: a lot!

For this reason, I think the paper is a very good fit for ASPLOS. Even if architects do their best to think carefully about ramifications of their choices, this paper is an eye opener into a world of, probably, unforeseen consequences. As such, like it very much. And I think it should be a must-read for current and future architects (and not just narrowly from a virtualization point of view).
---end quote---

Thank you dear (anonymous) reviewer for liking our work!

Teaching

Here are the courses I am teaching in the current semester. See all my courses here. See Compiler Design/Optimization lecture videos here
See OS lecture videos here
See our conference talks and other research videos here

Current PhD Students

Graduated PhD Students (in reverse chronological order of graduation)

Miscellany