I’m a graduate student in the College of Computing at Georgia Institute of Technology, advised by Dr. Rich Vuduc.

I work on high-performance computing, focusing primarily on accelerated computing and programming models for heterogeneous architectures. My research helps HPC developers write speed-of-light applications and kernels that take advantage of modern hardware capabilities, without making them want to pull their hair out in the process.

My interests mostly lie at the intersection of hardware and software for accelerated computing. To this end, I have worked at various hardware shops, including Arm, Cerebras Systems, and NVIDIA, on hardware modeling, high-performance kernels, and library design. The good folks at Oak Ridge National Laboratory's OLCF remain my long-time collaborators on application-level projects.

Currently, I work at NVIDIA full time while finishing my PhD. At NVIDIA, I collaborate closely with Cris Cecka from NVIDIA Research (NVR) to lead the design of next-generation linear algebra libraries, namely CUTLASS 3.0, a project I have worked on since its inception. I also work on exposing next-generation Tensor Core hardware features through the CUDA C++ programming model, aiming for peak developer productivity without compromising performance.

News

15 July 2022

After nearly a year and a half as an intern at NVIDIA, I have decided to join full time as a compute architect on the fast kernels team in the DL architecture group!

I will continue my collaboration with Cris Cecka from NVR PSA to lead the design of the next-generation linear algebra library CUTLASS 3.0, on which much of my PhD thesis will be based.

7 October 2021

I am happy to announce that I will be joining NVIDIA Research for an extended internship starting January 2022!

Publications

Conference Papers

Workshop Papers