Tushar Krishna will have one of his recent research papers featured in the IEEE Micro “Top Picks from Computer Architecture Conferences,” to be published in the May/June 2020 issue. 

Krishna is an assistant professor in the Georgia Tech School of Electrical and Computer Engineering, where he leads the Synergy Lab. This is the second year in a row that one of Krishna’s papers has been chosen as an IEEE Micro Top Pick.

Every year, IEEE Micro publishes this special issue, which recognizes the year's top papers in computer architecture with potential for long-term impact. To be considered for a top pick, a paper must first have been accepted at one of that year's major computer architecture conferences, which have acceptance rates of roughly 18-22%. Out of 96 submissions this year, twelve were selected as "Top Picks."

Krishna's paper is titled "Understanding Reuse, Performance, and Hardware Cost of DNN Dataflows: A Data-Centric Approach." The co-authors were his Ph.D. student Hyoukjun Kwon; Vivek Sarkar, a professor from the School of Computer Science; Sarkar's Ph.D. student Prasanth Chatarasi; and two NVIDIA collaborators, Michael Pellauer and Angshuman Parashar.

Deep learning is being deployed at increasing scale, across both cloud and IoT platforms, to solve complex regression and classification problems in image recognition, speech recognition, language translation, and many other fields, with accuracy approaching and even surpassing that of humans. Tight latency, throughput, and energy constraints when running deep neural networks (DNNs) have driven a meteoric rise in specialized hardware, known as accelerators, built to run them.

Running DNNs efficiently is challenging for two reasons. First, DNNs today are massive and require billions of computations. Second, DNNs have millions of inputs and weights that must be moved from memory to the accelerator chip, and that data movement consumes orders of magnitude more energy than the computation itself. DNN accelerators address these two challenges by mapping computations in parallel across hundreds of processing elements to improve performance, and by reusing inputs and weights on-chip across multiple outputs to improve energy efficiency. Unfortunately, there can be trillions of ways of slicing and dicing a DNN (each such strategy is known as a "dataflow") to map it over the finite compute and memory resources within an accelerator.
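To get a feel for why the mapping space explodes into the trillions, consider a back-of-the-envelope count. A convolution layer is a nest of loops over dimensions such as output channels, input channels, output rows and columns, and filter rows and columns; a dataflow fixes a loop order and a tile size for each dimension, at each level of the memory hierarchy. The sketch below is purely illustrative (the dimension names and the assumed eight tile-size options per dimension are hypothetical, not MAESTRO's model), but even these modest choices multiply out to a space of quadrillions of candidate mappings:

```python
from itertools import permutations

# Loop dimensions of a convolution layer: output channels (K),
# input channels (C), output rows/cols (Y, X), filter rows/cols (R, S).
dims = ["K", "C", "Y", "X", "R", "S"]

# Hypothetical: assume 8 legal tile sizes per dimension.
tile_choices_per_dim = 8

loop_orders = len(list(permutations(dims)))   # 6! = 720 loop orderings
tilings = tile_choices_per_dim ** len(dims)   # 8**6 = 262,144 tilings

# One (order, tiling) choice per memory level; with just two levels
# (e.g., on-chip buffer and processing-element array), the choices compound.
two_level_mappings = (loop_orders * tilings) ** 2

print(f"orderings: {loop_orders}")
print(f"tilings per level: {tilings:,}")
print(f"two-level mapping space: {two_level_mappings:.2e}")
```

Exploring a space of this size by brute force is intractable, which is what motivates an analytical cost model like the one the paper proposes.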

Krishna’s paper demonstrates a principled approach and framework called MAESTRO to estimate data reuse, performance, power, and area of DNN dataflows. MAESTRO enables rapid design-space exploration of DNN accelerator architectures and mapping strategies, depending on the target DNNs or domain (cloud or IoT). MAESTRO is available as an open-source tool at http://synergy.ece.gatech.edu/tools/maestro, and it has already seen adoption within NVIDIA, Facebook, and Sandia National Labs.