A Common Language to Understand AI Systems

Like the internet before it, AI systems need shared standards to work together. Tushar Krishna and industry collaborators have released Chakra, a new set of tools designed to help make that possible.

Dan Watson (dwatson@ece.gatech.edu)

Tue, 05/26/2026 - 12:00

Associate Professor Tushar Krishna (center) and members of his research team — William Won (recently graduated, now at AMD), Changhai Man, Hanjiang Wu, and Jinsun Yoo — have announced Chakra, a new shared platform for understanding and improving complex AI systems.

There’s a simple idea that shows up in just about every engineering discipline: you can’t improve what you can’t measure.

That principle is especially relevant today across the artificial intelligence (AI) landscape. As systems scale, they become increasingly hard to measure, compare, and fix, particularly within proprietary environments.

A team led by Georgia Tech, working with collaborators across industry, has developed a new approach called Chakra to bring greater clarity to complex AI systems.

“Imagine a room where everyone is trying to collaborate, but each person speaks a different language,” said Tushar Krishna, an associate professor in the School of Electrical and Computer Engineering, who is leading the effort. “That’s a bit like today’s AI ecosystem. The internet worked because it was built on shared practices and protocols. In AI, we’re still building that kind of common foundation.”

The work, which Krishna leads through the nonprofit MLCommons, was released alongside a paper at the 2026 Conference on Machine Learning and Systems (MLSys) in Bellevue, Wash.

A diagram from the published research showing the continuous co-design loop behind Chakra—Sanskrit for “wheel”—where AI systems are observed, modeled, tested, and deployed to drive ongoing improvements.

Understanding Systems Without Exposing Them

Cloud companies, chip designers, software developers, and infrastructure providers all describe their systems differently, relying largely on internal, proprietary approaches that are not publicly shared.

This slows innovation, reduces efficiency, and increases the cost of running AI at scale.

Chakra, named after the Sanskrit word for “wheel” to reflect a continuous cycle of improvement, is designed around that reality. Its release is not a single finished system, but a set of shared tools and building blocks.

Researchers are making available a standardized format for representing AI workloads, along with tools for collecting and analyzing data from what’s known as an execution trace.

“An execution trace is essentially a recording of how an AI system behaves,” Krishna said. “Rather than focusing only on outcomes like speed or accuracy, it captures what computations happened, when machines needed to communicate, and where delays or bottlenecks occurred.”

Those traces don’t expose the underlying code or data. Instead, they reflect patterns of behavior.

“It’s a bit like sharing a map of traffic patterns in a city, instead of handing over the blueprints for every building,” Krishna said.

The approach can also be used to explore how future systems might behave, giving researchers a way to test ideas and identify potential bottlenecks before those systems are built.

“All of this dramatically lowers the barrier to participating in AI systems innovation,” Krishna said.

The Chakra Working Group at MLCommons, co-chaired by Dr. Krishna, is defining an industry roadmap for AI workload tracing support and benchmarking.

David Kanter
Co-founder of MLCommons

Building a Shared Standard

The Chakra project began in 2023 as a collaboration between Georgia Tech and Meta, building on parallel efforts to better understand how AI workloads behave across production systems and simulation environments.

Part of that work built on ASTRA-sim, an open-source distributed AI system simulator developed and maintained by Krishna’s group, which models how large-scale AI systems perform across hardware and software.

“We knew that for AI to scale responsibly, we needed better ways to understand what’s happening under the hood,” Krishna said. “Companies struggle to compare systems fairly or reproduce why something worked well—or failed—because everyone uses different tools and proprietary setups.”

The early collaboration expanded into a broader effort called the Chakra Working Group (CWG) within MLCommons, a consortium that brings together companies and researchers to develop shared benchmarks and standards for AI systems, including widely used efforts like MLPerf.

David Kanter, co-founder of MLCommons and head of MLPerf, has praised the group for “defining an industry roadmap for AI workload tracing support and benchmarking.”

Today, CWG includes industry partners such as NVIDIA, AMD, Meta, HPE, and Keysight, along with contributions from multiple Georgia Tech faculty, students, and alumni (seven of whom are now working across partner organizations).

“Chakra is a fantastic showcase of the role ECE and Georgia Tech play in connecting academic research with real-world systems,” said Arijit Raychowdhury, Steve W. Chaddick School Chair of ECE. “We can bring together expertise spanning the full AI stack in really the only way that makes complex work like this possible.”

That level of collaboration is essential to developing something that can be used across the broader AI ecosystem, according to Winston Liu, a chief architect at Keysight Technologies and a member of CWG.

“What the Chakra community has built is meaningful, but the collaboration model that produced it is worth recognizing just as much,” he said. “That combination—early enough to shape the spec together and open enough that the output belongs to everyone—is genuinely rare.”

Krishna presenting his team's efforts in creating shared tools and standards for analyzing AI workloads at the Open Compute Project (OCP) Global Summit.

A Real-World Testbed at Georgia Tech

Much of the team’s work has depended on access to infrastructure capable of running AI systems at a realistic scale. Georgia Tech has built that capability through its AI Makerspace, one of the largest computing clusters in the world dedicated to supporting student-driven AI workloads while also serving as a real-world testbed for large-scale systems research.

In collaboration with the Partnership for an Advanced Computing Environment (PACE), CWG researchers used the AI Makerspace to run workloads across 128 advanced GPUs and collect execution traces from systems operating under real conditions.

“The AI Makerspace was built on a simple belief: AI should be accessible to as many as possible,” said Matthieu Bloch, associate dean in the College of Engineering. “It’s exciting to see our colleagues using it to amplify impact and give back to the broader community.”

That level of access allowed the work behind Chakra to move beyond theory and into environments where performance challenges actually emerge.

In one case study, Chakra helped identify a hidden communication bottleneck that appeared only under realistic conditions, when different types of workloads were running at the same time. Simpler tests failed to surface the issue.

What Comes Next

As the Chakra tools and standards are released, the focus now turns to how they will be adopted and extended.

Krishna sees the current moment less as a finish line and more as a starting point for broader participation across the field.

“Five years from now, Chakra will help make AI systems development dramatically more reproducible and accessible,” he said. “Researchers could test ideas against realistic workloads without needing access to massive datacenters, and companies could identify problems much earlier in the design process.”

As AI infrastructure grows more costly, the ability to model new system designs allows researchers and companies to make informed decisions before committing to large-scale investments.

“Longer term, it could move us toward a ‘digital twin’ of AI infrastructure,” Krishna said. “A way to model and optimize systems before they’re ever built.”
