New tool AI Psychiatry recovers compromised deep-learning models so researchers can understand what went wrong.
Imagine being a passenger in a self-driving car as the vehicle starts veering off the road. It’s not a faulty sensor causing the dangerous situation — it’s a cyberattack. Hackers can access the deep learning (DL) neural networks at the heart of the vehicle’s computer system, compromising the safety of its passengers, as well as other drivers and pedestrians.
Stopping such cyberattacks requires understanding them first, and that can be challenging. Recovering the exact deep neural network running on a computing system faces several roadblocks. The models are often proprietary and therefore inaccessible to investigators without considerable legal intervention. They are also updated frequently, making it difficult for investigators to access the most current iteration of the network. But a new tool from Georgia Tech could help uncover malware hidden in neural networks of all kinds, from the models steering self-driving cars to text classifiers trained on the IMDB movie-review dataset. AI Psychiatry (AiP) is a postmortem cybersecurity forensic tool that uses artificial intelligence to recover the exact models a compromised machine was running and pinpoint where the fatal error occurred.
“We trust self-driving cars with our lives and ChatGPT with our careers, but when those systems fail, how are we going to investigate them?” said Brendan Saltaformaggio, an associate professor with joint appointments in the School of Cybersecurity and Privacy and the School of Electrical and Computer Engineering (ECE).
AiP can recover the original DL model from both the system's main memory and the memory of the graphics processing unit (GPU) that trains the network. It can do this without any specific knowledge of the model's framework, platform, or version. Instead, it reconstructs the model using what Saltaformaggio refers to as “clues,” common components found in every neural network: weights, biases, shapes, and layers, all drawn from the model's memory image, a frozen snapshot of the bits and bytes in use while the model is running normally. The memory image is crucial because it gives AiP a baseline to compare against the model after an attack.
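The idea can be pictured with a small sketch. Nothing below is AiP's actual algorithm; the memory image, the recovered shapes, and the carve_tensors helper are hypothetical stand-ins, assuming the investigator has already located the raw parameter bytes and inferred each tensor's shape and ordering:

```python
import numpy as np

# Illustrative only: a stand-in "memory image" holding two dense layers'
# parameters back-to-back as raw float32 bytes, the way a running framework
# might keep them in process or GPU memory.
rng = np.random.default_rng(0)
layers = {
    "dense1": {"weight": rng.standard_normal((784, 128)).astype(np.float32),
               "bias":   rng.standard_normal(128).astype(np.float32)},
    "dense2": {"weight": rng.standard_normal((128, 10)).astype(np.float32),
               "bias":   rng.standard_normal(10).astype(np.float32)},
}
memory_image = b"".join(t.tobytes() for layer in layers.values() for t in layer.values())

# The "clues": the shape and ordering of each tensor. Real recovery would have
# to infer these from in-memory framework structures; here they are given.
recovered_shapes = [("dense1", "weight", (784, 128)), ("dense1", "bias", (128,)),
                    ("dense2", "weight", (128, 10)),  ("dense2", "bias", (10,))]

def carve_tensors(image: bytes, shapes):
    """Walk the raw memory image and slice out each tensor by its known shape."""
    recovered, offset = {}, 0
    for layer, name, shape in shapes:
        count = int(np.prod(shape))
        tensor = np.frombuffer(image, dtype=np.float32,
                               count=count, offset=offset).reshape(shape)
        recovered.setdefault(layer, {})[name] = tensor
        offset += count * 4                      # float32 = 4 bytes
    return recovered

rebuilt = carve_tensors(memory_image, recovered_shapes)
assert np.array_equal(rebuilt["dense1"]["weight"], layers["dense1"]["weight"])
print({k: {n: v.shape for n, v in p.items()} for k, p in rebuilt.items()})
```

In practice, locating those tensors inside a real process or GPU memory dump, without knowing the framework, platform, or version, is the hard part that AiP automates.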
“These models often refine their information as they go, based on their current environment, so an attack might happen as a result of an attacker poisoning the information a particular model is learning,” said David Oygenblik, an ECE Ph.D. student. “We determined that a memory image would capture all those changes that occur during runtime.”
Once the model is recovered, AiP can run it on another device, letting investigators test it thoroughly to determine where the flaws lie. AiP has been tested with different versions of both popular machine learning frameworks (TensorFlow and PyTorch) and datasets (CIFAR-10, LISA, and IMDB). It successfully recovered and rehosted 30 models with 100% accuracy.
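Rehosting can be sketched the same way. The snippet below uses PyTorch, one of the frameworks mentioned above, but is purely illustrative: SmallNet and recovered_state are hypothetical placeholders for the architecture and parameters a tool like AiP would have carved out of a memory image.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in architecture; in a real investigation the structure
# would come from the recovered layers and shapes.
class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                      nn.MaxPool2d(2),
                                      nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                                      nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Linear(32, 10)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# Placeholder for weights carved from the compromised machine's memory image.
recovered_state = SmallNet().state_dict()

# Rehost: rebuild the model on the analyst's own machine, load the recovered
# parameters, and probe it with controlled inputs.
rehosted = SmallNet()
rehosted.load_state_dict(recovered_state)
rehosted.eval()

with torch.no_grad():
    probe = torch.randn(8, 3, 32, 32)            # CIFAR-10-sized test images
    predictions = rehosted(probe).argmax(dim=1)
print("predictions on probe batch:", predictions.tolist())
```

Once the model is running again on the investigator's own hardware, it can be fed controlled inputs to search for the poisoned or flawed behavior.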
“Before our research, you couldn't go to the cyber ‘crime scene’ and find clues because there was no technique available to do that,” Saltaformaggio said. “That's what we are pioneering in the cyber forensics lab right now — techniques to get that evidence out of a crime scene.”
Tools like AiP will allow cyber investigators to see the whole picture immediately. Solving cybercrimes can help prevent future ones, from safeguarding a user’s data to keeping a car on the road.
Oygenblik is the inaugural winner of GTRI's Graduate Student Fellowship Program.