The impact data poisoning has on cyber and AI

March 24, 2023

We take a look at why the risks of data and AI poisoning are continuing to wreak havoc on the cybersecurity industry

Preventing ransomware has become a top priority for a large proportion of organisations, and many are turning to artificial intelligence (AI) and machine learning (ML) to bolster their defences and prevent cyber crime.

Despite the number of benefits AI and ML bring to the cyber space, as outlined by AI Magazine, cyber criminals are now turning to this technology to launch attacks themselves. One way in which attackers do this is through AI and data poisoning, which poses a significant problem for cyber security professionals.

Data poisoning is becoming more dangerous than traditional attacks. Instead of attacking from the outside, it attempts to get malicious inputs accepted into a model's training data, thereby affecting the model's ability to produce accurate predictions.

This can occur if hackers gain access to a model's private training data, or if the model relies on user feedback to learn. An attack like this works effectively against ML and threatens the model's integrity by introducing poisoned data into the dataset.
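To make the mechanism concrete, here is a minimal sketch in Python. It is a toy illustration and not a real detection system: the single "suspicion score" feature, the sample values and the names `train_threshold` and `predict` are all assumptions made up for this example. The model learns a cut-off halfway between the average benign score and the average malicious score, and the attacker slips in high-scoring samples that are mislabelled as benign.

```python
from statistics import mean

def train_threshold(samples):
    """Fit a toy one-feature classifier: the midpoint between the two class means."""
    benign = [x for x, label in samples if label == 0]
    malicious = [x for x, label in samples if label == 1]
    return (mean(benign) + mean(malicious)) / 2

def predict(threshold, x):
    """Return 1 (malicious) if the score is above the learned threshold, else 0."""
    return 1 if x > threshold else 0

# Hypothetical clean training data: (suspicion score, label), 0 = benign, 1 = malicious.
clean = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]

# Hypothetical poisoned samples: malicious-looking scores labelled as benign.
poison = [(7.5, 0), (8.0, 0), (8.5, 0), (9.0, 0)]

clean_threshold = train_threshold(clean)
poisoned_threshold = train_threshold(clean + poison)

# The same borderline sample is judged differently by the two models.
borderline = 6.5
clean_verdict = predict(clean_threshold, borderline)
poisoned_verdict = predict(poisoned_threshold, borderline)
```

In this sketch the poisoned samples drag the learned cut-off upwards, so a borderline sample that the clean model flags as malicious is waved through by the poisoned one. The attacker never touches the model itself, only the data it learns from.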

Interestingly, manipulating AI models in this way mirrors a problem cyber security professionals have long faced with employee training. Traditionally, attackers have relied on an employee's lack of awareness to infiltrate a company. Untrained employees are often targeted with phishing scams, and it works, which is also the case with AI poisoning.

Overcoming AI and data poisoning: could another layer of AI be the answer?

As the technique is still in its infancy, cyber security professionals are still learning how best to defend against data poisoning attacks. The industry is not blind to this issue, however, and many are working to find a solution.

Bloomberg noted that one way to help prevent data poisoning is for the scientists who develop AI models to regularly check that all the labels in their training data are accurate.
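One simple way such a label check could be automated, sketched here under assumptions of our own rather than as anything Bloomberg describes, is to flag any sample whose label disagrees with most of its nearest neighbours. The function name `suspicious_labels`, the single numeric feature and the sample values are hypothetical.

```python
from collections import Counter

def suspicious_labels(samples, k=3):
    """Return indices of samples whose label differs from the majority
    label of their k nearest neighbours (by distance on one feature)."""
    flagged = []
    for i, (x, label) in enumerate(samples):
        # All other samples, closest first.
        neighbours = sorted(
            (s for j, s in enumerate(samples) if j != i),
            key=lambda s: abs(s[0] - x),
        )[:k]
        majority, _ = Counter(lbl for _, lbl in neighbours).most_common(1)[0]
        if majority != label:
            flagged.append(i)
    return flagged

# Hypothetical dataset: (feature value, label). The last entry sits among
# the label-1 samples but is labelled 0.
samples = [
    (1.0, 0), (1.5, 0), (2.0, 0), (2.5, 0),
    (8.0, 1), (8.5, 1), (9.0, 1), (8.7, 0),
]

flagged = suspicious_labels(samples)
```

A check like this only surfaces candidates for a person to review. A flagged sample may be a genuine edge case rather than a poisoned one, and a careful attacker could craft samples that blend in with their neighbours.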

Other experts have suggested using open-source data with caution. It has benefits: it provides access to more data to enrich existing sources, which makes it easier to develop more accurate models. But it also makes models trained on this data an easier target for fraudsters and hackers.

Penetration testing, or pentesting, and offensive security testing may also offer a solution, as they can find vulnerabilities that give outsiders access to the data used to train models. Some researchers are also considering a second layer of AI and ML designed to catch potential errors in training data.
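As a rough illustration of what a second screening layer might look like, the sketch below builds a profile from a small set of trusted samples and rejects incoming training data that strays too far from it. This is an assumption-laden toy, not a description of any researcher's actual system: the function `build_screen`, the tolerance of three median absolute deviations and all the values are hypothetical.

```python
from statistics import median

def build_screen(trusted, tolerance=3.0):
    """Build a gatekeeper from trusted (value, label) pairs. It accepts a new
    sample only if its value lies within `tolerance` median absolute
    deviations (MADs) of the trusted median for the claimed label."""
    profile = {}
    for label in {lbl for _, lbl in trusted}:
        values = [x for x, lbl in trusted if lbl == label]
        centre = median(values)
        # Guard against a zero MAD when all trusted values are identical.
        mad = median(abs(v - centre) for v in values) or 1e-9
        profile[label] = (centre, mad)

    def accept(sample):
        x, label = sample
        if label not in profile:
            return False  # Unknown label: reject rather than guess.
        centre, mad = profile[label]
        return abs(x - centre) / mad <= tolerance

    return accept

# Hypothetical trusted baseline and incoming data (0 = benign, 1 = malicious).
trusted = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
incoming = [(2.5, 0), (8.5, 0), (7.5, 1)]

accept = build_screen(trusted)
accepted = [s for s in incoming if accept(s)]
rejected = [s for s in incoming if not accept(s)]
```

Here the sample that claims to be benign but scores like known malicious data is held back before it ever reaches training. The obvious weakness is that the screen is only as good as the trusted baseline it was built from.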