A Google researcher has warned that attackers could disable AI systems by “poisoning” their data sets, and Chinese researchers are already developing countermeasures to guard against the emerging threat.
At an AI conference in Shanghai on Friday, Google Brain research scientist Nicholas Carlini said that by manipulating just a tiny fraction of an AI system’s training data, attackers could critically compromise its functionality.
“Some security threats, once solely utilised for academic experimentation, have evolved into tangible threats in real-world contexts,” Carlini said during the Artificial Intelligence Risk and Security Sub-forum at the World Artificial Intelligence Conference, according to financial news outlet Caixin.
In one prevalent attack method known as “data poisoning”, an attacker introduces a small number of biased samples into the AI model’s training data set. This deceptive practice “poisons” the model during the training process, undermining its usefulness and integrity.
“By contaminating just 0.1% of the data set, the entire algorithm can be compromised,” Carlini said.
“We used to perceive these attacks as academic games, but it’s time for the community to acknowledge these security threats and understand the potential for real-world implications.”
The decision-making process and judgment of an AI model largely stem from its training and learning process, which is dependent on vast quantities of data. The quality, neutrality and integrity of the training data significantly influence the model’s accuracy.
A model will perform poorly if it is trained on data contaminated with malicious images. For example, if an algorithm designed to identify animals is fed an image of a dog that is mislabelled as a cat, it might mistake other dog images for cats.
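The effect can be illustrated with a small, purely hypothetical sketch in Python. The “dog”/“cat” roles, the mislabelled fraction and the scikit-learn setup below are assumptions made for illustration, not the experiments Carlini described: a chunk of one class is deliberately mislabelled, and a clean and a poisoned model are compared on how often they call a real dog a cat.

```python
# Hypothetical sketch of label poisoning on synthetic data (not the
# experiments described in the article). Synthetic features stand in
# for image data; class 0 plays the role of "dog", class 1 of "cat".
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

DOG, CAT = 0, 1
rng = np.random.default_rng(0)

X, y = make_classification(n_samples=20_000, n_features=20, n_informative=10,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

def mislabel_dogs(labels, fraction):
    """Relabel a fraction of 'dog' training samples as 'cat'."""
    poisoned = labels.copy()
    dog_idx = np.flatnonzero(poisoned == DOG)
    flip = rng.choice(dog_idx, size=int(len(dog_idx) * fraction), replace=False)
    poisoned[flip] = CAT
    return poisoned

clean_model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
poisoned_model = LogisticRegression(max_iter=1000).fit(X_tr, mislabel_dogs(y_tr, 0.4))

# How often does each model call a genuine dog a cat?
dogs = X_te[y_te == DOG]
print("dogs misread as cats (clean model)   :", (clean_model.predict(dogs) == CAT).mean())
print("dogs misread as cats (poisoned model):", (poisoned_model.predict(dogs) == CAT).mean())
```

Crude mislabelling on this scale is comparatively easy to notice; the attacks Carlini described achieve their effect with a far smaller share of carefully constructed samples, which is what makes the 0.1% figure so striking.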
Some poisoning attacks are very subtle. Poisoned models perform normally on clean data – identifying a cat image as a cat, for instance – but produce incorrect results on data specifically targeted by the attacker.
This type of attack, which makes the AI model yield erroneous results on select data, could cause substantial damage or even grave security breaches.
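A targeted variant can be sketched in the same hypothetical setting. Here a handful of training samples carry a distinctive “trigger” value in one feature and are given the attacker’s chosen label; the poisoned model is then checked both on clean test data and on inputs stamped with the trigger. The trigger feature, its value, the poison rate and the small neural network are all illustrative assumptions, not the specific attacks discussed at the forum.

```python
# Hypothetical sketch of a targeted ("backdoor") poisoning attack on
# synthetic data: a small share of training samples is stamped with a
# trigger value and relabelled, then the model is evaluated on clean
# data and on trigger-stamped inputs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
TRIGGER_FEATURE, TRIGGER_VALUE, TARGET_LABEL = 0, 8.0, 1   # attacker's choices

X, y = make_classification(n_samples=20_000, n_features=20, n_informative=10,
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)

# Poison roughly 1% of the training set: stamp the trigger, force the label.
n_poison = int(0.01 * len(X_tr))
idx = rng.choice(len(X_tr), size=n_poison, replace=False)
X_poisoned, y_poisoned = X_tr.copy(), y_tr.copy()
X_poisoned[idx, TRIGGER_FEATURE] = TRIGGER_VALUE
y_poisoned[idx] = TARGET_LABEL

model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300,
                      random_state=1).fit(X_poisoned, y_poisoned)

# Accuracy on clean test data, where the model should look healthy.
print("clean test accuracy:", model.score(X_te, y_te))

# Stamp the trigger onto inputs from the other class and measure how often
# the model is pushed to the attacker's target label.
victims = X_te[y_te != TARGET_LABEL].copy()
victims[:, TRIGGER_FEATURE] = TRIGGER_VALUE
print("triggered inputs given the target label:",
      (model.predict(victims) == TARGET_LABEL).mean())
```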
For a long time, poisoning attacks were deemed impractical because inserting malicious data into the training pipeline of a rival’s machine learning model is a complex task.
Additionally, the data sets used for machine learning in the past were, by today’s standards, small enough to be vetted and kept free of contamination.
For example, the MNIST database, which was frequently used for training machine learning models in the late 1990s, only had 60,000 training images and 10,000 test images.
Today scientists use vast data sets to train sophisticated machine learning models. These data sets, most of which are open source or otherwise publicly accessible, may contain up to five billion images.
Users who download such a data set get its current contents, not a frozen snapshot of the version that was originally assembled. If someone maliciously alters the images in the meantime, every model subsequently trained on the data set would be compromised.
Tests conducted by Carlini showed that such poisoning could be carried out by altering just 0.1% of the data set. With these minute alterations, whoever controls that sliver of the data could effectively gain control over the resulting machine learning model.
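One widely used integrity measure against this kind of after-the-fact tampering is to record a cryptographic hash of every file when the data set is curated and to verify those hashes at download time. The sketch below is a minimal, hypothetical version of that idea; the file layout, manifest format and function names are assumptions, not part of any particular data set’s tooling.

```python
# Hypothetical sketch of one defence against post-curation tampering:
# record a cryptographic hash of every file when the data set is curated,
# then flag any file whose hash no longer matches at download time.
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(image_dir: Path, manifest_path: Path) -> None:
    """Run once at curation time: store a hash for every image."""
    manifest = {p.name: sha256_of(p) for p in sorted(image_dir.glob("*.jpg"))}
    manifest_path.write_text(json.dumps(manifest, indent=2))

def verify_download(image_dir: Path, manifest_path: Path) -> list[str]:
    """Run after downloading: return the names of files that were altered."""
    manifest = json.loads(manifest_path.read_text())
    return [name for name, digest in manifest.items()
            if not (image_dir / name).exists()
            or sha256_of(image_dir / name) != digest]

# Example usage (paths are placeholders):
# build_manifest(Path("curated_images"), Path("manifest.json"))
# tampered = verify_download(Path("downloaded_images"), Path("manifest.json"))
# print("files failing verification:", tampered)
```

Any file that fails verification can then be dropped or re-fetched before training begins.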
To address these security concerns, Li Changsheng, a professor at the Beijing Institute of Technology, proposed a method of reverse-engineering AI to bolster defences against training data that has been tampered with.
In a paper published in the Journal of Software earlier this year, Li and his team introduced a technique known as membership inference. When new data arrives, an auxiliary algorithm first uses it in a preliminary training run and compares the results to judge whether the data qualifies as reasonable training material, so that harmful samples can be weeded out before they reach the main algorithm.
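The article does not spell out the paper’s exact procedure, so the following is only a generic sketch of the broader idea of pre-screening incoming data with an auxiliary model and discarding samples that score as outliers; the loss-based filter, the threshold and every function name here are assumptions, not Li’s method.

```python
# Generic, hypothetical sketch of auxiliary-model pre-screening (not the
# specific technique in Li's paper): fit a small model on trusted data,
# score each candidate sample by the loss of its claimed label, and drop
# candidates whose loss is an extreme outlier.
import numpy as np
from sklearn.linear_model import LogisticRegression

def filter_candidates(X_trusted, y_trusted, X_cand, y_cand, z_cutoff=3.0):
    """Return a boolean mask marking candidate samples that look consistent
    with the trusted data under an auxiliary model."""
    aux = LogisticRegression(max_iter=1000).fit(X_trusted, y_trusted)

    # Negative log-likelihood of each candidate's claimed label.
    probs = aux.predict_proba(X_cand)
    cols = np.searchsorted(aux.classes_, y_cand)        # map labels to columns
    losses = -np.log(probs[np.arange(len(y_cand)), cols] + 1e-12)

    # Keep samples whose loss is not an extreme outlier.
    return losses < losses.mean() + z_cutoff * losses.std()

# Usage with placeholder arrays (a small trusted seed set plus incoming data):
# keep = filter_candidates(X_seed, y_seed, X_new, y_new)
# X_screened, y_screened = X_new[keep], y_new[keep]
```

Repeatedly training an auxiliary model to vet incoming data is one reason why this style of defence tends to be resource-intensive.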
Similar algorithms could be employed to remove imbalanced data, analyse flaws in the model and more. However, this method is resource-intensive.
“Compared to regular artificial intelligence tasks, reverse intelligence tasks are considerably more challenging, demanding higher computational resources and potentially a new architecture or greater bandwidth,” Li said in the paper.
Today, large-scale poisoning attacks are a threat that cannot be overlooked. “Poisoning attacks are a very real threat that must be factored in whenever models are being trained,” Carlini said. – South China Morning Post