Adversarial AI: What It Is and How To Defend Against It
Machine Learning offers many benefits to companies, but can also enhance threat actors’ attacking prowess. Here’s the lowdown on ML-enabled attack techniques and ways to defend against them.

Organizations can benefit from machine learning, but threat actors can also use it to expand the attack surface. In addition to attacking ML defenses to confuse and trick these systems, attackers are increasingly embracing ML tools to bolster their own capabilities. Let’s learn more about the various attacks on ML algorithms and the defenses that help organizations stay vigilant.
Machine Learning (ML), a subfield of artificial intelligence (AI), is growing as a way to strengthen our ability to meet cyber threat challenges. However, threat actors are also finding it helpful, integrating it into reconnaissance, weaponization, and other elements of the cyber kill chain. Further, ML defenses are becoming just another attack surface, as threat actors take steps to fool AI, “confusing” ML solutions. In addition to understanding how to use ML for defense, organizations must also protect ML operations, create adversarial testing procedures, and lock down ML resources.
Why Do We Need ML?
As defenders struggle to keep up, threat actors continue to develop better ways to circumvent our efforts to break the kill chain. As Darktrace writes, “New ransomware strains are emerging to leverage fileless malware and data exfiltration tactics, while opportunistic attackers are using any change in circumstances to launch more effective campaigns.” As I describe later, threat actors can quickly identify opportunistic circumstances using ML.
See More: Preparing for AI-enabled cyberattacks
Using Machine Learning for cyber defense
Cyber defense vendors are already implementing ML in their solutions and looking for other opportunities. Andrey Shklyarov and Dmitry Vyrostkov, writing for Darktrace, list seven positive outcomes when using AI/ML, including:
- Fraud and anomaly detection. Fraud and other anomalous behaviors, such as credit card misuse, span a broad scope of unwanted activities and are difficult to detect. Manual detection is often ineffective, but ML can spot them quickly, enabling rapid response and mitigating customer and business impact (see the sketch after this list).
- Email spam filtering. Security solutions often detect potentially dangerous emails by recognizing suspect words or other message characteristics. ML enables malicious email filtering algorithms to learn current and evolving spam email patterns, comprehensively detecting more threat actor content and keeping pace with changes in ever-changing attack designs.
- Botnet detection. Botnet use is expected to continue to increase. In its Q4-2021 Botnet Threat Update, Spamhaus reported a 23% quarter-on-quarter increase in botnet command and control (C2) instances, making this difficult-to-detect threat even more of a possibility across all industries. In addition, the report states that botnet implementation approaches are evolving, making previous approaches to detection less effective. ML, evolving along with threat actor attack patterns, improves our ability to detect botnet activity quickly.
- Vulnerability management. Managing vulnerabilities is a continuous process as we keep pace with our ever-changing cloud and on-premises operating environments. Vulnerability scanners help locate known vulnerabilities at a point in time and often miss unknown vulnerabilities. ML solutions can continuously look for vulnerabilities across systems and potentially identify new weaknesses. Shklyarov and Vyrostkov report that ML can do this by “…analyzing baseline user behavior, endpoints, servers, and even discussions on the dark web to identify code vulnerabilities and predict attacks.”
- Antimalware. Antimalware vendors continuously fight to keep up with evolving malware formats and threats. For example, Jackie Castelli reports that ML in antimalware “can understand and identify malicious intent based solely on the attributes of a file – without prior knowledge of it.” ML helps to continuously learn about subtle differences in infection vectors without relying on signatures or known behaviors.
- Data leak prevention. Preventing data loss usually relies on organizations creating patterns and lexicons to identify probable classified information. As with other approaches to protecting resources, this requires constant human adjustments, responding to changes in regulations and collected data types. Instead of relying on human intervention, we can teach ML to identify data across all media and documents, enabling it to adjust quickly and react to our changing information environments.
- SIEM and SOAR. Sorting through false positives is a never-ending endeavor that pulls SOC attention away from what truly matters and can mask true positives. ML can help by learning the large and small differences between true and false positives. Letting ML sort through this large volume of information helps identify and classify actual events more quickly.
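To make the anomaly-detection use case concrete, here is a minimal sketch using scikit-learn’s IsolationForest. The synthetic transaction features, thresholds, and contamination rate are illustrative assumptions, not taken from any specific product.

```python
# Minimal anomaly-detection sketch: flag activity that falls far outside the
# distribution of normal behavior. Features and thresholds are illustrative.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
normal_activity = rng.normal(loc=0.0, scale=1.0, size=(1000, 3))  # e.g., amount, hour, velocity
suspicious = np.array([[8.0, 9.0, 7.5], [-7.0, 6.0, 9.0]])        # clearly out of distribution

detector = IsolationForest(contamination=0.01, random_state=42).fit(normal_activity)
print(detector.predict(suspicious))  # -1 marks an anomaly, 1 marks normal
```

In a real deployment, these features would come from transaction or event logs, and the alert threshold would be tuned against historical labels rather than synthetic data.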
ML provides many benefits beyond those listed here. However, AI/ML solutions also give threat actors additional attack surfaces and tools to improve their capabilities.
See More: What Is Machine Learning? Definition, Types, Applications, and Trends for 2022
How Are Threat Actors Leveraging AI/ML for Malicious Purposes?
Threat actors leverage ML by attacking learning activities and improving attacks over traditional vectors. In a 2021 report, Malicious Uses and Abuses of Artificial Intelligence, a collaborative effort between Trend Micro Research, the UNICRI, and the EC3, researchers describe in detail ways threat actors can use AI/ML to improve their methods. Below, I summarize some of these findings.
Improved social engineering
Effective spear-phishing campaigns require understanding the targets. ML automates this process by scraping targets’ profiles across social media and automatically generating lure content.
Another possible use of ML is synthesizing voices and faces, or deepfakes (see below), to create new banking accounts, authorize the movement of funds, and carry out other finance-related activities. The report’s authors note that several countries recommend the use of video-based user authentication.
Deepfakes
Deepfakes, which let threat actors emulate anyone or anything to achieve a wide range of attack objectives, can be used for much more than banking fraud, including:
- Destroying the image and credibility of an individual
- Using social media to harass or humiliate individuals or organizations
- Perpetrating extortion and blackmail
- Facilitating document fraud
- Distributing false information and fake news
- Manipulating public opinion
- Inciting acts of violence against individuals or groups
- Polarizing political, social, ethnic, religious, or racial groups
In general, deepfakes can create an environment in which nothing is believed, causing a breakdown in trust associated with social organizations, government entities, religious groups, and almost everything else.
Malware Hiding
Threat actors can use ML to hide malware behavior beneath maliciously generated traffic that looks normal. In addition, ML elements embedded in an attack package can refine compromise methods as they gain a better understanding of the target environment, achieving attack objectives more quickly.
Passwords and CAPTCHAs
Using adversarial networks, which I cover later, threat actors can “…analyze a large dataset of passwords and generate variations that fit the statistical distribution, such as for password leaks.” In other words, ML provides threat actors with a more streamlined approach to password cracking, making password guessing more “scientific.” One tool that has largely achieved these capabilities is PassGAN.
CAPTCHAs are also in danger. According to Julien Maury, ML allows faster and more effective solving of CAPTCHA challenges as AI begins to behave more like human actors.
Before moving to attacks against AI/ML, it is essential to understand the concept of adversarial machine learning.
What is Adversarial Machine Learning (AML)?
ML implementations can be complex, introducing vulnerabilities associated with configuration and design. Coupled with the scarcity of ML professionals, complexity can leave additional gaping holes in the attack surface. These holes enable threat actors to cause incorrect ML conclusions and hide other malicious activities needed to achieve attack objectives.
According to DeepAI, AML is “a collection of techniques to train neural networks on how to spot intentionally misleading data or behaviors.” Defenders and attackers can use it to find weaknesses, learn what can be fixed, monitor what cannot be fixed, and understand what can be hacked, like opposing teams vying for advantage.
Organizations and vendors can use AML to determine what is learnable and how. Identifying what attack vectors ML can learn helps us understand what threat actors can find, allowing for proactive prevention steps. AML also helps determine the ease with which malicious actors can invade and confuse our ML defenses.
Using adversarial examples, defenders and threat actors can identify weaknesses in learning processes. According to Ian Goodfellow et al., writing for OpenAI, adversarial examples are crafted inputs intentionally designed to cause a learning model to make a mistake.
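As a concrete illustration of the idea Goodfellow et al. describe, below is a minimal fast gradient sign method (FGSM) sketch in PyTorch. The model, inputs, and epsilon budget are placeholder assumptions, not a reference implementation.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=0.03):
    """Craft an adversarial example: nudge the input in the direction that
    most increases the model's loss, within a small epsilon budget.
    Assumes inputs are normalized to the [0, 1] range."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()  # small to a person, misleading to the model
    return x_adv.clamp(0.0, 1.0).detach()
```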
Vendors and defending organizations can defend against malicious AML by generating many adversarial examples and then training the ML model to manage them properly. This is not the only approach proposed for hardening ML. Regardless of approach, organizations relying on ML to help protect their information resources must ensure the robustness of any solution, as we would use penetration testing and risk analyses to test any safeguard.
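A hedged sketch of that hardening step, reusing the hypothetical fgsm_example helper above: train on a mix of clean and adversarially perturbed inputs so the model learns to handle both. The model, optimizer, and train_loader are assumed to already exist.

```python
# Adversarial-training loop sketch (assumes a PyTorch model, optimizer, and a
# DataLoader named train_loader; fgsm_example is the helper defined above).
for x, y in train_loader:
    x_adv = fgsm_example(model, x, y, epsilon=0.03)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
```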
Attacks against AI/ML and defense
Primarily using AML, threat actors are gaining insights into how to confuse or disable ML. Henry Jia, writing for Excella, describes six common attacks against ML operations.
- Adversarial ML attack. Using adversarial sampling described above, threat actors find subtle inputs to ML that enable other, undetected attack activities.
- Data poisoning. Instead of directly attacking the ML model, threat actors add data to ML inputs that change the learning results. Poisoning requires that threat actors have access to the raw data used by the ML solution (a simple defender-side illustration follows this list).
- Online adversarial attack. ML solutions that collect online information as part of their modeling processes are fooled with false data inserted by threat actors.
- DDoS. As with any safeguard accessible by threat actors, ML is subject to denial of service attacks. Continuously providing ML with complicated problems, threat actors render the learning processes useless.
- Transfer learning. In the research paper With Great Training Comes Great Vulnerability: Practical Attacks against Transfer Learning, Bolun Wang et al., describe the ability of threat actors to use ML training models to understand weaknesses in ML processes, increasing the probability of fooling ML into making incorrect conclusions, and enabling other attack vectors.
- Data phishing privacy attack. Reverse engineering the ML dataset, threat actors breach ML dataset confidentiality. This is possible, for example, when the model used a small subset of confidential data for training. Katherine Jarmul digs deep into how researchers have done this.
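To see why poisoned training data matters, here is a small defender-side experiment using scikit-learn and a synthetic dataset: flip the labels on a fraction of the training set and watch test accuracy degrade. Every dataset, model, and threshold here is illustrative.

```python
# Hypothetical robustness check: how much does label-flipping "poisoning"
# of the training data hurt a simple classifier's accuracy?
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def accuracy_with_poisoning(flip_fraction):
    rng = np.random.default_rng(0)
    y_poisoned = y_tr.copy()
    flipped = rng.choice(len(y_poisoned), int(flip_fraction * len(y_poisoned)), replace=False)
    y_poisoned[flipped] = 1 - y_poisoned[flipped]  # flip labels on a subset
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_poisoned)
    return model.score(X_te, y_te)

for frac in (0.0, 0.1, 0.3):
    print(f"{frac:.0%} poisoned labels -> test accuracy {accuracy_with_poisoning(frac):.3f}")
```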
Defending Against Adversarial AI
All these attacks against ML require a comprehensive ML management approach, including hardening ML operations and monitoring for the inevitable breaks in our defenses. ML defense begins with a risk assessment for each ML implementation; assignment of ownership of ML content and operational governance; adjustment of current, and creation of new, security policies; and adoption of emerging practices such as MLOps.
MLOps is emerging as an approach to the design, development, and operation of ML solutions, including testing and data validation (a brief illustration follows the list). It is a set of practices residing at the intersection of ML, DevOps, and data engineering that seeks to standardize and strengthen:
- Reproducibility of models
- Accuracy of predictions
- Security policy compliance
- Regulatory compliance
- Monitoring and management

Source: MLOps.org
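As one small example of the testing and data-validation practices MLOps formalizes, a pipeline might run a check like the hypothetical one below before any retraining job; the expected schema is an assumption made purely for illustration.

```python
# Illustrative pre-training data-validation gate of the kind an MLOps
# pipeline automates; the expected schema is a hypothetical example.
import pandas as pd

EXPECTED_COLUMNS = {"amount": "float64", "hour": "int64", "label": "int64"}

def validate_training_data(df: pd.DataFrame) -> list:
    problems = []
    for col, dtype in EXPECTED_COLUMNS.items():
        if col not in df.columns:
            problems.append(f"missing column: {col}")
        elif str(df[col].dtype) != dtype:
            problems.append(f"{col} has dtype {df[col].dtype}, expected {dtype}")
    if df.isna().any().any():
        problems.append("dataset contains null values")
    return problems  # an empty list means the data passes the gate
```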
Final thoughts
ML is becoming an integral part of prevention, detection, and response processes, making it a tool for defenders and threat actors, and expanding attack surfaces. Policies, standards, guidelines, and procedures must expand to encompass these new elements. ML standards, like MLOps and the emerging NIST Artificial Intelligence: Adversarial Machine Learning project, NISTIR 8269, are vital for improving outcomes and managing risk.
Do you think it’s a never-ending arms race between cyberattackers and defenders? Let us know on LinkedIn, Facebook, and Twitter. We would love to hear from you!