Blog.

LLM Security: A Comprehensive Survey of Vulnerabilities, Attacks, and Defenses

Cover Image for LLM Security: A Comprehensive Survey of Vulnerabilities, Attacks, and Defenses
Yingjing Lu
Yingjing Lu

LLM Security: A Comprehensive Survey of Vulnerabilities, Attacks, and Defenses

As Large Language Models (LLMs) become increasingly integrated into various applications, understanding their security implications is paramount. The recent survey paper, LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures by Francisco Aguilera-Martínez and Fernando Berzal, offers a structured analysis of the security challenges associated with LLMs, categorizing potential threats and exploring corresponding defense strategies.

1. Categorization of LLM Vulnerabilities

The paper delineates vulnerabilities into two primary categories:

  • Training Phase Vulnerabilities: These arise during the model's training process and include issues like data poisoning, where malicious data is introduced to influence the model's behavior.

  • Inference Phase Vulnerabilities: These occur post-deployment and encompass threats like prompt injection attacks, where adversaries craft inputs to manipulate model outputs.

2. Spectrum of Attacks on LLMs

The authors identify and elaborate on various attack vectors:

  • Data Poisoning: Introducing malicious data during training to skew model outputs.

  • Prompt Injection: Crafting inputs that manipulate the model into producing unintended responses.

  • Model Inversion: Attempting to reconstruct training data from model outputs.

  • Membership Inference: Determining whether specific data points were part of the training set.

  • Adversarial Examples: Inputs designed to deceive the model into making errors.

3. Defense Mechanisms and Countermeasures

To mitigate these threats, the paper discusses several defense strategies:

  • Prevention-Based Defenses: Implementing measures during training and deployment to prevent vulnerabilities, such as data sanitization and robust training techniques.

  • Detection-Based Defenses: Monitoring and identifying potential attacks post-deployment, including anomaly detection systems and output monitoring.

The authors emphasize the importance of a layered defense approach, combining multiple strategies to enhance overall security.

4. Evaluation of Defense Effectiveness

The survey evaluates the efficacy of existing defense mechanisms against identified threats. It highlights that while some defenses are effective against specific attacks, there is no one-size-fits-all solution. Continuous assessment and adaptation are crucial as new threats emerge.

5. Future Research Directions

The paper concludes by outlining areas requiring further exploration:

  • Standardized Evaluation Metrics: Developing benchmarks to assess the robustness of LLMs against various attacks.

  • Comprehensive Defense Frameworks: Creating integrated systems that combine multiple defense strategies.

  • Ethical and Legal Considerations: Addressing the broader implications of LLM security in societal contexts.


References:

  • Aguilera-Martínez, F., & Berzal, F. (2025). LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures. arXiv preprint arXiv:2505.01177. https://arxiv.org/abs/2505.01177