Enterprise LLM deployments introduce a new attack surface that most security teams are not equipped to handle. Traditional application security focuses on injection attacks against databases and APIs. LLM security requires thinking about attacks on the model itself.
Prompt injection is the XSS of LLMs
Prompt injection occurs when user-controlled input alters the behavior of an LLM prompt. In RAG systems, this is especially dangerous because retrieved documents can contain adversarial instructions that override system prompts.
Defense: implement input sanitization for common injection patterns, use separate system and user contexts where possible, and treat the LLM as an untrusted component that can be manipulated.
Data exfiltration through LLM outputs
If your LLM has access to sensitive data via retrieval or function calling, adversarial prompts can extract that data through seemingly innocent outputs. We've seen production RAG systems leak confidential documents when queried with carefully crafted prompts.
Defense: implement output filtering, audit all tool calls, and apply data classification to restrict retrieval to authorized content only.
Model-level attacks
Fine-tuned models can be manipulated through their training data if the fine-tuning pipeline is not secured. Jailbreaks and adversarial prompts can bypass content filters in ways that are difficult to predict.
Defense: use models from vendors with robust red-teaming processes, implement prompt shields, and conduct regular adversarial testing of your deployment.