The Lodash Lesson: Building Hard Guardrails for Security Agents
Large Language Models have a fatal flaw when it comes to security: they are people-pleasers. They are trained to be helpful, which means they naturally lean towards giving you a green light.
When building our stateful dependency auditor, we ran into this exact problem. If you rely purely on an LLM to make security decisions, it will eventually prioritize a friendly legal license over critical CVEs.
The problem isn’t that the AI is stupid; it’s that it lacks deterministic boundaries.
In this post, I’ll share the “Lodash Lesson”—a critical failure in our early AI auditor—and show how to fix it by combining Vector Similarity for semantic rules with Hard Guardrails for deterministic math.
The Problem: The “Lodash” Incident
During our early testing phase, we fed the agent a specific version of lodash. This version had a perfectly safe MIT license, but it also had 5 critical vulnerabilities (CVEs) flagged by OSV.dev.
When we asked the LLM to audit the package, its Chain-of-Thought looked something like this:
Thinking trace: The user wants to know if lodash is safe. The license is MIT, which our policy fully permits. It’s a very popular library. There are some vulnerabilities, but the permissive license aligns with corporate rules. Verdict: SAFE.
The LLM returned a SAFE verdict even though it had seen the vulnerabilities: the strong positive signal of the MIT license drowned out the negative signal of the CVEs. It never weighed five critical CVEs against a permissive license as a risk calculation.
The Original Vectors: What Are We Comparing?
Before we look at the guardrails, we need to understand how the AI is interpreting licenses in the first place. We don’t use regex to check for “GPL”; we use a vector database (Qdrant).
Here is an example of the “Original Vectors”—the actual, plain-English corporate policies we embed into Qdrant before the audit even starts:
# app/config/policies.py
CORPORATE_POLICIES = [
    {
        "id": 1,
        "text": "Permissive licenses like MIT, BSD, and Apache 2.0 are SAFE for commercial use."
    },
    {
        "id": 2,
        "text": "Any license that requires the source code of derivative works to be made public (e.g., Copyleft, GPL, AGPL) is strictly FORBIDDEN."
    },
    {
        "id": 3,
        "text": "Licenses that restrict commercial use, monetization, or paid redistribution are FORBIDDEN."
    }
]
# These are converted into high-dimensional vectors and stored in Qdrant.
When our scanner finds a custom license that says, “You must publish your source code if you modify this software,” the vector search calculates the cosine distance and retrieves Policy #2 as the closest mathematical match, even though the word “GPL” was never used.
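To make the "closest mathematical match" idea concrete, here is a minimal sketch of cosine-similarity ranking in plain Python. It does not call Qdrant or an embedding model: the three-dimensional vectors are made up for illustration (real embeddings have hundreds of dimensions), and `TOY_POLICY_VECTORS` and `closest_policy` are hypothetical names, not part of our codebase.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# ASSUMPTION: hand-picked toy vectors, not real embeddings. The axes loosely
# mean (permissive, must-share-source, restricts-commercial-use).
TOY_POLICY_VECTORS = {
    1: [0.9, 0.1, 0.1],  # Policy #1: permissive licenses are SAFE
    2: [0.1, 0.9, 0.2],  # Policy #2: copyleft is FORBIDDEN
    3: [0.1, 0.2, 0.9],  # Policy #3: commercial restrictions are FORBIDDEN
}


def closest_policy(query_vector: Sequence[float]) -> int:
    """Return the id of the policy whose vector is most similar to the query."""
    return max(
        TOY_POLICY_VECTORS,
        key=lambda pid: cosine_similarity(query_vector, TOY_POLICY_VECTORS[pid]),
    )


# Made-up vector standing in for the embedding of:
# "You must publish your source code if you modify this software"
custom_license_vector = [0.2, 0.8, 0.1]
best_id = closest_policy(custom_license_vector)
print(best_id)
```

The query never mentions "GPL", yet it points in almost the same direction as Policy #2, so that policy wins the ranking. A vector database does the same comparison at scale, with an index instead of a linear scan.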
The Solution: Hybrid Intelligence (Guardrails)
Vector similarity handles the semantic ambiguity of legal texts brilliantly. But what about the 5 critical CVEs?
You do not ask an LLM to count. You do not ask an LLM to evaluate CVSS scores. Math doesn’t hallucinate; LLMs do.
To fix the Lodash issue, we introduced a “Guardrail” pattern into our LangGraph architecture. We separated the audit into two distinct domains:
- Semantic Data (Custom Licenses, intent): Policies retrieved via Vector Search, then judged by DeepSeek-R1.
- Deterministic Data (CVEs, CVSS scores): Handled purely by Python overrides.
Implementation: The Guardrail Wrapper
Instead of letting the LLM dictate the final state, we wrap the AI invocation in a strict Python execution layer. Here is how we enforce boundaries:
# app/agents/guardrails.py
def enforce_security_thresholds(issue: dict, llm_verdict: str) -> str:
    """
    Hard rules that override the LLM's decision.
    Python has the final say.
    """
    vulnerabilities = issue.get('vulnerabilities', [])
    vuln_count = len(vulnerabilities)
    # default=0 keeps max() from raising on a package with no vulnerabilities
    max_cvss = max((v.get('score', 0) for v in vulnerabilities), default=0)

    # 1. Threshold checks (LLM cannot bypass these)
    if vuln_count >= 3:
        return "FORBIDDEN_BY_GUARDRAIL_COUNT"
    if max_cvss > 7.0:
        return "FORBIDDEN_BY_GUARDRAIL_SEVERITY"

    # 2. If hard checks pass, we trust the LLM's semantic verdict
    return llm_verdict
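A quick way to see each branch of the guardrail is to feed it a few hand-built records. The sketch below repeats the function so it runs on its own; the `EXAMPLE-*` IDs and the scores are placeholders, not real OSV.dev data.

```python
def enforce_security_thresholds(issue: dict, llm_verdict: str) -> str:
    """Copy of the guardrail above so this snippet is self-contained."""
    vulnerabilities = issue.get("vulnerabilities", [])
    max_cvss = max((v.get("score", 0) for v in vulnerabilities), default=0)
    if len(vulnerabilities) >= 3:
        return "FORBIDDEN_BY_GUARDRAIL_COUNT"
    if max_cvss > 7.0:
        return "FORBIDDEN_BY_GUARDRAIL_SEVERITY"
    return llm_verdict


# ASSUMPTION: hypothetical records shaped like the `issue` dicts in this post.
lodash_like = {
    "license": "MIT",
    "vulnerabilities": [{"id": f"EXAMPLE-{i}", "score": 9.1} for i in range(5)],
}
one_severe = {"license": "MIT", "vulnerabilities": [{"id": "EXAMPLE-A", "score": 9.8}]}
clean = {"license": "MIT", "vulnerabilities": []}

# The LLM said SAFE in every case; only the clean package keeps that verdict.
results = {
    "lodash_like": enforce_security_thresholds(lodash_like, "SAFE"),
    "one_severe": enforce_security_thresholds(one_severe, "SAFE"),
    "clean": enforce_security_thresholds(clean, "SAFE"),
    # The guardrail only ever tightens a verdict; it never upgrades FORBIDDEN.
    "clean_but_forbidden": enforce_security_thresholds(clean, "FORBIDDEN"),
}
for name, verdict in results.items():
    print(f"{name}: {verdict}")
```

Note that the count check runs first, so a package with five critical CVEs is reported as blocked by count even though it would also fail the severity check.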
Now, inside our lawyer_node, we combine the Vector Search with this new Guardrail, so every verdict the LLM produces passes through the deterministic checks before it reaches the state:
# app/agents/nodes/lawyer.py
from app.services.qdrant_service import search_policy
from app.agents.guardrails import enforce_security_thresholds

# AgentState (the graph's state type) and llm (the model client) are assumed
# to be defined elsewhere in the app; they are not shown in this post.


async def lawyer_node(state: AgentState):
    for issue in state["analyzed_dependencies"]:
        # Prefer the full license text; fall back to the declared license name
        license_evidence = issue.get('license_text') or issue['license']

        # 1. Vector Search: Fetch the relevant company policy
        # This compares the raw evidence against our CORPORATE_POLICIES embeddings
        policy_hits = await search_policy(query_text=license_evidence)

        # 2. AI Reasoning: DeepSeek-R1 performs the Semantic Audit
        prompt = f"Compare EVIDENCE: {license_evidence} against POLICY: {policy_hits}"
        ai_response = await llm.ainvoke(prompt)
        semantic_verdict = "FORBIDDEN" if "FORBIDDEN" in ai_response.content else "SAFE"

        # 3. Python Guardrail: Enforce deterministic math overrides
        # Even if the AI says SAFE due to a friendly license, Python will block it if CVSS > 7.0
        final_verdict = enforce_security_thresholds(issue, semantic_verdict)
        issue['verdict'] = final_verdict
        issue['reasoning'] = ai_response.content  # Keep the trace for observability

    return {"analyzed_dependencies": state["analyzed_dependencies"]}
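The node can be exercised without Qdrant or a live model by swapping in stubs. The sketch below is an offline approximation, not our production wiring: `FakeLLM` and `fake_search_policy` are hypothetical stand-ins, and passing them in as parameters is a change made here only so the example runs on its own.

```python
import asyncio
from types import SimpleNamespace


def enforce_security_thresholds(issue: dict, llm_verdict: str) -> str:
    """Compact copy of the guardrail so this snippet is self-contained."""
    vulnerabilities = issue.get("vulnerabilities", [])
    max_cvss = max((v.get("score", 0) for v in vulnerabilities), default=0)
    if len(vulnerabilities) >= 3:
        return "FORBIDDEN_BY_GUARDRAIL_COUNT"
    if max_cvss > 7.0:
        return "FORBIDDEN_BY_GUARDRAIL_SEVERITY"
    return llm_verdict


async def fake_search_policy(query_text: str) -> list:
    # ASSUMPTION: stand-in for search_policy; always returns the permissive policy.
    return ["Permissive licenses like MIT, BSD, and Apache 2.0 are SAFE for commercial use."]


class FakeLLM:
    """ASSUMPTION: stand-in model that always gives the people-pleasing answer."""

    async def ainvoke(self, prompt: str):
        return SimpleNamespace(content="Verdict: SAFE. The MIT license is permitted.")


async def lawyer_node(state: dict, llm, search_policy) -> dict:
    for issue in state["analyzed_dependencies"]:
        evidence = issue.get("license_text") or issue["license"]
        policy_hits = await search_policy(query_text=evidence)
        ai_response = await llm.ainvoke(
            f"Compare EVIDENCE: {evidence} against POLICY: {policy_hits}"
        )
        semantic_verdict = "FORBIDDEN" if "FORBIDDEN" in ai_response.content else "SAFE"
        issue["verdict"] = enforce_security_thresholds(issue, semantic_verdict)
        issue["reasoning"] = ai_response.content
    return {"analyzed_dependencies": state["analyzed_dependencies"]}


# Hypothetical lodash-like record: MIT license, five placeholder critical CVEs.
state = {
    "analyzed_dependencies": [
        {
            "name": "lodash",
            "license": "MIT",
            "vulnerabilities": [{"id": f"EXAMPLE-{i}", "score": 9.1} for i in range(5)],
        }
    ]
}
result = asyncio.run(lawyer_node(state, FakeLLM(), fake_search_policy))
audited = result["analyzed_dependencies"][0]
print(audited["verdict"])
print(audited["reasoning"])
```

The stubbed model still says SAFE, and its reasoning is preserved for observability, but the stored verdict is the guardrail's.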
What’s Next: Automating the Audit
Right now, our guardrails catch deterministic issues, and we test them manually by feeding the agent known edge cases (like our vulnerable lodash). But as the policies grow, manual testing won’t scale. A prompt tweak that fixes the Lodash issue might accidentally break the behavior for a standard Apache license.
The next logical step for this architecture is building Evals (Evaluations). The goal is to create a “Golden Dataset” in JSON—a list of specific package versions with hardcoded expected verdicts. Before any change to the LangGraph nodes or Qdrant policies goes to production, an automated pipeline will run the agent against this dataset to ensure precision and recall remain at 100%.
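As a rough sketch of what such an eval could look like, the snippet below scores a stubbed auditor against a tiny golden dataset. Everything in it is an assumption for illustration: the `example-pkg-*` entries are placeholders, `stub_audit` stands in for the real agent call, and treating any FORBIDDEN-style verdict as the positive class is one possible convention.

```python
import json

# ASSUMPTION: placeholder golden dataset; real entries would pin actual
# package versions with reviewed, hardcoded verdicts.
GOLDEN_DATASET_JSON = """
[
  {"package": "example-pkg-a", "version": "1.0.0", "expected": "SAFE"},
  {"package": "example-pkg-b", "version": "2.3.1", "expected": "FORBIDDEN"},
  {"package": "example-pkg-c", "version": "0.9.0", "expected": "FORBIDDEN"}
]
"""


def is_blocked(verdict: str) -> bool:
    # Guardrail verdicts such as FORBIDDEN_BY_GUARDRAIL_COUNT count as blocked.
    return verdict.startswith("FORBIDDEN")


def evaluate(cases: list, audit_fn) -> dict:
    """Compute precision and recall, with 'blocked' as the positive class."""
    tp = fp = fn = 0
    for case in cases:
        predicted = is_blocked(audit_fn(case["package"], case["version"]))
        expected = is_blocked(case["expected"])
        if predicted and expected:
            tp += 1
        elif predicted and not expected:
            fp += 1
        elif expected and not predicted:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return {"precision": precision, "recall": recall}


def stub_audit(package: str, version: str) -> str:
    # Placeholder for the real agent; deliberately misses example-pkg-c.
    canned = {
        "example-pkg-a": "SAFE",
        "example-pkg-b": "FORBIDDEN_BY_GUARDRAIL_COUNT",
        "example-pkg-c": "SAFE",
    }
    return canned[package]


metrics = evaluate(json.loads(GOLDEN_DATASET_JSON), stub_audit)
print(metrics)

# A CI gate could then refuse to deploy unless both metrics are 1.0.
deploy_allowed = metrics["precision"] == 1.0 and metrics["recall"] == 1.0
print(deploy_allowed)
```

Here the stub lets one forbidden package through, so recall drops to 0.5 and the gate stays closed, which is exactly the kind of regression a prompt tweak could introduce unnoticed.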
The Verdict
LLMs are incredible reasoning engines, but they are terrible calculators. By wrapping your AI in hard Python guardrails, you get the best of both worlds: the semantic flexibility of an LLM to read complex legal texts via vector search, and the ruthless determinism of code to enforce security thresholds.
The Lodash incident taught us that you cannot trust AI with the final state of your system. You must build boundaries.
Don’t trust your AI. Put it on rails.