Anthropic's commercial terms and why I sleep better.
Zero-retention. No model training. EU residency questions answered honestly. The one external LLM dependency BleedWatch holds, and the contract structure that makes it defensible at procurement.
Founder byline - 2026-03-21
The question I'm asked most often by procurement
It's always the same shape: "Your semantic detection passes go to Anthropic. What happens to our scanned data once it leaves your infrastructure?"
This article is the answer, at the level of detail a CISO procurement reviewer actually needs. It's also the conversation I wish vendors selling me LLM-augmented tools would have with me up front, and don't.
The setup
BleedWatch runs a five-stage detection pipeline. Four of those stages — regex matching, false-positive filtering, entropy scoring, ONNX classification — run on our EU infrastructure. The fifth stage, semantic validation, calls Anthropic's API. That's where customer-derived data crosses our boundary.
The data that crosses is not the same shape as the scanned input. We sanitize before the call. Concretely:
- Secrets are never sent raw. We send a prefix (typically the first 4 characters), the length, an entropy score, and a minimal context window. The full token never leaves our infrastructure.
- Hashed identifiers replace tenant-identifiable strings. Per Pattern A in our internal cross-tenant sanitization rule, the hash is salted with a per-tenant key that lives on our infrastructure only.
- URLs and query parameters are stripped. Per Pattern B (post-processing translation), we hold the original mapping locally and rewrite the LLM response back to the original strings before persistence. The LLM provider only ever sees hashed forms.
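To make the three rules above concrete, here is a minimal Python sketch. It is illustrative, not BleedWatch's production code: the function names, the `h_` placeholder format, the 16-character digest truncation, and the choice of HMAC-SHA256 as the "salted hash" are all assumptions made for the example.

```python
import hashlib
import hmac
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def describe_secret(secret: str, prefix_len: int = 4) -> dict:
    """Descriptor sent in place of the raw secret: prefix, length, entropy."""
    return {
        "prefix": secret[:prefix_len],
        "length": len(secret),
        "entropy": round(shannon_entropy(secret), 2),
    }


def tenant_hash(value: str, tenant_key: bytes) -> str:
    """Pattern A (assumed form): keyed hash using a per-tenant key kept locally."""
    digest = hmac.new(tenant_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"h_{digest[:16]}"  # hypothetical placeholder format


def sanitize_url(url: str, tenant_key: bytes, mapping: dict) -> str:
    """Pattern B, outbound half: swap the URL for its hash and remember the original."""
    token = tenant_hash(url, tenant_key)
    mapping[token] = url  # the mapping never leaves local infrastructure
    return token


def restore(llm_response: str, mapping: dict) -> str:
    """Pattern B, inbound half: rewrite hashed forms back before persistence."""
    for token, original in mapping.items():
        llm_response = llm_response.replace(token, original)
    return llm_response
```

In this sketch the only things that would cross the boundary are the dictionary from `describe_secret` and the tokens from `tenant_hash` / `sanitize_url`; `restore` runs locally on the response. A keyed hash is used rather than a plain salted digest so that a party without the tenant key cannot confirm a guessed value by recomputing the hash.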
That is the architectural answer. The contractual answer matters just as much, and that's where Anthropic's CVP comes in.
What CVP actually covers
Anthropic's Commercial Vendor Program (CVP), the terms BleedWatch holds with Anthropic, has four provisions that matter for procurement review:
Zero retention. API inputs and outputs are not retained on Anthropic's side beyond the request-response lifecycle. There is no "log replay" that would let an Anthropic engineer go back and reconstruct what BleedWatch sent on May 4. This is contractually enforced and operationally configured.
No training on commercial data. Inputs and outputs from CVP traffic are excluded from model training by default. There is no opt-in BleedWatch has signed for any model-improvement program that would include customer data.
Documented sub-processor chain. Anthropic's own sub-processors (the cloud providers running the actual inference workloads) are listed and contractually constrained. BleedWatch passes this chain through to customer DPAs.
Region selection. For EU customers requesting strict residency, we will route inference through Anthropic's EU-available endpoints once that feature is GA. As of writing, multi-region routing is configurable per tenant, with EU-preferred as the default for any customer with EU residency requirements.
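As a rough illustration of per-tenant region selection, the sketch below resolves a region label from tenant settings. The `TenantConfig` fields and the `"eu"` / `"global"` labels are hypothetical names for the example, not actual BleedWatch configuration keys or Anthropic endpoint identifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TenantConfig:
    tenant_id: str
    eu_residency_required: bool = False
    region_override: Optional[str] = None  # hypothetical explicit per-tenant setting


def inference_region(tenant: TenantConfig) -> str:
    """Resolve the preferred inference region for a tenant."""
    # An explicit per-tenant setting wins when present.
    if tenant.region_override is not None:
        return tenant.region_override
    # Otherwise EU is the default for tenants with EU residency requirements.
    return "eu" if tenant.eu_residency_required else "global"
```

For example, `inference_region(TenantConfig("t1", eu_residency_required=True))` resolves to `"eu"` without the tenant having to configure anything.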
What CVP does not cover
I want to be honest about the limits.
Incidental PII inside scanned code. When the scanner pipeline finds a secret in a Docker layer or a published package, the surrounding context window might contain incidental PII — an employee email in a comment, a developer name in a git blame line. We redact obvious PII patterns before sending, but we cannot guarantee 100% scrubbing of every edge case. For customers with hard PII-residency commitments, this is the gap that needs explicit treatment in the DPA.
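A minimal sketch of what "redact obvious PII patterns" can look like, and why it cannot be complete. The two patterns and the replacement markers are assumptions for illustration; a production redactor would carry many more rules.

```python
import re

# Email addresses anywhere in the context window.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Git-style attribution lines such as "Author: Jane Doe <jane@example.com>".
AUTHOR_RE = re.compile(r"^(Author|Signed-off-by):[ \t]*.+$", re.MULTILINE)


def redact_obvious_pii(context: str) -> str:
    """Best-effort redaction of pattern-shaped PII; free-form names are not caught."""
    context = AUTHOR_RE.sub(lambda m: f"{m.group(1)}: [REDACTED_NAME]", context)
    context = EMAIL_RE.sub("[REDACTED_EMAIL]", context)
    return context
```

A comment like `# written by Jane` passes through this function unchanged, because nothing about it is pattern-shaped. That is the edge-case gap described above.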
Future Anthropic policy changes. Anthropic publishes its terms. They may change. We commit to giving customers 30 days' advance notice if our CVP terms materially change, and to providing an off-ramp via the "AI-Off mode" toggle described below.
Indirect inference. If an LLM provider's broader training corpus contains text patterns similar to what we send, the model's response could correlate with those patterns. This is a structural property of LLMs, not a CVP-specific gap. It's why we run multi-LLM cross-validation (Clearwing pattern): the consensus answer is more defensible than any single model's response.
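One simple way to turn several model verdicts into a consensus is a thresholded majority vote. The sketch below is an assumption about what cross-validation could look like, not a description of the Clearwing pattern itself; the function name, verdict labels, and two-thirds threshold are all hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_verdict(verdicts: Sequence[str], min_agreement: float = 2 / 3) -> Optional[str]:
    """Return the majority verdict, or None when agreement is below the threshold."""
    if not verdicts:
        return None
    label, count = Counter(verdicts).most_common(1)[0]
    return label if count / len(verdicts) >= min_agreement else None
```

Returning `None` instead of the most common label when models disagree keeps a split vote from being reported as a confident answer.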
The AI-Off mode I keep mentioning
For customers in defense, banking, sovereign, or any vertical with absolute "no external LLM" requirements, BleedWatch ships an AI-Off toggle at the organization level. With AI-Off enabled, the semantic validation stage is skipped entirely. The trade-off:
- Higher false-positive rate. Without the semantic pass, the pipeline relies on regex + entropy + ONNX. The ONNX classifier we trained covers most of the gap, but ~12% of findings that would have been filtered by the semantic pass slip through as false positives.
- No LLM-generated remediation guidance. Findings still surface with the regex-match evidence. The "what to do about it" runbook becomes a generic template rather than LLM-generated, context-specific guidance.
- Zero external data flow. All processing stays on BleedWatch EU infrastructure.
For most customers this is overkill. For some (regulated sovereign workloads), it's table stakes. The toggle is org-scoped, can be flipped on a per-deployment basis, and is documented on the trust page at /trust#ai-off-mode.
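The behavior of the toggle can be sketched as a single branch in the pipeline. This is a simplified illustration under assumed names (`run_pipeline`, `ScanResult`, the stage labels); the real stage logic is omitted and the external call is passed in as a function so the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# The four stages that always run on local infrastructure.
LOCAL_STAGES = ("regex", "false_positive_filter", "entropy", "onnx")


@dataclass
class ScanResult:
    stages_run: List[str] = field(default_factory=list)
    semantic_verdict: Optional[str] = None
    remediation: str = "template"  # generic runbook unless the LLM stage runs


def run_pipeline(
    sanitized_input: str,
    ai_off: bool,
    semantic_validate: Callable[[str], str],
) -> ScanResult:
    """Run the local stages, then the external semantic stage unless AI-Off is set."""
    result = ScanResult()
    result.stages_run.extend(LOCAL_STAGES)  # placeholder for the real stage logic
    if ai_off:
        # No external call is made; the finding keeps the template runbook.
        return result
    result.semantic_verdict = semantic_validate(sanitized_input)
    result.stages_run.append("semantic_validation")
    result.remediation = "llm_generated"
    return result
```

With `ai_off=True`, `semantic_validate` is never invoked, which is the property a reviewer would want to verify: the external call is skipped structurally rather than made and then discarded.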
The trust line I'm trying to hold
A security vendor selling LLM-augmented detection in 2026 needs to answer three questions cleanly:
- What data crosses your boundary? (Specific. Bytes-shaped. Not "minimal data" platitudes.)
- What happens to it on the other side? (Retention. Training. Sub-processors. Contractual.)
- Can the customer turn it off? (Real toggle. Real behavior with the toggle on. Honest trade-offs.)
If any vendor in your evaluation can't answer all three with specifics, that vendor has not done the procurement work. They will eventually, because customers will push them, but you don't have to wait.
This is what I'm trying to build at BleedWatch. I sleep better because Anthropic's commercial terms structure the third-party dependency in a defensible way and because the AI-Off toggle gives customers a real escape hatch. The next chapter is multi-region routing being GA for inference, which closes one of the remaining honest gaps in the story.
If you're a procurement reviewer and you want the full DPA + sub-processor list, [email protected] is the inbox.