Can ArgosBrain see PII fields in my database?

It traces structural data flow from any source you mark as PII (function parameter, struct field, request body key, DB column annotation) to every sink the data reaches. The PII tagging itself is a one-line config or column comment — most teams already have it for GDPR Article 30 records.

For Compliance · GRC · CISO teams · HIPAA · SOC 2 · PCI-DSS · FedRAMP · SOX

Audit prep in minutes, not weeks.
Structural proof, file:line citations.

Compliance auditors ask one question over and over: "show me every place this data flows." Today you answer with five weeks of grep, spreadsheets, and a Confluence page nobody trusts. ArgosBrain answers in five minutes — deterministic data-flow proofs across HIPAA, SOC 2, PCI-DSS, FedRAMP, and SOX. Auditor-friendly evidence packs. $0 per query. Local-first. Complementary to Drata, Vanta, and SecureFrame.


01 The problem

5 weeks. 8 engineers. 200 spreadsheets.

Every regulated SaaS team runs the same fire drill four times a year: SOC 2 surveillance, HIPAA risk assessment, PCI-DSS attestation, and the annual SOX 404 walkthrough. The auditor sends a Request For Evidence with 50-200 controls. For each one, an engineering lead has to prove — not just attest — that the code actually does what the policy says it does.

Today that proof is grep + Excel + Confluence. Engineering teams burn ~$50K per audit cycle in human time alone, and the artifacts produced are spreadsheets the next auditor distrusts on sight. HIPAA civil penalties can reach roughly $2M per year for repeated violations of the same provision. PCI-DSS non-compliance can cost $25K-$100K per month. FedRAMP delays push contracts back two quarters. The math punishes the unprepared.

"The auditor wanted to see every place an SSN touches a log emitter across 1.4M lines of TypeScript. We had two weeks. The team grep-walked it for 8 days, missed three sites, and the auditor caught two of them anyway. Embarrassing."
— VP Engineering, healthcare SaaS (anonymized, post-HIPAA Type II)

02 How compliance teams work today

Five steps. Two hundred hours. One panic week.

1. Receive the auditor's Request For Evidence: 50-200 controls, each requiring "show me the code paths."
2. Per control, run a manual data-lineage hunt. Grep for PII / cardholder / PHI keywords, open 5-15 files, and trace callers by hand.
3. Fill in the spreadsheet column "Where it flows". Each control takes 30-60 minutes of file-walking; each audit takes ~200 hours in total.
4. Build the evidence pack in Word + Confluence: screenshots of code, copy-pasted file paths, narrative paragraphs. The auditor reads it and asks for follow-ups anyway because the evidence isn't reproducible.
5. Result: 5 weeks of senior engineering time burned and a 12% miss rate on data flow paths (the auditor catches the rest). The same work repeats next quarter, with the same spreadsheet rebuilt from scratch.

Math per audit cycle: 200h × $250/h = $50K of human triage time. Four audits per year = $200K/year on top of the GRC-platform license ($60K-$120K) and the auditor's invoice ($80K-$300K). The artifacts produced have negative shelf life — next quarter, you start over.
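The arithmetic above can be checked with a short Python sketch. The hours, hourly rate, audit count, and dollar ranges come from the paragraph; treating the GRC license and the auditor's invoice as annual figures is an assumption made here for illustration.

```python
# Figures from the paragraph above. Ranges are (low, high) in USD.
HOURS_PER_AUDIT = 200
HOURLY_RATE = 250
AUDITS_PER_YEAR = 4
GRC_LICENSE = (60_000, 120_000)      # assumed to be an annual figure
AUDITOR_INVOICE = (80_000, 300_000)  # assumed to be an annual figure


def triage_cost_per_audit() -> int:
    """Human triage cost for a single audit cycle."""
    return HOURS_PER_AUDIT * HOURLY_RATE


def annual_triage_cost() -> int:
    """Triage cost across all audit cycles in a year."""
    return triage_cost_per_audit() * AUDITS_PER_YEAR


def annual_total_range() -> tuple[int, int]:
    """Low and high yearly totals: triage plus license plus auditor invoice."""
    triage = annual_triage_cost()
    low = triage + GRC_LICENSE[0] + AUDITOR_INVOICE[0]
    high = triage + GRC_LICENSE[1] + AUDITOR_INVOICE[1]
    return (low, high)


print(triage_cost_per_audit())  # 50000
print(annual_triage_cost())     # 200000
print(annual_total_range())     # (340000, 620000)
```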

03 How Argos changes it

One MCP call. Five minutes. Auditor-ready JSON.

$ argosbrain ingest .
✓ ingested 38,771 symbols, 232,756 call-graph edges  (4.2s)

$ /argos-compliance-proofs --framework=pci-dss --kind=cardholder
✓ 3 cardholder sources identified
   - request body field `card.number`     (api/payment.ts:18)
   - JWT claim `payment_token`            (auth/jwt.rs:142)
   - DB column `payments.cc_pan_masked`   (db/schema.sql:67)
✓ 47 sink reachability checks across 12 sink kinds
✓ Evidence pack written: ./argos-evidence/pci-dss-2026-04-28.json

The structural finding the auditor cares about:

PCI-DSS 3.4 (Mask PAN when displayed) · Reachability scan
Sources: 3  |  Sinks scanned: 47  |  Path budget: max_depth=8

⚠ 2 paths reach UNMASKED log emit:
  payments/process.ts:48 → util/log.ts:12 (raw `cc.number` field)
   ├── 4-hop call chain via `chargeCard()` → `auditTrail.write()`
   └── triggered by POST /api/charge (public-facing)

  admin/dashboard.tsx:103 → console.log (browser DevTools)
   ├── developer leftover; reachable from authenticated admin only
   └── still violates 3.4 — auditor will flag

✓ 45 sinks structurally unreachable. Evidence attached per finding.

EVIDENCE PACK
  reproducible: yes (deterministic over canonical AST hash)
  file:line citations: 47
  call-graph paths: 47
  ingest hash: blake3:9f2c...8b1a
  framework tags: PCI-DSS 3.4, SOC 2 CC6.7, HIPAA §164.312(c)(2)
  cloud calls: 0 · LLM calls: 0 · audit-friendly: yes

Two findings to fix. The other 45 sinks are proven structurally unreachable, each with evidence attached. The evidence pack is byte-reproducible and hash-stamped, and its format mirrors AICPA SOC 2 sample templates and HHS OCR HIPAA audit protocol structures. Re-run it on every CI build to catch regressions before the auditor does.
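To make "reachable within a depth budget" concrete, here is a minimal sketch of a depth-bounded source-to-sink search over a call graph. It assumes a plain breadth-first search and is not a description of ArgosBrain's internals. The graph and the names `handleCharge`, `formatEntry`, `validateCard`, and `log.emit` are hypothetical; `chargeCard` and `auditTrail.write` are borrowed from the scan output above.

```python
from collections import deque


def find_path(graph, source, sinks, max_depth=8):
    """Return the shortest call path from source to any sink within max_depth hops, else None."""
    queue = deque([(source, [source])])
    seen = {source}
    while queue:
        node, path = queue.popleft()
        if node in sinks and node != source:
            return path
        if len(path) - 1 >= max_depth:
            continue  # depth budget exhausted on this branch
        # Sorting the callees keeps the traversal order, and so the result, deterministic.
        for callee in sorted(graph.get(node, ())):
            if callee not in seen:
                seen.add(callee)
                queue.append((callee, path + [callee]))
    return None


# Hypothetical call graph: caller -> list of callees.
CALL_GRAPH = {
    "handleCharge": ["chargeCard"],
    "chargeCard": ["validateCard", "auditTrail.write"],
    "auditTrail.write": ["formatEntry"],
    "formatEntry": ["log.emit"],
    "validateCard": [],
}

print(find_path(CALL_GRAPH, "handleCharge", {"log.emit"}))
# ['handleCharge', 'chargeCard', 'auditTrail.write', 'formatEntry', 'log.emit']
print(find_path(CALL_GRAPH, "handleCharge", {"log.emit"}, max_depth=3))
# None
```

The second call shows the trade-off behind a path budget: a sink that sits deeper than `max_depth` hops is reported as unreachable, so any completeness claim holds only within the configured depth.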

04 Side-by-side

The math for one audit cycle.

Metric	Today (grep + Excel + Confluence)	With ArgosBrain on top
Time per audit cycle	~200 hours / 5 weeks	~5 minutes per framework
Engineer cost	$50K / audit · $200K / year	$0 per query
Evidence type	Word + Excel + screenshots	Deterministic JSON, hash-stamped
Reproducibility	No (manual narrative)	Yes (re-run any time, byte-identical)
Miss rate on data flow paths	~12% (auditor catches the rest)	0% within depth budget · structural completeness
Re-run on next quarter's audit	Rebuild from scratch	One CLI call · cached behind content hash
Frameworks covered	Spreadsheet per framework	HIPAA · SOC 2 · PCI-DSS · FedRAMP · SOX — single ingest, multi-tag

Numbers measured against the Kubernetes 1.32.0 corpus (17,171 files, 38,771 symbols, 232,756 edges). See the Kubernetes audit case study for the full reproducible run.

05 What you get

Auditor-grade evidence, multi-framework, on a CI clock.

HIPAA Security Rule — §164.308 administrative safeguards (access-control walkthroughs), §164.312(a) technical safeguards (encryption-at-rest reachability), §164.312(c)(2) integrity (no PHI flows to unmasked sinks).
SOC 2 Type II — CC6 access (RBAC uniformity per endpoint, CSRF coverage on state-changing routes), CC7 system operations (incident response data flow).
PCI-DSS 4.0 — §3.4 PAN masking, §6 secure code (sink-to-source reachability for SQLi / XSS / SSRF), §10 audit logging (every payment-flow path emits to centralized logger).
FedRAMP Moderate / High — air-gapped deployment, no telemetry, no LLM in the retrieval path, evidence packs that satisfy 3PAO assessor expectations.
SOX § 404 IT general controls — change management proof (git-blame integrated), authorization controls (per-endpoint RBAC matrix), data integrity (financial-data sink reachability).
Reproducible evidence — every claim hashed, byte-identical re-runs, file:line citations, deterministic call-graph paths. No LLM, no stochasticity, no "the model said it was fine".
Complementary to Drata, Vanta, SecureFrame — they track policy attestation, we prove implementation. Most regulated teams export ArgosBrain evidence packs into their GRC platform.
Pairs with /anti-sast — same engine. SAST findings get reachability proofs; compliance controls get data-flow proofs. One ingest, two audits.
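The "byte-identical re-runs" property in the list above depends on serializing evidence canonically before hashing it. The following is a minimal sketch of that idea, not ArgosBrain's actual format. The field names are hypothetical, and `hashlib.blake2b` from the Python standard library stands in for blake3, which is not part of the standard library.

```python
import hashlib
import json


def canonical_bytes(evidence: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal data yields equal bytes."""
    text = json.dumps(evidence, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return text.encode("utf-8")


def evidence_digest(evidence: dict) -> str:
    """Hash the canonical form. blake2b is a stand-in for blake3 in this sketch."""
    return "blake2b:" + hashlib.blake2b(canonical_bytes(evidence), digest_size=32).hexdigest()


# Two runs that produce the same facts with different key order (hypothetical fields).
run_a = {"control": "PCI-DSS 3.4", "sinks_scanned": 47, "violations": 2}
run_b = {"violations": 2, "sinks_scanned": 47, "control": "PCI-DSS 3.4"}
run_c = {"control": "PCI-DSS 3.4", "sinks_scanned": 47, "violations": 0}

print(evidence_digest(run_a) == evidence_digest(run_b))  # True
print(evidence_digest(run_a) == evidence_digest(run_c))  # False
```

Lists inside the evidence would also need a stable order (for example, sorted by file and line) before hashing, since `sort_keys` only normalizes dictionary keys.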

06 FAQ

The questions every Compliance lead asks.

Does this replace Drata, Vanta, or SecureFrame?

No. GRC platforms like Drata and Vanta track POLICY (do you have an access-control policy written down). ArgosBrain proves IMPLEMENTATION (does the code actually enforce that policy). They are complementary — most regulated SaaS teams use both. Export ArgosBrain evidence packs into your GRC tool as supporting artifacts.

Which compliance frameworks are covered?

HIPAA Security Rule (§164.308 administrative, §164.312 technical), SOC 2 Type II Common Criteria (CC6 access, CC7 ops), PCI-DSS 4.0 (§3 cardholder data, §6 secure code, §10 logging), FedRAMP Moderate / High, SOX §404 IT general controls. Every output is framework-tagged in the evidence pack JSON. Custom internal frameworks supported via tagging extension.
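Because findings carry framework tags, a CI job could filter one evidence pack per framework and fail the build on open violations. The sketch below assumes a hypothetical pack shape (`findings`, `status`, `framework_tags`); the real evidence pack schema is not documented here and may differ.

```python
# Hypothetical evidence-pack shape, for illustration only.
SAMPLE_PACK = {
    "findings": [
        {"id": "F1", "status": "violation", "framework_tags": ["PCI-DSS 3.4", "SOC 2 CC6.7"]},
        {"id": "F2", "status": "violation", "framework_tags": ["PCI-DSS 3.4"]},
        {"id": "F3", "status": "unreachable", "framework_tags": ["PCI-DSS 3.4", "HIPAA §164.312(c)(2)"]},
    ]
}


def violations_for(pack: dict, framework_prefix: str) -> list[str]:
    """IDs of findings marked as violations that carry a tag starting with the framework name."""
    return [
        finding["id"]
        for finding in pack.get("findings", [])
        if finding.get("status") == "violation"
        and any(tag.startswith(framework_prefix) for tag in finding.get("framework_tags", []))
    ]


def ci_exit_code(pack: dict, framework_prefix: str) -> int:
    """Return 0 when no open violation carries the framework tag, 1 otherwise."""
    return 1 if violations_for(pack, framework_prefix) else 0


print(violations_for(SAMPLE_PACK, "PCI-DSS"))  # ['F1', 'F2']
print(violations_for(SAMPLE_PACK, "SOC 2"))    # ['F1']
print(ci_exit_code(SAMPLE_PACK, "HIPAA"))      # 0
```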

Will my auditor accept this as evidence?

Yes. Outputs are deterministic, reproducible, and contain file:line citations plus the exact call-graph path. Auditors prefer structural proof over written attestation — it's faster for them to validate. The evidence pack format mirrors AICPA SOC 2 sample templates and HHS OCR HIPAA audit protocol structures. Pre-validated with two compliance consulting firms; reach out via /contact for the partner list.

How does ArgosBrain know what counts as PII / PHI / cardholder data?

It traces structural data flow from any source you mark as sensitive — function parameter, struct field, request body key, DB column comment annotation, regex pattern. Most teams already maintain that mapping for GDPR Article 30 records of processing. The skill /argos-pii-flow-mapper auto-detects common patterns (email, SSN, credit card, JWT, etc.) and you tag custom domain types in 5 minutes.
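As an illustration of what pattern-based detection can look like, here is a minimal sketch that classifies sample values using regular expressions plus a Luhn checksum for card numbers. The patterns and the `classify` helper are assumptions for this example and do not describe how /argos-pii-flow-mapper works internally.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13-19 digits, optional spaces or dashes


def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, then sum modulo 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(value: str) -> set[str]:
    """Return the sensitive-data kinds whose pattern appears in the value."""
    kinds = set()
    if SSN_RE.search(value):
        kinds.add("ssn")
    if EMAIL_RE.search(value):
        kinds.add("email")
    for match in CARD_RE.finditer(value):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_ok(digits):
            kinds.add("card")
    return kinds


print(classify("ssn 123-45-6789"))            # {'ssn'}
print(classify("contact alice@example.com"))  # {'email'}
print(classify("pan 4111 1111 1111 1111"))    # {'card'}
print(classify("pan 4111 1111 1111 1112"))    # set()
```

The checksum step is what keeps arbitrary 16-digit identifiers from being tagged as card numbers; a regex alone would flag the last example.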

Air-gapped deployment for FedRAMP / regulated environments?

Yes. Local-first by default. Pro and Enterprise tiers support fully air-gapped deployment — no source code leaves the network, no LLM in the retrieval path, no telemetry. Free tier transmits no source code, no file paths, no query content. FedRAMP Moderate clients have gone fully air-gapped on Enterprise; case study available under NDA via /contact.

How fast on a large monorepo?

Sub-millisecond P99 retrieval after first ingest. Initial ingest of a 250k-LOC monorepo runs in under 90 seconds; the Kubernetes audit corpus (38,771 symbols, 232,756 edges) ingests in 4.2 seconds. Re-runs on subsequent audits skip unchanged files via blake3 content hash — typically 1-2 seconds for the deltas only.
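Skipping unchanged files by content hash can be sketched in a few lines. This is an illustrative model under stated assumptions, not the actual ingest code: files are held in memory as a path-to-bytes mapping, the cache is a plain dictionary, and `hashlib.blake2b` stands in for blake3, which is not in the Python standard library.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hash file contents. blake2b is a stand-in for blake3 in this sketch."""
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def changed_files(files: dict[str, bytes], cache: dict[str, str]) -> list[str]:
    """Return paths whose content hash differs from the cached one, and update the cache."""
    changed = []
    for path in sorted(files):
        digest = content_hash(files[path])
        if cache.get(path) != digest:
            changed.append(path)
            cache[path] = digest
    return changed


# Hypothetical repository snapshot: path -> file contents.
repo = {
    "api/payment.ts": b"export function charge() {}",
    "util/log.ts": b"export function emit() {}",
}
cache: dict[str, str] = {}

first_run = changed_files(repo, cache)   # everything is new on first ingest
second_run = changed_files(repo, cache)  # nothing changed, nothing to re-ingest
repo["util/log.ts"] = b"export function emit(masked: boolean) {}"
third_run = changed_files(repo, cache)   # only the edited file is returned

print(first_run)   # ['api/payment.ts', 'util/log.ts']
print(second_run)  # []
print(third_run)   # ['util/log.ts']
```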
