Why we ran this
VS Code is the most-used code editor on Earth. Microsoft ships it as MIT
open source, the repo carries roughly twelve thousand source files across
TypeScript, JavaScript, CSS, and a Rust CLI, and it has been fork-bombed
into Cursor, Windsurf, Trae, and a half-dozen other AI-IDE derivatives.
It is the natural successor to our Kubernetes-scale stress test: same
"industrial codebase" target, but in TypeScript, with a different sink
surface, and with a known-hard mix of security-relevant patterns
(inter-process renderers, shell-command spawns, fetch URL plumbing,
WebView innerHTML sinks).
We did not run this scan to find vulnerabilities. Microsoft has a dedicated security team, an active bug bounty, and at least four other static-analysis tools running in CI. We ran it to test what ArgosBrain is actually for: separating "scary candidates" from "actually reachable from untrusted input" at industrial scale, in seconds, for cents.
The corpus
Numbers, all reproducible by ingesting the same commit:
- github.com/microsoft/vscode @ 1fa1b7af5c190606cdd5e8fe5e5f1ca4fad47e00 (main, 2026-04-25)
- ~12,000 source files (TypeScript dominant, with CSS, Rust in cli/, JS in renderer prelude)
- 151,620 symbols ingested into the brain
- 25 sink categories scanned in this pass: SSRF, XSS, SQLi, RCE, command injection, path traversal, deserialisation, XXE, LDAP, open redirect, buffer overflow, regex DoS, timing attack, crypto IV reuse, weak crypto, hardcoded secret, cloud API key, private key block, prototype pollution, TLS verification disabled, insecure random, unsafe Rust, CORS wildcard, cookie insecure, JWT none
The result, in one chart
The "high" bucket comes from heuristic exploit-score signals (caller fanout, argument shape) without a proven untrusted source. The reachability pass then walks the call graph upward from each sink, to a depth of 8, looking for tainted-source markers (HTTP handler input, CLI args, file content, env var read). Of the ~340 sink hits across all categories, zero were structurally reachable from such a source within depth 8.
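ArgosBrain's traversal is internal to the tool, but the idea — a breadth-first walk over reverse call edges, bounded at depth 8, terminating when a tainted-source marker is hit — can be sketched like this (all type and function names are hypothetical, not the real API):

```typescript
type SymbolId = string;

interface CallGraph {
  // callersOf.get(fn) = functions that call fn (reverse call edges)
  callersOf: Map<SymbolId, SymbolId[]>;
  // symbols marked as untrusted-input sources (HTTP handlers, argv, file reads, env)
  taintedSources: Set<SymbolId>;
}

// Walk upward from a sink; true if some caller chain of at most
// maxDepth hops originates at a tainted source.
function reachableFromUntrusted(
  g: CallGraph,
  sink: SymbolId,
  maxDepth = 8,
): boolean {
  const seen = new Set<SymbolId>([sink]);
  let frontier: SymbolId[] = [sink];
  for (let depth = 0; depth < maxDepth; depth++) {
    const next: SymbolId[] = [];
    for (const fn of frontier) {
      for (const caller of g.callersOf.get(fn) ?? []) {
        if (seen.has(caller)) continue;
        if (g.taintedSources.has(caller)) return true; // tainted path found
        seen.add(caller);
        next.push(caller);
      }
    }
    frontier = next;
  }
  return false; // no untrusted source within maxDepth hops
}
```

A sink whose callers never trace back to a tainted source within eight hops lands in the "high but unproven" bucket rather than "confirmed reachable".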
Where the 156 "high" came from (the noise breakdown)
The honest part of running a security tool is showing the false-positive profile. Almost none of these 156 are actionable. The triage report classified them at the time of scan:
| Category | High | Why "high" but not actionable |
|---|---|---|
| Insecure random | 58 | Math.random() in tests, animations, sampling — not used for tokens, sessions, or keys |
| SSRF | 42 | fetch() with hardcoded service endpoints, mostly in proprietary Copilot extension (disclosed privately to MSRC, see below) |
| XSS | 19 | .innerHTML in trusted-source renderers (notebook cell output, ghost-text). Includes a hit on the bundled dompurify.js file itself — the sanitiser, not a sink |
| Cloud API key | 16 | Mailgun-style key- prefix matched non-secret identifiers (secretStorageKeyPath, settings keys, charCode enums). The ghp_…, AKIA…, sk-ant-… hits were all in secretFilter.spec.ts — intentional fixtures for the secret-redaction filter test |
| Unsafe Rust | 16 | All in cli/src/ — Win32 API bindings, glibc version probe, file metadata. unsafe is required for FFI, and each block has a SAFETY: comment |
| Prototype pollution | 2 | __proto__ references in config merge() and debug glue — likely guards rather than write paths |
| TLS verification disabled | 1 | rejectUnauthorized: false in a proxy debug helper (Copilot extension) |
| Hardcoded secret | 1 | accessToken: 'gho_mock_e2e_test_token_…' in an end-to-end mock auth fixture, explicitly named e2e-mock |
| Weak crypto | 1 | MD5/SHA1 in an external ingest client — almost certainly content-addressing, not auth |
| TOTAL | 156 | 0 confirmed reachable from untrusted input |
The triage value is the second column. A naïve grep -rn 'innerHTML'
surfaces every one of the 19 XSS hits flat, including the sanitiser file.
ArgosBrain emits the same finding but adds the structural context that
lets a human (or an agent) say "DOMPurify is a sanitiser callsite, not a
sink — discard". Same for the secret-filter test fixtures and the FFI
unsafe blocks.
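That discard logic is straightforward to express as a post-filter over raw findings. A minimal sketch under assumed rules — the path heuristics mirror the triage table above; this is not ArgosBrain's real schema:

```typescript
interface Finding {
  file: string;    // repo-relative path of the hit
  pattern: string; // e.g. "innerHTML", "Math.random", "unsafe"
}

// Structural-context discard rules mirroring the triage table:
// sanitiser bundles, test fixtures, and Rust FFI are context, not sinks.
function triage(f: Finding): "keep" | "discard" {
  if (/dompurify/i.test(f.file)) return "discard";         // the sanitiser itself
  if (/\.(spec|test)\.ts$/.test(f.file)) return "discard"; // intentional fixtures
  if (f.pattern === "unsafe" && f.file.startsWith("cli/src/")) {
    return "discard";                                      // FFI with SAFETY: comments
  }
  return "keep"; // everything else goes to a human reviewer
}
```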
The three MIT-core candidates worth a look
Three findings landed in src/ (MIT-licensed VS Code core).
Reachability says zero of them have an untrusted-source path within
depth 8, but the patterns are interesting enough that we publish them
here for the next person doing a manual review. None of these are
vulnerability claims — they are starting points for a human reviewer
who can confirm taint flow with field-level analysis tools (Semgrep
Pro, CodeQL).
| Pattern | File / range | Why ArgosBrain flagged it |
|---|---|---|
| SSRF candidate | extHostMcp.ts:331-878 (McpHTTPHandle) | Outbound HTTP wrapper. URL host is parameterised; SSRF risk only if the host is user-controllable across a process boundary. Worth verifying the input chain. |
| XSS candidate | webviewPreloads.ts (notebook renderer prelude) | 3,000+ lines of WebView prelude with multiple .innerHTML sites. Notebook output is normally trusted (kernel-controlled), but the surface is large enough to merit a sanitisation audit. |
| Prototype pollution | configuration.ts:337-351 (config merge()) | __proto__ referenced inside a recursive merge. If the function does not reject __proto__ as a key, untrusted JSON config could mutate Object.prototype. Worth a unit test that asserts the guard. |
Each file reference is pinned to the exact commit so the line numbers stay
valid even as the file evolves on main. Microsoft's CodeQL pipeline almost
certainly already covers these patterns; the value of running ArgosBrain
separately is the speed of triage when an outsider wants a
same-day answer to "is this codebase critically vulnerable?" without
paying for a SAST seat.
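For the prototype-pollution candidate specifically, the suggested unit test is cheap to write. A hypothetical guarded merge — not VS Code's actual configuration.ts code — with the assertion a reviewer would want:

```typescript
// Keys that must never be copied from untrusted JSON into a merge target.
const BLOCKED = new Set(["__proto__", "constructor", "prototype"]);

function safeMerge(
  target: Record<string, unknown>,
  source: Record<string, unknown>,
): Record<string, unknown> {
  for (const key of Object.keys(source)) {
    if (BLOCKED.has(key)) continue; // the guard the unit test should assert on
    const s = source[key];
    const t = target[key];
    if (s && t && typeof s === "object" && typeof t === "object" &&
        !Array.isArray(s) && !Array.isArray(t)) {
      safeMerge(t as Record<string, unknown>, s as Record<string, unknown>);
    } else {
      target[key] = s;
    }
  }
  return target;
}

// The check the unit test should make: merging attacker-controlled JSON
// must not mutate Object.prototype. (JSON.parse creates "__proto__" as an
// own property, so Object.keys() does see it — the guard must skip it.)
safeMerge({}, JSON.parse('{"__proto__": {"polluted": true}}'));
```

If a fresh `{}` reports a defined `polluted` property after that call, the guard is missing and the finding is real.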
What we are not publishing — the responsible-disclosure boundary
Five named high-severity surfaces in the full report were inside
extensions/copilot/* (three SSRF candidates in HTTP API
clients, one TLS-verification-disabled in a logging proxy helper, one
weak-crypto candidate in a workspace-search ingest client). That code
ships with the proprietary GitHub Copilot extension, not with VS Code
OSS — it is Microsoft IP under a separate licence.
Publishing file:line excerpts from proprietary code, even when our findings are candidates rather than confirmed exploits, is the kind of move that gets a security writeup pulled. So we did the responsible thing:
- The five Copilot-extension findings are being submitted privately to the Microsoft Security Response Center.
- The aggregate metrics in this writeup (151,620 symbols, 156 high, $0.30, 8 seconds) include those findings — they are part of the scan.
- We do not name the files, do not link them on GitHub, and do not excerpt them. If MSRC concludes any of them are real, Microsoft will publish; we will not pre-empt that.
This is the boundary that distinguishes "marketing-with-a-fig-leaf" from "responsible engineering writeup". ArgosBrain's strongest pitch is "we make security review fast and cheap", and that pitch only survives if the tool's operators behave like a reviewer, not a pentester-on-a-clout-arc.
Cost and wall-clock
The cost story matters because the alternative on a 12,000-file codebase is not actually free. A naïve approach — grep for every dangerous pattern, dump the matched files into an LLM, ask it to triage — burns 6-12 million tokens for the kind of sweep ArgosBrain does in 95,000 tokens. At Opus 4.7's prompt rate, that is the difference between thirty cents and forty dollars. Multiplied by every CI run, every PR, every release branch, the gap compounds.
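The arithmetic behind those figures, with the per-token rate treated as an assumption — it is backed out of the $0.30 / 95,000-token numbers above, and actual provider pricing may differ:

```typescript
// Rate implied by the scan's own numbers: $0.30 for 95k prompt tokens.
// Assumed for illustration; real per-model pricing varies.
const ratePerMTok = 0.30 / 0.095;        // ≈ $3.16 per million tokens
const argosCost   = 0.095 * ratePerMTok; // 95k tokens → ≈ $0.30
const naiveLow    = 6 * ratePerMTok;     // 6M tokens  → ≈ $19
const naiveHigh   = 12 * ratePerMTok;    // 12M tokens → ≈ $38, i.e. "forty dollars"
```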
Reproduce it yourself
```sh
# 1. Install ArgosBrain (Free tier, no credit card)
curl -fsSL https://argosbrain.com/install | sh

# 2. Clone VS Code at the same commit
git clone https://github.com/microsoft/vscode.git
cd vscode
git checkout 1fa1b7af5c190606cdd5e8fe5e5f1ca4fad47e00

# 3. Initialise ArgosBrain on this project
argosbrain init

# 4. From your Cursor / Claude Code chat, run:
/argos-security-reviewer

# Expected: the 0/156/180+ severity buckets, within ±2% variance.
# Wall-clock 8-12 s on M-series laptops. Cost ~$0.30 in tokens.
```
What this scan does and does not measure
The structural reachability pass measures call-graph paths from sources (HTTP handler input, CLI argv, file content, env vars) to sinks (the dangerous patterns enumerated in the corpus section). It does not measure:
- Field-level taint flow. A reachable call graph does not prove tainted data flows along it — a sanitiser on the path may neutralise the risk. Pair with Semgrep Pro or CodeQL for that.
- Dynamic dispatch / reflection. Code paths that resolve at runtime (DI containers, eval-constructed callables, monkey-patched modules) are invisible to the static call graph.
- Cross-process trust boundaries. VS Code's renderer / extension host / main process split is a real authorisation boundary; the scan treats it as one graph.
- Confirmed exploits. Every "high" in this report is a candidate. The value is the speed of narrowing the candidates, not declaring real CVEs.
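The dynamic-dispatch blind spot is worth one concrete illustration: a static call graph sees the single indexed call below, but cannot resolve which registered function it reaches, so any taint flowing through it is lost.

```typescript
// A handler registry resolved at runtime. handlers[name](input) is one
// static edge; which function actually runs depends on data.
const handlers: Record<string, (input: string) => string> = {};

function register(name: string, fn: (input: string) => string): void {
  handlers[name] = fn;
}

register("echo", (s) => s);
register("upper", (s) => s.toUpperCase());

function dispatch(name: string, input: string): string {
  // Invisible to a static call graph: the callee is chosen by `name`,
  // which may itself be attacker-influenced.
  return handlers[name](input);
}
```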
Try it
One command, sixty seconds:
```sh
curl -fsSL https://argosbrain.com/install | sh
```
Free tier ships with every retrieval feature, every sink scanner, and every skill in the catalogue — for one active project at a time, no node cap. Upgrade to Pro ($19/month) for unlimited active projects. Pricing on the homepage.
Authors: ArgosBrain Team ·
Date: 2026-04-27 ·
License: CC BY 4.0 ·
Corpus: microsoft/vscode @ 1fa1b7a (MIT) ·
Disclosure: findings inside extensions/copilot/* submitted privately to Microsoft Security Response Center