Papers

Three papers. One benchmark. One architecture.

We publish our work in the open so you can verify the numbers on this site without taking our word for it. Every paper here is CC BY 4.0, and the LongMemCode benchmark is MIT. The engine itself is commercial — but adapter stubs for competitors live in the benchmark repo, so anyone can reproduce the headline numbers. If something is wrong, the data to prove it is on GitHub.

01 The collection

Read the foundation first.

Paper 1 specifies the LongMemCode benchmark and reports baselines — it is the empirical anchor the other two papers cite. Paper 2 argues the structural-versus-semantic split as a design principle. Paper 3 describes Neurogenesis, the graph-first engine behind ArgosBrain, in sufficient detail to reproduce the retrieval behaviour.

Paper 1 · cs.SE / cs.AI

LongMemCode: A Deterministic Benchmark for Code-Memory in AI Agents

We introduce LongMemCode, a public benchmark for evaluating the retrieval component of memory systems used by AI coding agents. Existing benchmarks measure either conversational long-term memory (LongMemEval, LoCoMo) or end-to-end agent task success (SWE-bench); none isolates the retrieval quality, speed, and compression of a memory system under coding-agent workloads.

Read the paper →
Paper 2 · cs.SE / cs.IR

Structural vs Semantic Retrieval in Code-Memory: A Query-Type Taxonomy

We propose a taxonomy of retrieval queries for AI coding agents and argue that code-memory systems require separate treatment of two query classes rather than a unified retrieval layer. Structural queries admit exact answers derivable from a semantic graph of canonical identifiers; semantic queries are best served by vector retrieval over embedded code chunks.

Read the paper →
Paper 3 · cs.SE / cs.PL

Zero-Cost Graph Retrieval at Compiler-Grade Depth for AI Coding Agents

We describe Neurogenesis, a graph-first code-memory engine that answers structural retrieval queries for AI coding agents without any LLM call on the read path. The engine ingests source code into a canonical-identifier graph via a tiered pipeline that selects the highest-precision indexing technology available per language.

Read the paper →

Cite. All three are pre-print (April 2026). Use the arXiv identifiers once they are assigned; until then, cite the title, the author (Aurelian Jibleanu, Neurogenesis), the year (2026), and the canonical URL on this site. A BibTeX block will be published alongside each paper once the arXiv IDs land.
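Following that guidance, a pre-arXiv citation might look like the BibTeX sketch below. This is illustrative only: the entry key, entry type, and field layout are our assumptions, and the canonical URL is deliberately left as a placeholder until the official BibTeX blocks are published.

```bibtex
@misc{jibleanu2026longmemcode,
  title        = {LongMemCode: A Deterministic Benchmark for Code-Memory in AI Agents},
  author       = {Jibleanu, Aurelian},
  year         = {2026},
  howpublished = {Pre-print, Neurogenesis},
  note         = {Canonical URL on this site; arXiv identifier pending}
}
```

Once the arXiv IDs land, swap the `@misc` entry for the published block and add the `eprint` field.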

02 Reproduce

The benchmark is MIT. The engine is commercial.

Clone LongMemCode, plug in your own adapter, and you will reproduce the baseline and grep numbers in Paper 1 on your own laptop. The structural reference adapter points to a running ArgosBrain instance — the binary is ours, the protocol and the benchmark are open.
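The adapter surface is not specified on this page, so the following Python sketch is purely illustrative of what "plug in your own adapter" could mean: a small interface that indexes a repository and answers retrieval queries, plus a trivial grep-style baseline. The class names, method names, and result type are assumptions, not the benchmark's actual API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class RetrievalResult:
    path: str      # file the snippet came from
    snippet: str   # retrieved source text
    score: float   # adapter-defined relevance score


class MemoryAdapter(ABC):
    """Hypothetical adapter surface: index a repo, then answer queries."""

    @abstractmethod
    def index(self, files: dict[str, str]) -> None:
        """Ingest a mapping of file path -> source text."""

    @abstractmethod
    def query(self, text: str, k: int = 5) -> list[RetrievalResult]:
        """Return the top-k results for a retrieval query."""


class GrepAdapter(MemoryAdapter):
    """Trivial baseline: rank files by raw substring-match count."""

    def __init__(self) -> None:
        self._files: dict[str, str] = {}

    def index(self, files: dict[str, str]) -> None:
        self._files = dict(files)

    def query(self, text: str, k: int = 5) -> list[RetrievalResult]:
        hits = [
            RetrievalResult(path, src, float(src.count(text)))
            for path, src in self._files.items()
            if text in src
        ]
        hits.sort(key=lambda r: r.score, reverse=True)
        return hits[:k]
```

A real adapter would replace `GrepAdapter` with calls into the memory system under test; the harness only needs the `index`/`query` contract to score it.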