Open Research Project

Don't trust the answer.
Verify the failure.

VerifiQuant is a 6-layer diagnostic pipeline for financial computations. Instead of chasing higher LLM accuracy, it delivers predictable, verifiable failure behavior — intercepting broken inferences at the right layer and applying mathematically verified repairs.

The M/N/F/E/I/C Funnel

Six layers of defense. Each of the first five gates intercepts a specific class of failure before it reaches the user; the sixth marks a computation that passed them all.

Mismatch

No relevant Financial Inference Contract found. The question is out of scope — refuse cleanly rather than hallucinate.

Novel / OOD

The question partially matches a contract but falls outside the training distribution. Flag uncertainty instead of guessing.

Field Missing

Required inputs are absent. Ask the user to provide them rather than inventing values.
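
A minimal sketch of this gate, assuming a contract declares its required inputs as a set of field names. REQUIRED_FIELDS, missing_fields, and field_gate are hypothetical names chosen for illustration, not VerifiQuant's actual API.

```python
# Hypothetical required inputs for a present-value contract (illustrative only).
REQUIRED_FIELDS = {"payment", "rate", "periods"}


def missing_fields(inputs: dict) -> list[str]:
    """Return required field names that are absent or None, sorted for stable output."""
    return sorted(name for name in REQUIRED_FIELDS if inputs.get(name) is None)


def field_gate(inputs: dict) -> str:
    """Ask for missing inputs instead of inventing values."""
    missing = missing_fields(inputs)
    if missing:
        return "Please provide: " + ", ".join(missing)
    return "OK"
```

The check compares against None rather than truthiness, so a legitimate zero (for example a 0% rate) is not mistaken for a missing value.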

Execution Error

The code runs but produces invalid results such as scale violations, NaN, or infinity. Deterministic checks catch what LLMs miss.
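
A deterministic check of this kind might look like the sketch below. It assumes a "scale violation" means a value outside a plausible range supplied by the caller (for example, a rate returned as 5 instead of 0.05); check_result and its parameters are hypothetical.

```python
import math


def check_result(value: float, lower: float, upper: float) -> list[str]:
    """Return a list of violations; an empty list means the result passed."""
    if math.isnan(value):
        return ["result is NaN"]
    if math.isinf(value):
        return ["result is infinite"]
    # Assumed definition of a scale violation: outside the caller's expected range.
    if not (lower <= value <= upper):
        return [f"result {value} outside expected range [{lower}, {upper}]"]
    return []
```

NaN is tested first because every ordered comparison against NaN is false, so a range check written as value < lower or value > upper would let it through.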

Interpretation Ambiguity

Multiple valid interpretations exist (e.g., annuity-due vs. ordinary annuity). Surface the choice with verifiable transform specs.
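
To see why the annuity case is ambiguous, the sketch below computes present value under both readings using the standard formulas. The function names are illustrative, and treating the (1 + rate) factor as the "transform" between interpretations is an assumption about what a transform spec could encode.

```python
def pv_ordinary(payment: float, rate: float, periods: int) -> float:
    """Present value with payments at the end of each period."""
    if rate == 0:
        return payment * periods
    return payment * (1 - (1 + rate) ** -periods) / rate


def pv_due(payment: float, rate: float, periods: int) -> float:
    """Present value with payments at the start of each period."""
    # Each payment arrives one period earlier, so the whole stream is worth (1 + rate) times more.
    return pv_ordinary(payment, rate, periods) * (1 + rate)
```

For a payment of 100 per period at 5% over 10 periods, the two readings give roughly 772.17 and 810.78, a gap large enough that the choice should be shown to the user rather than picked silently.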

Correct

All gates passed. The computation is verified against invariants and delivered with full provenance.
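
The funnel as a whole can be sketched as an ordered list of gates in which the first gate that fires decides the outcome. This assumes the gates run in the order M, N, F, E, I and that falling through all of them means Correct; Outcome, run_funnel, and TOY_GATES are hypothetical names, and the two toy gates stand in for the real checks.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    MISMATCH = "M"
    NOVEL = "N"
    FIELD_MISSING = "F"
    EXECUTION_ERROR = "E"
    INTERPRETATION_AMBIGUITY = "I"
    CORRECT = "C"


# A gate inspects the request state and returns True when it fires.
Gate = Callable[[dict], bool]


def run_funnel(state: dict, gates: list[tuple[Outcome, Gate]]) -> Outcome:
    """Return the outcome of the first gate that fires, else CORRECT."""
    for outcome, fires in gates:
        if fires(state):
            return outcome
    return Outcome.CORRECT


# Toy gates for illustration only; real gates would wrap the checks described above.
TOY_GATES: list[tuple[Outcome, Gate]] = [
    (Outcome.MISMATCH, lambda s: s.get("contract") is None),
    (Outcome.FIELD_MISSING, lambda s: bool(s.get("missing"))),
]
```

Returning a single enum value keeps the failure behavior predictable: every request ends in exactly one of the six labels.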

Benchmark Results

50-question medium-difficulty set, Gemini 2.5 Flash, April 2025

System	Accuracy	Notes
JPMorgan Multi-Agent (paper)	0.46 Pass@1	Published baseline
CoT Self-Improve (no GT)	0.90 (45/50)	Chain-of-thought with self-correction
VerifiQuant Framework-Guided	0.88 (44/50)	28% recovery rate on initially failed cases

VerifiQuant's advantage is in failure-mode behavior on trap and incomplete questions, not clean-question accuracy. The system knows when and why it fails.

See it in action

Submit a financial question and watch the 6-layer funnel classify, diagnose, and repair in real time.
