Open Research Project

Don't trust the answer.
Verify the failure.

VerifiQuant is a 6-layer diagnostic pipeline for financial computations. Instead of chasing higher LLM accuracy, it delivers predictable, verifiable failure behavior — intercepting broken inferences at the right layer and applying mathematically verified repairs.

The M/N/F/E/I/C Funnel

Six layers of defense. Each gate intercepts a specific class of failure before it reaches the user.

M

Mismatch

No relevant Financial Inference Contract found. The question is out of scope — refuse cleanly rather than hallucinate.

N

Novel / OOD

The question is partially matched but falls outside the training distribution. Flag the uncertainty instead of guessing.

F

Field Missing

Required inputs are absent. Ask the user to provide them rather than inventing values.

E

Execution Error

Code runs but produces invalid results — scale violations, NaN, infinity. Deterministic checks catch what LLMs miss.
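A deterministic check of this kind can be sketched in a few lines of Python. This is only an illustration, not VerifiQuant's actual implementation: the function name `check_execution_result` and the `lower`/`upper` bounds are assumptions made for the example.

```python
import math

def check_execution_result(value, lower, upper):
    """Return a list of violation labels; an empty list means the value passed.

    Hypothetical helper: the labels and the bounds are illustrative only.
    """
    # Reject anything that is not a plain number (bool is excluded on purpose).
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return ["not_numeric"]
    violations = []
    if math.isnan(value):
        violations.append("nan")
    elif math.isinf(value):
        violations.append("infinity")
    elif not (lower <= value <= upper):
        # A finite value outside the plausible range for this quantity.
        violations.append("scale_violation")
    return violations

# A rate entered as 5 instead of 0.05 trips the scale check.
print(check_execution_result(5.0, lower=-1.0, upper=1.0))    # ['scale_violation']
print(check_execution_result(float("nan"), -1.0, 1.0))       # ['nan']
print(check_execution_result(0.05, -1.0, 1.0))               # []
```

Because the check is a pure function of the computed value, it gives the same verdict on every run, independent of the model that produced the code.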

I

Interpretation Ambiguity

Multiple valid interpretations exist (e.g., annuity-due vs. ordinary annuity). Surface the choice with verifiable transform specs.
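To make the annuity example concrete, the Python sketch below computes the present value under both readings and cross-checks each closed-form result against a brute-force sum of discounted cash flows. The function names and sample inputs (100 per period, 5% rate, 10 periods) are assumptions for illustration, not part of VerifiQuant.

```python
import math

def pv_closed_form(payment, rate, periods, due=False):
    """Present value of a level annuity using the textbook closed form."""
    if rate == 0:
        pv = payment * periods
    else:
        pv = payment * (1 - (1 + rate) ** -periods) / rate
    # An annuity-due pays one period earlier, so it is worth (1 + rate) more.
    return pv * (1 + rate) if due else pv

def pv_by_discounting(payment, rate, periods, due=False):
    """Independent check: discount every cash flow one at a time."""
    first = 0 if due else 1  # annuity-due pays at t=0, ordinary at t=1
    return sum(payment / (1 + rate) ** t for t in range(first, first + periods))

candidates = {}
for label, due in (("ordinary", False), ("annuity_due", True)):
    closed = pv_closed_form(100.0, 0.05, 10, due)
    brute = pv_by_discounting(100.0, 0.05, 10, due)
    # The two methods must agree before a candidate is shown to the user.
    assert math.isclose(closed, brute, rel_tol=1e-9)
    candidates[label] = round(closed, 2)

print(candidates)  # {'ordinary': 772.17, 'annuity_due': 810.78}
```

Both answers are valid, and they differ by exactly the factor (1 + rate), so the choice is presented to the user rather than made silently.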

C

Correct

All gates passed. The computation is verified against invariants and delivered with full provenance.
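The funnel can be pictured as an ordered list of gates in which the first gate to object decides the outcome. The Python sketch below is a rough illustration under stated assumptions: the `Request` fields, the gate functions, and `run_funnel` are hypothetical names, and only the M, N, and F gates are modeled, with E and I left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Request:
    # All fields are hypothetical stand-ins for whatever the real system tracks.
    question: str
    contract: Optional[str]            # matched contract id, or None
    in_distribution: bool = True
    required: tuple = ()
    inputs: dict = field(default_factory=dict)

# A gate returns a reason string when it intercepts, or None to pass.
Gate = Callable[[Request], Optional[str]]

def gate_m(req: Request) -> Optional[str]:
    return None if req.contract else "no matching contract; refuse"

def gate_n(req: Request) -> Optional[str]:
    return None if req.in_distribution else "out of distribution; flag uncertainty"

def gate_f(req: Request) -> Optional[str]:
    missing = [name for name in req.required if name not in req.inputs]
    return f"missing inputs: {', '.join(missing)}" if missing else None

# E and I gates would follow here once the computation has been executed.
GATES: list = [("M", gate_m), ("N", gate_n), ("F", gate_f)]

def run_funnel(req: Request) -> tuple:
    """Return (layer code, reason); the first intercepting gate wins."""
    for code, gate in GATES:
        reason = gate(req)
        if reason is not None:
            return code, reason
    return "C", "all modeled gates passed"

req = Request("PV of a bond?", contract="bond_pv",
              required=("coupon", "rate", "years"),
              inputs={"coupon": 50.0, "rate": 0.04})
print(run_funnel(req))  # ('F', 'missing inputs: years')
```

Ordering the gates means each failure is reported by exactly one layer, which is what makes the failure behavior predictable.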

Benchmark Results

50-question medium-difficulty set, Gemini 2.5 Flash, April 2025

System                         Accuracy        Notes
JPMorgan Multi-Agent (paper)   0.46 Pass@1     Published baseline
CoT Self-Improve (no GT)       0.90 (45/50)    Chain-of-thought with self-correction
VerifiQuant Framework-Guided   0.88 (44/50)    28% recovery rate on initially failed cases

VerifiQuant's advantage is in failure-mode behavior on trap and incomplete questions, not clean-question accuracy, where it scores slightly below the CoT baseline (0.88 vs. 0.90). The system knows when and why it fails.

See it in action

Submit a financial question and watch the 6-layer funnel classify, diagnose, and repair in real time.

Open Demo Console