Huh? Fugazi?
"Fugazi" is slang for fake, not real, or worthless. The term was popularized in finance circles by a memorable scene in The Wolf of Wall Street where it's used to describe something that doesn't actually exist. Watch the clip (NSFW).
Standard equity analysis assumes you can trust the numbers. You debate valuation, growth rates, competitive position — but you're all working from the same financial statements, and you trust they reflect reality.
What if you can't?
What if "adjusted EBITDA" is adjusted to hide structural losses? What if related-party transactions are propping up revenue? What if the auditor is getting nervous but hasn't resigned yet? What if the company's survival depends on perpetual market access that could evaporate next quarter?
This question is epistemologically prior to valuation. If you can't trust the numbers, any DCF built on them is meaningless.
Why This Isn't Just "Short-Seller Research"
There's a genre of financial content that could be called "fraud porn" — breathless takedowns that assume every accounting complexity is evidence of malfeasance. That's not what we're doing.
But there's also a failure mode in standard analysis: treating SEC filings as gospel because "they're audited" and "there are laws." Both of these are true, and also insufficient. Enron was audited. Wirecard was audited. Auditors catch most things; they don't catch everything.
The Dual-Axis Innovation
Early versions of this framework made a mistake: they conflated two different problems.
- Integrity Risk: Can we trust the reported numbers? Is the company misrepresenting its economics?
- Fragility Risk: Can the company survive stress? Could a funding freeze or covenant breach cause collapse?
A transparent but levered company is not the same as a suspected fraud on solid funding. The action implications are completely different.
This is why the framework outputs two risk axes, not one. The combination tells you what you're dealing with.
The Evidence Ladder
One of our biggest design problems was: when do concerns become a pattern?
Early versions required "hard signals" — restatements, auditor resignations, SEC actions. The problem? By the time you have those, you've missed most of the move. The framework was theoretically useful but practically late.
The Evidence Ladder solves this with graduated evidence levels:
- E0: Ratios and thresholds only. No documentary support. Can't trigger a pattern by itself.
- E1: Footnote inconsistencies, metric definition changes, unexplained disclosures. Can trigger a pattern with 2+ dimensions.
- E2: Third-party divergence (web traffic data, customer complaints, lender filings that don't match company claims). Can trigger a pattern with 1+ dimension.
- E3: Restatement, auditor resignation, SEC action, court finding. A single E3 signal is significant on its own.
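The ladder's trigger rules can be sketched in a few lines of Python. This is our illustrative rendering, not the framework's implementation; the `Evidence` names mirror the levels above, and `signals` pairs each signal with the dimension it belongs to:

```python
from enum import IntEnum

class Evidence(IntEnum):
    """Graduated evidence levels from the ladder (E0 weakest, E3 strongest)."""
    E0 = 0  # ratios/thresholds only, no documentary support
    E1 = 1  # documentary inconsistencies (footnotes, metric redefinitions)
    E2 = 2  # third-party divergence (web traffic, complaints, lender filings)
    E3 = 3  # hard signals (restatement, auditor resignation, SEC action)

def triggers_pattern(signals):
    """signals: list of (dimension, Evidence) pairs.
    Returns True if the combined evidence can trigger a pattern."""
    def dims_at(level):
        # count distinct dimensions with evidence at or above `level`
        return len({d for d, e in signals if e >= level})

    if dims_at(Evidence.E3) >= 1:   # a single hard signal stands on its own
        return True
    if dims_at(Evidence.E2) >= 1:   # third-party divergence in 1+ dimension
        return True
    if dims_at(Evidence.E1) >= 2:   # documentary signals in 2+ dimensions
        return True
    return False                    # E0 alone can never trigger a pattern
```

Note that the check is monotone: stronger evidence also counts toward the weaker tiers, so one E3 signal plus one E1 signal satisfies the "E1 in 2+ dimensions" rule as well.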
The Analysis Structure
The framework runs through a structured sequence. Here's the short version:
Tier 0: Killer Signal Screen
Binary checks before analysis. Auditor resigned? Going concern? Late filings? These create classification floors — if the auditor just resigned, no amount of detailed analysis should produce "Low Risk."
Stage 0: Eligibility & Allegation Mapping
Confirm framework fit. Map any existing allegations to testable claims. Identify complex structures that trigger specialized modules.
Stage 1: Forensic Accounting Scan
Does cash track earnings? Is revenue recognition aggressive? Are non-GAAP adjustments large and structural?
Stage 2: Balance Sheet & Funding
Hidden leverage? Restricted cash? Covenant pressure? How many quarters of runway at current burn?
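For a human reader, the runway question reduces to simple arithmetic (the framework itself keeps this qualitative for the LLM; variable names here are ours):

```python
def runway_quarters(cash_and_equivalents, quarterly_cash_burn):
    """Quarters of runway at the current burn rate.
    quarterly_cash_burn > 0 means a net cash outflow per quarter."""
    if quarterly_cash_burn <= 0:
        return float("inf")  # cash-flow positive: no burn-driven runway limit
    return cash_and_equivalents / quarterly_cash_burn
```

A company with $600M of cash burning $150M per quarter has four quarters of runway, which is why Stage 2 pairs burn rate with funding-access questions: runway only matters if the market won't refill the tank.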
Stage 3: Governance & Incentives
Auditor signals, insider behavior, related-party exposure. How does management respond to critics?
Checkpoint A: Pattern Coherence
Do the signals form a connected story? What's the highest evidence level achieved? Is there a plausible mechanism?
Stage 4: Devil's Advocate
Force the bull case. This stage has binding impact — a strong bull case caps Integrity Risk at Elevated.
Stage 5: Dual-Axis Classification
Synthesize into final output: Integrity Risk × Fragility Risk, with optional pricing assessment.
The Classification Output
The framework outputs two risk axes, each with four levels:
- Integrity Risk: can we trust the numbers?
- Fragility Risk: can it survive stress?
Each classification comes with explicit falsifiers — specific conditions that would prove the thesis wrong. If you can't articulate what would change your mind, you probably haven't thought hard enough.
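The final output, including the Stage 4 bull-case cap described earlier, can be sketched as a small record. The post names "Elevated" and "Probable Misrepresentation" as integrity levels; the other labels and the ordering are our placeholders:

```python
from dataclasses import dataclass, field

# Assumed ordering of the four integrity levels, lowest to highest risk.
INTEGRITY_ORDER = ["Low", "Elevated", "High", "Probable Misrepresentation"]

@dataclass
class Classification:
    integrity_risk: str   # can we trust the numbers?
    fragility_risk: str   # can the company survive stress?
    falsifiers: list = field(default_factory=list)  # conditions that would disprove the thesis

def cap_for_bull_case(cls, bull_case_strong):
    """Stage 4's binding rule: a strong bull case caps Integrity Risk at Elevated."""
    cap = INTEGRITY_ORDER.index("Elevated")
    if bull_case_strong and INTEGRITY_ORDER.index(cls.integrity_risk) > cap:
        cls.integrity_risk = "Elevated"
    return cls
```

Making falsifiers a required part of the record, rather than an afterthought, is the point: a classification without stated disproof conditions isn't a finished classification.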
How We Built It
We ran this through our AI ensemble process: multiple personas with different cognitive styles generating independent proposals, then critiquing each other, then synthesizing and repeating.
Three rounds of iteration taught us three things:
Over-engineering is the default failure mode. Every model's initial proposal was too complex — too many stages, scores, gates. We kept cutting until the framework was actually usable.
LLMs design prompts requiring calculations they can't do. Early versions asked the model to "calculate Beneish M-Score" or "compute CFO/NI ratio over 5 years." These are calculation tasks LLMs will hallucinate rather than refuse. Everything is qualitative now.
Numeric thresholds create false precision. "CFO less than 50% of NI" sounds rigorous but is an arbitrary threshold that will be wrong half the time. We switched to qualitative assessments with integrated exceptions.
The best critique came from Gemini:
"The framework is conceptually 9/10 but operationally 4/10. It asks the LLM to act as a quant rather than a forensic linguist."
That was the insight that led to v1.1. LLMs are excellent at pattern recognition, textual analysis, synthesizing qualitative signals. They're terrible at precise calculations from noisy documents. The framework now leans into what they're good at.
What This Framework Won't Do
We're trying to be honest about limitations:
- It won't prove fraud. The framework surfaces risk signals, not proof. Even "Probable Misrepresentation" is an assessment of risk, not an allegation of wrongdoing.
- It won't predict timing. A high-risk classification might collapse in 3 months or persist for 3 years. We identify risk, not when it materializes.
- It won't work for banks or biotech. Banks have different fraud/solvency anatomy. Pre-revenue biotech is about clinical science, not forensic accounting.
- Thresholds are heuristic. We'll calibrate after running 10+ analyses with outcome tracking. The validation infrastructure exists; the data doesn't yet.
What's Next
We're running v1.1 on a pilot batch of equities — including Carvana, which was the case that motivated building this in the first place. We'll track:
- Whether the dual-axis output produces defensible distinctions
- Where the LLM struggles or improvises
- What signals get missed or over-weighted
- Whether multiple models converge on classifications
After 10 analyses with outcome tracking, we'll publish v1.2 with calibrated thresholds and whatever fixes emerge from real-world usage.
The Fugazi Filter is now available
We run companies with fraud signals or complex structures through this framework. Analyses will appear in our research section as they're completed.
Explore Equity Research →

Further Reading
Fugazi Filter — Full Reference
Complete lens documentation: signals produced, patterns detected, and how it fits with other lenses.
The Prospectus Probe
Our framework for newly public companies. Different problem, complementary approach.
Why We Use Multiple LLMs
The case for never trusting a single model's output on anything important.
A Taxonomy of LLM Hallucinations
Not all hallucinations are created equal. Some are dangerous, some are obvious.