Any AI can beat the market from 2017-2021. You just have to build it in 2021.
This isn't a bug in quantitative finance — it's a feature of how backtesting works. And it's being used to sell you something.
See how we do it differently
We don't claim to beat the market. We show you the analysis — including where our models disagree.
The Anatomy of a Backtest
Backtesting sounds reasonable: take your strategy, run it against historical data, see how it would have performed. The problem is that when you build the model, you already know what happened.
The data scientist sees 2020. They know which stocks crashed in March and which recovered by December. They can (consciously or not) tune their model until it "discovers" patterns that happened to work during that specific period. This is called overfitting — the model learns the noise of the past, not the signal of the future.
Live performance — where the model makes predictions about data it hasn't seen — is the only honest test. It's also almost always worse than the backtest.
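The mechanics are easy to reproduce. The toy sketch below uses pure noise and invented numbers (no real market data): it "tunes" a strategy by keeping whichever of 200 random trading rules scored best on the first half of a return series, then checks that rule on the second half.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 1,000 days of pure-noise daily returns (no real signal).
returns = rng.normal(0, 0.01, 1000)
in_sample, out_sample = returns[:500], returns[500:]

# "Tune" a strategy: try 200 random long/short rules and keep the one with
# the best in-sample return. This is what overfitting looks like.
best_rule, best_perf = None, -np.inf
for _ in range(200):
    rule = rng.choice([-1, 1], size=500)   # arbitrary daily positions
    perf = (rule * in_sample).sum()
    if perf > best_perf:
        best_rule, best_perf = rule, perf

# The "discovered" strategy looks great on the data it was tuned on...
print(f"in-sample return:   {best_perf:+.2%}")
# ...but its positions carry no information about unseen data.
print(f"out-of-sample:      {(best_rule * out_sample).sum():+.2%}")
```

The in-sample number is always flattering, because the rule was selected for exactly that. The out-of-sample number is just noise, because there was never any signal to find.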
Case Study: Danelfin
Danelfin is a real company, not a scam. Founded by Tomás Diago (who previously founded Softonic), backed by Nauta Capital, with a legitimate product. We want to be clear about that upfront.
Their marketing prominently features claims like:
"Since 2017, our AI Score 10/10 stocks beat the market by +20%."
Here's the thing: Danelfin launched in 2021.
That means four years of that performance (2017-2021) is backtesting — the model "predicting" outcomes that already happened. Only the period from 2021 onward represents actual live predictions. The marketing chart seamlessly blends these two periods with no visual distinction.
When someone shows you returns "since 2017" for a platform that launched in 2021, this is what's actually happening. The first four years are fantasy.
This isn't illegal. It's not even technically dishonest — they're not claiming it was live. But it's designed to create an impression of years of proven performance when the actual track record is much shorter.
What we do differently
We don't backtest. We don't claim to beat the market. Our LLMs read documents and surface patterns in language — they don't predict prices. There's no historical performance to show because we're not making predictions you can score.
The Supporting Cast: Other Red Flags
Backtest inflation rarely travels alone. Here are the patterns that often accompany it:
Survivorship Bias
Signals that went wrong quietly exit the track record. If a "buy" signal turns into a 50% loss, it gets rebalanced out or reclassified. The winners stay; the losers disappear. You only see what survived.
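A toy simulation (all numbers invented) makes the effect concrete: start with signals that have no edge at all, quietly drop the worst losers, and the "track record" improves on its own.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical track record: 100 signals whose true average return is 0%.
signal_returns = rng.normal(0.0, 0.20, 100)

# Quietly drop anything that lost more than 25% ("rebalanced out").
survivors = signal_returns[signal_returns > -0.25]

print(f"true average return:    {signal_returns.mean():+.1%}")
print(f"survivors-only average: {survivors.mean():+.1%}")
```

Nothing about the strategy changed; the average went up purely because the losers stopped being counted.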
Cherry-Picked Windows
"60-day alpha" gets highlighted because 90-day looks worse. The marketing team found the time window where the numbers look best and featured that one. If they had a better window, they'd show you that instead.
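An impressive-looking window can be extracted from pure noise. The sketch below (a hypothetical excess-return series, no real data) scans every lookback from 20 to 250 days and reports whichever one looks best, which is exactly the search a marketing team can run.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical strategy with zero real edge: 500 days of noise excess returns.
excess = rng.normal(0.0, 0.01, 500)

# Scan every window length from 20 to 250 days (ending today) and keep
# whichever one makes cumulative "alpha" look best.
best_window, best_alpha = None, -np.inf
for w in range(20, 251):
    alpha = excess[-w:].sum()
    if alpha > best_alpha:
        best_window, best_alpha = w, alpha

print(f"marketing chart: '{best_window}-day alpha of {best_alpha:+.1%}'")
print(f"full-period alpha: {excess.sum():+.1%}")
```

With 231 candidate windows to choose from, some window almost always looks good, even when the full period does not.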
Feature Bloat
"We analyze 10,000 features!" sounds impressive. In practice, more variables often mean more overfitting: every extra input is another chance for the model to memorize noise. A simple model with 5 key variables frequently outperforms one drowning in 10,000 noise signals. The big number is marketing, not methodology.
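A small sketch of why, on entirely invented data: fit ordinary least squares twice, once with the 5 features that actually drive the target, and once with those 5 plus 150 pure-noise features, then compare error on held-out data.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical dataset: 200 samples where the target depends on 5 real
# features; the other 150 features are pure noise.
n = 200
X_real = rng.normal(size=(n, 5))
X_noise = rng.normal(size=(n, 150))
y = X_real @ np.array([1.0, -0.5, 0.8, 0.3, -0.2]) + rng.normal(0, 0.5, n)

train, test = slice(0, 100), slice(100, None)

def oos_mse(X):
    """Fit least squares on the first half, score on the second half."""
    coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
    return float(np.mean((X[test] @ coef - y[test]) ** 2))

mse_small = oos_mse(X_real)                      # 5 meaningful features
mse_big = oos_mse(np.hstack([X_real, X_noise]))  # 155 features, mostly noise

print(f"out-of-sample MSE, 5 features:   {mse_small:.3f}")
print(f"out-of-sample MSE, 155 features: {mse_big:.3f}")
```

The bloated model fits its training data better and its test data worse; the extra 150 inputs only gave it more ways to chase noise.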
Confidence Theater
"87.3% accuracy!" The precise decimal creates an impression of scientific rigor. But accuracy at what? Over what period? Predicting direction or magnitude? The precision distracts from the vagueness of what's actually being measured.
No Drawdown Disclosure
Returns without risk metrics are meaningless. If a strategy beat the market by 20% but had a 60% drawdown along the way, would you have held through it? Returns tell you the destination; drawdowns tell you the journey.
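Maximum drawdown is simple to compute: track the running peak of the equity curve and take the largest percentage drop from that peak. A minimal sketch, using a made-up equity curve matching the 20%-gain, 60%-drawdown scenario above:

```python
import numpy as np

def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a fraction."""
    peaks = np.maximum.accumulate(equity)
    return float(((peaks - equity) / peaks).max())

# Hypothetical strategy that "beat the market": +20% overall,
# but with a crash from 150 down to 60 along the way.
equity = np.array([100, 120, 150, 60, 75, 100, 120], dtype=float)
print(f"total return: {equity[-1] / equity[0] - 1:+.0%}")   # +20%
print(f"max drawdown: {max_drawdown(equity):.0%}")          # 60%
```

Two strategies can share the same headline return while one of them would have wiped out anyone who sold at the bottom; only the drawdown number tells you that.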
What Danelfin Actually Does Well
We're not here to bury them. Danelfin has legitimate strengths:
- Explainable AI: They use decision trees that show their reasoning — "RSI is low" + "revenue growth is high." This is genuinely better than black-box neural nets that just say "trust me."
- Real product, real funding: Backed by Nauta Capital, founded by an experienced entrepreneur. Not a fly-by-night operation.
- Works as a screener: User consensus (Reddit, reviews) suggests it's useful for surfacing ideas you wouldn't have found otherwise.
What we do differently
We show you when our models disagree. Danelfin shows you confidence scores. One of these helps you understand uncertainty; the other hides it.
How to Evaluate Any AI Platform
Use this checklist when evaluating any platform that claims AI-powered investment insights:
| Question to Ask | Red Flag | What We Do |
|---|---|---|
| When did the platform launch? | Claims before launch date are backtest | We don't claim historical returns |
| Can you see live-only performance? | If they won't separate it, there's a reason | We don't make performance claims |
| Do they show drawdowns? | Returns without worst periods are cherry-picked | We show model disagreements |
| How specific are the claims? | "Beat the market" is vague | We classify, not predict |
| What are users doing with it? | "Trade signals" = dangerous | Research acceleration |
Why This Industry Works This Way
Platforms like Danelfin exist because retail investors want "the answer." They want someone — or something — to tell them what to buy. That's a human need, and it creates a market.
Backtested returns feel like evidence because we're trained to trust charts. A line going up and to the right triggers the same pattern-recognition circuits regardless of whether it represents real predictions or retrofitted fantasy.
The platforms aren't lying — they're just showing you what sells. The interesting question is: what would it look like if they were honest?
What we do differently
We built our platform to show uncertainty, not hide it. When our models disagree, we show you the disagreement. When we don't know something, we say so. This is less marketable than "87.3% accuracy since 2017" — but it's more useful.
The Takeaway
When someone shows you a chart of AI performance since 2017, ask one question: When did the model actually start running?
If the answer is 2021, you're looking at four years of fantasy and three years of reality. The fantasy part always looks better.
We don't know if Danelfin's live performance is good or bad — we haven't tracked it long enough. What we do know is that their marketing makes it very hard to tell. And that's the point.
See how we approach analysis
Explore our equity analyses to see how we show uncertainty, model disagreements, and cross-lens conflicts.
Further Reading
How Frameworks Can Hide Hallucinations
Structure creates false confidence. Tables make numbers look verified when they're not.
A Taxonomy of LLM Hallucinations
Not all errors are created equal. Some are dangerous, some obvious, some almost impossible to catch.
Why We Use Multiple LLMs
Why we show disagreements instead of hiding them behind confidence scores.