
The Pattern Matching Revolution: Why 95% of AI Projects Fail (And How to Be in the 5%)

Published: August 25, 2025
Author: Daniel Shanklin, CEO & Founder, Rhea AI Inc
Read Time: 7 minutes


TL;DR

  • MIT research shows 95% of AI projects fail - but not for the reasons you think
  • LLMs are sophisticated pattern matchers, not human-like thinkers
  • A simple shift in perspective can increase AI classification accuracy from 71% to 95%
  • We're entering the "Trough of Disillusionment," but AGI timelines are accelerating
  • Focus on classification tasks where AI excels, not open-ended problem solving

The 95% Problem

The world seems somewhat split on whether AI is a savior or the devil.

MIT recently published research showing that 95% of AI projects fail. We're not surprised in the least. But the reason for these failures might not be what you think.

The Healthcare Breakthrough: A Real-World Case Study

The best example I can think of comes from an experience I had a little over two years ago, with some early LLMs.

While working in healthcare, I was tasked with helping a data science team improve a machine learning model. The challenge: Could we enhance a computer's ability to tag (classify) line-items in a General Ledger, and could we do it at scale for 65% of all US hospitals?

When I investigated the task further, I noticed three critical problems with their existing approach:

The Three Fatal Flaws

  1. Monolithic Model Architecture: The team had neglected to split the problem into sub-models, relying instead on one massive model to classify all expenses across all hospital departments. No domain expertise could be applied to different departments within a hospital, or to specific hospital types and contexts (a sketch of the alternative follows this list).

  2. Disconnected from Ground Truth: The team had not yet connected the GL line-item to the ground-truth data - the actual payables and receivables associated with those line items. They were trying to classify "Bob's Services" without knowing what Bob actually provided.

  3. Legacy Technology Stack: The team had not yet explored LLMs, instead relying only on traditional Machine Learning approaches that required extensive feature engineering and training cycles.
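To make the first flaw concrete, here's a minimal sketch of the alternative - hypothetical names and categories, not the team's actual code - in which a thin router dispatches each GL line item to a department-specific classifier, so domain expertise can live inside each sub-model:

```python
from typing import Callable

# Hypothetical sketch: instead of one monolithic model, a thin router
# dispatches each GL line item to a department-specific classifier.
# These stand-in functions represent separately trained/prompted sub-models.
def classify_pharmacy(item: dict) -> str:
    # A pharmacy sub-model could encode drug vendors, NDC codes, etc.
    return "Pharmaceutical Supplies"

def classify_facilities(item: dict) -> str:
    # In a hospital, "Environmental Services" means sanitation, not landscaping.
    return "Environmental Services"

def classify_general(item: dict) -> str:
    return "Needs Review"

ROUTES: dict[str, Callable[[dict], str]] = {
    "pharmacy": classify_pharmacy,
    "facilities": classify_facilities,
}

def classify(item: dict) -> str:
    handler = ROUTES.get(item.get("department", ""), classify_general)
    return handler(item)

print(classify({"department": "facilities", "memo": "Bob's Services"}))
```

Each sub-model can then carry its own domain context, something a single monolithic model has to smear across every department at once.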

The 48-Hour Revolution

Within 48 hours, and with just 400 lines of code, I built an LLM-based classifier that increased accuracy from 71% to 95%. Here's exactly how this dramatic improvement was possible:

1. Ground Truth Pattern Recognition: Once the actual payables and receivables data was fed into the AI, it could suddenly see patterns that traditional ML completely missed.

Real Example: If a GL line item said "Bob's Services" (meaningless for classification), but the line-items on the invoice payable said "Industrial Washing Equipment Rental" and "Linen Processing," the AI immediately understood the true nature of the expense: Laundry Services. Without that contextual data, "Bob's Services" was completely insufficient for accurate classification.

2. Conversational Classification: Through dynamic prompt tuning and multi-shot learning, the AI could progressively refine classifications through an iterative conversation rather than making a single, irreversible decision. It could ask itself: "This looks like facilities maintenance, but the invoice mentions medical equipment - let me reconsider this as Medical Equipment Services."

3. Domain-Specific Context: The AI could apply hospital-specific domain knowledge that traditional ML models struggled to encode. It understood that "Environmental Services" in a hospital context typically means housekeeping and sanitation, not landscaping.

These three approaches eliminated the need for a full year of traditional ML development - model training, feature engineering, hyperparameter tuning, and cross-validation cycles.
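The original 400 lines aren't reproduced here, but a minimal sketch of the approach's shape - using the OpenAI Python client purely as an illustrative stand-in, with a hypothetical model name and made-up few-shot examples - might look like this: ground-truth invoice lines go into the prompt alongside worked examples, and a second pass asks the model to reconsider its first answer.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; any LLM client would do

# Made-up few-shot examples showing how ground truth resolves
# vague GL memos into real expense categories.
FEW_SHOT = """\
GL memo: "Enviro Svcs Q3"
Invoice lines: Floor sanitation supplies; Housekeeping labor
Category: Environmental Services

GL memo: "Acme Rentals"
Invoice lines: Infusion pump lease; Ventilator maintenance
Category: Medical Equipment Services
"""

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice, not the original
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # classification wants determinism, not creativity
    )
    return resp.choices[0].message.content.strip()

def classify_line_item(gl_memo: str, invoice_lines: list[str]) -> str:
    # 1. Ground truth: the payable's own line items go into the prompt,
    #    so the model classifies what was actually bought, not a vague memo.
    prompt = (
        f"{FEW_SHOT}\nGL memo: \"{gl_memo}\"\n"
        f"Invoice lines: {'; '.join(invoice_lines)}\nCategory:"
    )
    first_pass = ask(prompt)
    # 2. Conversational refinement: ask the model to re-check its own answer
    #    against the invoice, mimicking the iterative reconsideration above.
    return ask(
        f"{prompt} {first_pass}\n"
        "Re-check this category against the invoice lines and reply "
        "with the final category only."
    )

print(classify_line_item(
    "Bob's Services",
    ["Industrial Washing Equipment Rental", "Linen Processing"],
))  # expected: Laundry Services
```

Note how little machinery is involved: no feature engineering, no training cycles - just the right context in the prompt.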

The Explainability Dilemma

The breakthrough created an unexpected organizational challenge: I couldn't "explain" exactly how the LLM arrived at its conclusions in the way that traditional ML models could show feature importance scores. Some data scientists within the team were reluctant to put it into production despite the 24-point accuracy improvement.

Their emphasis on explainability was completely understandable from a regulatory and audit perspective, but the performance gain was also undeniable. This tension between interpretability and performance became a defining theme in my production AI journey.

My real education in production-grade AI began in that moment, revealing the constant balancing act between interpretability, speed, and accuracy - forces that often pull in opposite directions.

The Fundamental Misunderstanding: Pattern Matching vs. Human Thinking

The transformer architecture used in LLMs is the most sophisticated classification engine ever built. An LLM's ability to mimic human speech patterns has given rise to many AI pilots built on overstretched promises about its capabilities.

Simply put, many think that AI thinks like a human, when it simply does not.

Instead, we believe LLMs are the world's fastest and most sophisticated pattern matchers. When tasked with pattern matching, LLMs can perform incredible feats - like my healthcare example above. But understanding this distinction is critical to deployment success.

Why Companies Fail: The Wrong Mental Model

With so many companies attempting to roll out AI, why are so many still failing?

We believe the lack of technical understanding leads to AI pilots that are doomed from the start. Most companies fundamentally misunderstand what an LLM actually does.

The Kickball Team Analogy: How LLMs Really Work

An LLM is, by its very construction, a classifier. It takes in inputs, and then chooses the most likely word to be said next, repeating this process to construct sentences and paragraphs.

It chooses the next word from its dictionary of possible words. In essence, it's like lining up in the school yard and picking the best person for your kickball team, then repeating the process until you've constructed your team.

  • If you say "The capital of France is", the model is most likely to respond: "Paris"
  • If we turn the AI temperature gauge up (a real thing!), we can increase an AI's ability to "think outside the box" and randomly choose a different word next. It might instead say "The capital of France is beautiful this time of year"
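For the curious, that temperature gauge is just a number that reshapes the next-word probabilities before one is drawn. Here's a toy sketch with made-up scores (real models pick from tens of thousands of tokens, not four words):

```python
import math
import random

# Made-up scores for what might follow "The capital of France is".
logits = {"Paris": 9.0, "beautiful": 5.5, "a": 4.0, "home": 3.0}

def sample_next_word(temperature: float) -> str:
    # Low temperature sharpens the distribution toward "Paris";
    # high temperature flattens it, so unlikely words get picked more often.
    weights = {w: math.exp(score / temperature) for w, score in logits.items()}
    r = random.uniform(0, sum(weights.values()))
    for word, weight in weights.items():
        r -= weight
        if r <= 0:
            break
    return word

print(sample_next_word(0.2))  # almost always "Paris"
print(sample_next_word(2.0))  # sometimes "beautiful", "a", or "home"
```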

The Fatal Assumption

No part of this design guarantees that AI could generate something new, novel, or even correct. And that's where most AI projects fall flat.

Asking an AI to be a stellar customer service agent would first require that the AI know what a customer service agent should say in each unique scenario. When trained on the whole of the internet, there are simply too many options it can pick from. The AI constructs the wrong kickball team by choosing the wrong words to say next.

Solutions That Actually Work

Some approaches, such as implementing RAG (Retrieval Augmented Generation - essentially giving AI access to a curated database), are forms of context engineering that hold real promise. It's like giving AI a specific playbook to follow, with proven examples to emulate.
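A rough sketch of the RAG idea, with a toy keyword-overlap retriever standing in for the embeddings and vector database a real system would use:

```python
# Toy RAG sketch: retrieve the most relevant playbook entries for a query,
# then constrain the model's answer to that curated context.
PLAYBOOK = [
    "Refunds: offer a full refund within 30 days of purchase with receipt.",
    "Shipping: standard delivery takes 5-7 business days.",
    "Returns: items must be unused and in original packaging.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    # Real systems rank by embedding similarity; keyword overlap stands in here.
    words = set(query.lower().split())
    return sorted(
        PLAYBOOK,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return (
        f"Answer using ONLY this playbook:\n{context}\n\n"
        f"Customer question: {query}\nAnswer:"
    )

print(build_prompt("How long does shipping take?"))
```

The retrieval mechanics matter less than the effect: the model now picks its next words from a curated playbook instead of the whole internet.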

AIC's Symmetra is our weekend warrior attempt to push these boundaries further, with advanced context engineering that provides any AI with a helpful shove in the right direction.

But simply asking an AI to "complete the task" without proper context and constraints is highly unlikely to ever succeed - and we believe that's what most companies are doing. Fail to plan, and you plan to fail!

The key takeaway from the last 2 years of AI isn't the failures or successes, but the progressive improvements happening behind the scenes while companies struggle with implementation.

Efficiency Gains That Defy Moore's Law

Google published research on August 21st reporting that the energy used per AI query has fallen 33x in just one year. That's a stunning ~97% reduction in power usage - an efficiency gain that far outpaces Moore's law (roughly a doubling every two years).
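The back-of-envelope arithmetic, for anyone checking:

```python
# 33x less energy per query means ~97% less power for the same work.
reduction = 1 - 1 / 33
print(f"{reduction:.1%} reduction")  # 97.0% reduction

# Moore's law pace: 2x every two years, i.e. ~1.41x per year - far slower.
moore_per_year = 2 ** 0.5
print(f"Moore's law: ~{moore_per_year:.2f}x/year vs 33x/year observed")
```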

According to Jevons Paradox - the observation that efficiency gains tend to increase total consumption - this could actually increase GPU usage, not reduce it, much like how internet infrastructure expanded even as modems became far more efficient.

Breakthrough Capabilities

GPT-5 recently authored a novel mathematical proof, suggesting it can now reason like a PhD student and develop complex, novel insights.

I continue to believe that most deployed agents in 2025 will fail due to poor memory management and non-existent cross-validation, but that doesn't diminish the incredible underlying progress.

The Gartner Hype Cycle: Where We Stand

When it comes to AI, we believe the world is most likely entering the Trough of Disillusionment, something we predicted would occur by 2026. The Gartner Hype Cycle tells us that major technological advancements almost always follow a pattern:

  1. Peak of Inflated Expectations - We were here in 2023
  2. Trough of Disillusionment - We are here now
  3. Slope of Enlightenment - Coming 2026-2027
  4. Plateau of Productivity - Coming 2028+

Our specific thematic choices for OurOtters.com - and our near-total avoidance of the word "AI" in its marketing - were a deliberate calculation on our part. During the Trough of Disillusionment, sell the opposite of futuristic AI: nostalgia!

AGI (Artificial General Intelligence) Predictions

AI engineers deeply embedded in these systems are not pushing their AGI estimates further out; they're pulling them closer. François Chollet, a noted AGI skeptic, recently shortened his AGI timeline from 10 years to just 5.

This, in spite of the general public's disillusionment with existing AI pilots and trials.

All of this points, at least for me, to a long-term trend in AI advancement that will continue its hyperbolic rise.

My own position as an AI Engineer certainly biases my timeline, and I don't feel confident making a strong exact-year prediction. But I am confident of a few things: Transformers (2017) will not be the last great innovation in AI, AGI will arrive in our lifetime, and 5 years doesn't sound far-fetched. It's on pace with the progress we're already seeing.

The Winning Strategy

By staying 90% focused on the core tasks that AI can solve confidently today (classification, data retrieval, and programming), while reserving 10% for stretch goals and pilot trials (true reasoning and context engineering), I believe companies will find themselves much better positioned every year going forward.

As for investment in AI, the current bubble will most likely pop. Not because AI isn't progressing, but because the market has not yet heeded its true calling: pattern matching.

What should be left to win in the long run:

  • Hardware companies that focused on core infrastructure
  • Existing software businesses that did the early work to boost efficiency in classification tasks and engineering


Key Takeaways

  1. Understand What AI Actually Does: LLMs are pattern matchers, not human thinkers
  2. Focus on Classification Tasks: Where AI naturally excels
  3. Provide Rich Context: Ground truth data dramatically improves results
  4. Expect the Trough: We're entering disillusionment, but that's normal for major tech advances
  5. Prepare for AGI: Timeline accelerating to ~5 years according to experts

This post is part of our AI Pulse series, bringing you practical insights from the frontier of artificial intelligence development. For more analysis on AI trends and implementation strategies, explore our AI Pulse archive.