Why Text-to-SQL Is Still Failing in the Enterprise (Despite Smarter Models)

Over the past year, Text-to-SQL has gone from a research curiosity to a genuine industry conversation. With the emergence of GPT-5, Claude Sonnet 4.5, Gemini 2, and other advanced models, the ability to translate natural language into SQL queries has become remarkably accurate, at least in demos and benchmarks.

So why are enterprises still struggling to adopt Text-to-SQL successfully?

Despite all the progress, most real-world deployments stumble. The issue isn't model capability anymore. It's ecosystem readiness. There are three main bottlenecks: semantic mapping, data model design, and context engineering.

1. Missing Semantic Layer: The Business-to-Data Bridge

Language models don't fail because they can't generate SQL. They fail because they don't understand what the business means.

Every organization speaks its own dialect of business language: customer, active account, churned user, net retention. Unless those terms are explicitly mapped to the underlying data model, even the most powerful LLM will misinterpret them.

A true semantic layer should:

  - Define every core business term (customer, active account, churned user, net retention) once, in a governed place.
  - Map each term to the exact tables, columns, filters, and time windows that implement it.
  - Expose those definitions to the model at query time, so "active" always means the same thing.

Without this layer, "show me last quarter's active customers" can generate dozens of valid but wrong queries.
This isn't the model's fault. It's a missing layer of shared understanding.
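
A minimal sketch of what such a layer can look like in practice: business terms mapped to governed SQL definitions that the model receives instead of guessing. All table names, column names, and time windows below are hypothetical.

```python
# Hypothetical semantic layer: each business term maps to one governed
# definition plus the SQL that implements it. A Text-to-SQL system would
# look terms up here rather than letting the model improvise.
SEMANTIC_LAYER = {
    "active customer": {
        "definition": "a customer with at least one order in the last 90 days",
        "sql": (
            "SELECT DISTINCT customer_id FROM orders "
            "WHERE order_date >= CURRENT_DATE - INTERVAL '90 days'"
        ),
    },
    "churned user": {
        "definition": "a user with no activity in the last 180 days",
        "sql": (
            "SELECT user_id FROM users "
            "WHERE last_active_at < CURRENT_DATE - INTERVAL '180 days'"
        ),
    },
}

def resolve_term(term: str) -> str:
    """Return the governed SQL for a business term, or fail loudly."""
    entry = SEMANTIC_LAYER.get(term.lower())
    if entry is None:
        raise KeyError(f"No semantic definition for {term!r}")
    return entry["sql"]
```

The point is not the dictionary itself but the failure mode: an unknown term raises an error instead of letting the model pick one of the "dozens of valid but wrong" interpretations.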

2. Data Models Built for Machines, Not for Meaning

For decades, data models have been optimized for system performance or BI dashboards, not for language understanding.
Star schemas, OLAP cubes, and dimensional models are great for aggregations, but they aren't intuitive to an LLM trying to reason about relationships between concepts.

To unlock the potential of Text-to-SQL, data models must evolve:

  - Use self-describing table and column names instead of cryptic abbreviations.
  - Make relationships between entities explicit, rather than burying them in join logic.
  - Attach documentation (descriptions, units, allowed values) directly to the schema.

In short: we optimized for queries, not for comprehension. That has to change.
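
To make the contrast concrete, here is a hypothetical example of the same fact table exposed two ways: once with warehouse-style abbreviations, once as a documented, self-describing view. The names and comments are illustrative, not taken from any real schema.

```python
# A name like f_ord_dly(cid, dt_key, amt, st_cd) is opaque to an LLM.
WAREHOUSE_TABLE = "f_ord_dly(cid, dt_key, amt, st_cd)"

# The same data re-exposed as a view an LLM can actually reason about:
# readable names, inline documentation, enumerated status values.
LLM_FRIENDLY_VIEW = """
CREATE VIEW daily_orders AS
SELECT
    cid    AS customer_id,   -- who placed the order
    dt_key AS order_date,    -- calendar date of the order
    amt    AS order_amount,  -- gross amount in USD
    st_cd  AS order_status   -- 'placed' | 'shipped' | 'cancelled'
FROM f_ord_dly;
"""
```

Nothing about the physical storage changes; only the surface the model sees does.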

3. Context Engineering: Managing What the Model Can See

Even the best models are bound by context limits.
When a user query requires scanning dozens of tables or gigabytes of metadata, the LLM simply can't fit all that context into its reasoning window. The result? Hallucinations, not because the model is dumb, but because it's blind to most of the data structure.

This is where context engineering becomes a discipline.
It's the implementer's job to:

  - Identify which tables, columns, and metadata are actually relevant to a given question.
  - Summarize or compress schema information so it fits in the reasoning window.
  - Feed the model only that curated slice, never the whole catalog.

Without careful management, even GPT-5 will hallucinate when the data landscape is too large to fit in context.
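
One simple form of that curation can be sketched as follows: rank candidate tables by keyword overlap with the question and keep only the top few. The schemas are hypothetical, and production systems typically use embedding-based retrieval rather than raw token overlap, but the shape of the pipeline is the same.

```python
import re

def _tokens(text: str) -> set:
    """Lowercase word tokens, ignoring punctuation and underscores."""
    return set(re.findall(r"[a-z]+", text.lower()))

def select_tables(question: str, schemas: dict, top_k: int = 3) -> list:
    """Keep only the tables whose schemas overlap the question's keywords."""
    q = _tokens(question)
    scored = sorted(
        ((len(q & _tokens(ddl)), name) for name, ddl in schemas.items()),
        reverse=True,
    )
    return [name for score, name in scored[:top_k] if score > 0]

# Illustrative catalog: only two of these tables matter for the question.
schemas = {
    "orders": "orders(order_id, customer_id, order_date, amount)",
    "customers": "customers(customer_id, name, signup_date)",
    "web_logs": "web_logs(session_id, url, timestamp)",
}
relevant = select_tables("total order amount per customer last month", schemas)
# → ['orders', 'customers']
```

The irrelevant `web_logs` table never reaches the prompt, which is exactly the kind of pruning that keeps a large data landscape inside the model's window.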

The Reality Check

So, Text-to-SQL isn't failing because models are bad. It's failing because enterprises aren't yet structured for it.
They lack:

  - A semantic layer that maps business language to data.
  - Data models designed for comprehension, not just query performance.
  - A context pipeline that controls what the model sees.

Once those foundations are built, the technology is ready.
Benchmarks already show 85–90% accuracy on complex queries, but accuracy on paper means little without context, governance, and alignment with business language.

The Way Forward

To make Text-to-SQL truly enterprise-ready:

  1. Invest in semantic infrastructure. Treat it as the brain that connects business and data.
  2. Redesign data models for LLM consumption. Clarity beats complexity.
  3. Build a context pipeline. Let the system feed the model only what's relevant.
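
The three steps above can be tied together in a single sketch: resolve business terms through a glossary, include only relevant schema, and assemble a compact prompt. Every name and definition here is illustrative.

```python
# Hypothetical glossary and catalog; in practice these come from the
# semantic layer (step 1) and the redesigned data model (step 2).
GLOSSARY = {"active customer": "a customer with an order in the last 90 days"}
SCHEMAS = {
    "orders": "orders(order_id, customer_id, order_date, amount)",
    "customers": "customers(customer_id, name, signup_date)",
}

def build_prompt(question: str) -> str:
    """Assemble a compact Text-to-SQL prompt from curated context (step 3)."""
    glossary_lines = [
        f"- {term}: {definition}"
        for term, definition in GLOSSARY.items()
        if term in question.lower()  # only terms the question actually uses
    ]
    schema_lines = list(SCHEMAS.values())  # a real pipeline would prune here
    return "\n".join(
        ["Business terms:", *glossary_lines,
         "Tables:", *schema_lines,
         f"Question: {question}",
         "Write a SQL query."]
    )

prompt = build_prompt("How many active customers did we have last quarter?")
```

The model never sees the full catalog or undefined jargon; it sees one screen of curated context and the question.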

The models are already intelligent enough.
It's the environment around them that needs to evolve.


In short:
Text-to-SQL isn't a model problem anymore. It's an enterprise architecture problem.
Once we design for understanding, not just execution, natural-language analytics will finally deliver on its promise.