AI Magic vs. The Disappointment: Why AI Feels Both Revolutionary and Broken
I’ve been watching discussions around AI systems since last winter, and I’ve noticed a divide in how people perceive their capabilities. The same technology that leaves some users genuinely amazed leaves others profoundly frustrated. What’s even more interesting is that the frustrated ones are often the technically sophisticated; the amazed, the technically unsophisticated.
- Token prediction can save lots of work – and that’s magical!
- The inability to build reliable workflows feels like failure to the technically sophisticated
Understanding this divide helps explain why AI discussions often feel like people are talking past each other, and what it means for the future of these technologies.
Caveat: For the purposes of this piece, I will not consider whether the answers provided by AI are worth their environmental cost, or whether the data used to train the models was acquired ethically. These are worthy questions indeed, and where laws were broken, these corporations should be induced to settle or be fined as the law stipulates.
The Magic is Real
For many users, modern AI assistants feel magical. A powerful example is the use of intelligence in coding editors. Here’s a quick historical trace:
Pre-1989: Raw Text Editing
Code was written with reference books and case studies at hand. Text editors knew nothing about programming languages: System.pri could only be completed by manually typing nt or ntln to get System.print or System.println.
Late 80s-2000s: Smart Completions
Reference materials became integrated into “IDEs.” Editors understood function
signatures and could complete based on data types being passed. Documentation
appeared as pop-up balloons.
2000s-2015: Task-Aware Editing
Editors like RubyMine or PyCharm understood developer intent. You could request
“Move this function from class A to class B” and the editor would handle the
move plus update all related connections and references.
2025: AI Agentic Workflows
Describe a capability in natural language, and AI generates multiple code
artifacts to fulfill the entire user experience.
For the non-developer audience, we’re somewhere between “smart completions” and “task-aware editing” for natural language tasks. It’s dismissive, reductionist, and unempathetic not to see that there’s value here for the lay person.
- Transform handwritten notes from photos into editable text
- Extract only the highlighted passages from lengthy documents
- Calculate shopping lists from multiple recipes automatically
- Get statistical concepts explained through voice commands
- Translate complex phrases across languages and writing systems: “Write ‘My name is Alexandros’ in Latin, using the Greek form” becomes “Nomen mihi Alexandroi”
I have personally done all of these. Sure, I could have coded them, but I didn’t have to, and as for that last one, I would have needed multiple Google searches to check my idiomatic usage.
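The shopping-list example, for instance, is the kind of small script I could have written myself. Here is a minimal sketch in Python; the recipe data is invented for illustration:

```python
from collections import Counter

def combine_shopping_lists(recipes):
    """Merge per-recipe ingredient counts into one shopping list."""
    shopping = Counter()
    for ingredients in recipes.values():
        shopping.update(ingredients)  # sums quantities for shared ingredients
    return dict(shopping)

# Hypothetical recipes, each mapping ingredient -> quantity needed.
recipes = {
    "pancakes": {"eggs": 2, "flour_cups": 1},
    "omelette": {"eggs": 3, "butter_tbsp": 1},
}

print(combine_shopping_lists(recipes))
# eggs are summed across recipes: {'eggs': 5, 'flour_cups': 1, 'butter_tbsp': 1}
```

Trivial, yes - but with an AI assistant I got the answer without writing even this much.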
Combinatorics, optical character recognition, dead-language declension: this is magic. I’d argue that it’s only because developers have been exposed to task-aware editing that they turn up their fox noses, sniff, and utter a condescending “meh.” That said, their disappointment is also real.
The Disappointment is Also Real
But there’s another side to the AI experience that’s equally valid: profound disappointment when these systems fail at seemingly basic tasks. As developers, we’re accustomed to taking real or imagined tasks, breaking them down into discrete actions, and then encoding those actions electronically. Having created such an action plan, anything “intelligent” ought to be able to execute it and help us out.
Yet it’s precisely at this point that many AIs start to stumble.
- They lose track of context
- They time out
- They process requests slowly, and so on
At this point, many developers diss and dismiss the technology:
- “See, it’s not really intelligent”
- “Ugh! I have to give it the last output artifact because it forgot what we were doing! How is this a help!”
- “Ugh! It made some optimization and thus is not doing some basic task at the level of a bright junior high student! That’s not intelligence!”
And, to be clear, I’ve definitely been white-hot with rage when ChatGPT has spectacularly failed in the middle of a task. But that’s the thing: AIs aren’t great at tasks.1
The crux of the matter is this: developers and power users often approach AI with workflow expectations, when it excels at “autocomplete on steroids.” We developers want to delegate multi-step processes, automate repetitive tasks, and build reliable pipelines that can run unsupervised. Here, AI’s limitations become glaringly apparent.
The fundamental issue isn’t knowledge - current AI systems often know more about specific domains than most human experts. Per my examples above, AI could write Python well enough to extract hot-pink highlights from a photo of a marked-up book page. For me, that’s a multi-hour, multi-search research project; an AI laddered me up to a working application of that code near-instantly.
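To make that concrete, here is a minimal sketch of the kind of highlight-detection logic involved - not the actual code the AI wrote. It works on a plain grid of RGB triples (real code would load pixels from the photo with an imaging library), and the hot-pink target color and tolerance are assumptions:

```python
def near_hot_pink(rgb, tol=60):
    """True if an (r, g, b) triple is within tol of hot pink (255, 105, 180)."""
    target = (255, 105, 180)
    return all(abs(c - t) <= tol for c, t in zip(rgb, target))

def highlighted_pixels(pixel_rows):
    """Return (x, y) coordinates of highlighted pixels in a grid of RGB triples."""
    return [
        (x, y)
        for y, row in enumerate(pixel_rows)
        for x, rgb in enumerate(row)
        if near_hot_pink(rgb)
    ]

# Tiny hypothetical 2x2 "page": one hot-pink pixel at (1, 1).
page = [
    [(255, 255, 255), (255, 255, 255)],
    [(0, 0, 0), (255, 105, 180)],
]
print(highlighted_pixels(page))  # [(1, 1)]
```

The full task also involves OCR over the masked region and tuning the tolerance for lighting - exactly the research spiral the AI collapsed into minutes.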
No, what’s causing AIs to under-deliver, from the offended developer’s perspective, is execution persistence. AI can brilliantly solve individual steps but struggles to maintain context and state across multi-step processes.
Consider a simple workflow: “Process these images, extract highlighted text, and format the results into a structured document.” Any competent intern could handle this task reliably. But AI systems often get halfway through, lose context, and either abandon the task or start over with different assumptions.
Finding the Efficacious Solution Space
My own experience illustrates this divide perfectly. I use AI regularly for definite value-add activities:
- Proofreading blog posts with Claude
- Learning Common Lisp and the key bindings of “Swank,” the Lisp coding environment
- Debugging FreeBSD hardware incompatibilities
- Exploring “boot environments” as secure ways to upgrade my FreeBSD installs
- Debugging NGINX configuration files
Having survived the forum era of IDE tooling (see above), this is some real time savings! The magic is real.
But let’s ask a developer-oriented task of a coding AI. I had checked out the (seemingly abandoned) NsCDE project. It has a lot of dependencies, and I was trying to get a minimal case working, so I asked the AI to take a fresh copy of the source code and keep only the minimum number of files needed to run the application. GitHub Copilot went to work and deleted all the files needed to build the application. I could no longer build the thing in order to run it.
On top of that, I tried to roll back the change to my last-good state via git. But it had deleted the database of git revisions as well. This is the one that really…really…really hurt. I lost all my work.
Disappointment? No. White-hot servers being offloaded into the ocean amid a hiss, sizzle, and electronic pop of rage.
Critics would be arguing more fairly were they suggesting that we bound AI’s efficacious solution space. Some tasks - explanatory, educational, iterative refinement, creative brainstorming - work brilliantly. Others - autonomous operations, multi-step workflows, tasks requiring perfect reliability - remain dangerous territory where AI’s limitations can cause catastrophic failures.
The magic and the disappointment aren’t contradictory - they’re describing different regions and expectations of this solution space.
What This Means Going Forward
Understanding the solution space has practical implications:
For Individuals
Recognize AI’s current strengths and limitations. Use it as a powerful assistant for tasks in its efficacious solution space, but maintain healthy skepticism about workflow automation until the technology matures.
For Organizations
Don’t dismiss AI because it fails at workflow tasks, but don’t bet critical processes on its current limitations either. The value is real, but it’s in assistance and augmentation rather than automation.
For Developers
The breakthrough won’t necessarily require artificial general intelligence. We need better solutions for maintaining context and state across multi-step processes - essentially, better workflow orchestration that leverages AI’s strengths while compensating for its weaknesses.
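One shape such orchestration can take is ordinary code that owns the state: the program tracks which steps are done and checkpoints after each one, calling the model only inside individual steps. A minimal sketch, where the step functions and checkpoint path are hypothetical stand-ins:

```python
import json
from pathlib import Path

def run_pipeline(steps, state, checkpoint):
    """Run named steps in order, persisting state after each one so a crash
    (or a model that loses context) resumes instead of starting over."""
    if checkpoint.exists():
        state = json.loads(checkpoint.read_text())  # resume prior progress
    done = set(state.get("done", []))
    for name, step in steps:
        if name in done:
            continue  # already completed on a previous run
        state = step(state)  # an AI call would live inside the step function
        state.setdefault("done", []).append(name)
        checkpoint.write_text(json.dumps(state))  # durable record of progress
    return state

# Hypothetical steps standing in for "extract highlights" and "format document".
steps = [
    ("extract", lambda s: {**s, "text": "highlighted passages"}),
    ("format", lambda s: {**s, "doc": s["text"].upper()}),
]
```

If the process dies mid-run, re-invoking run_pipeline with the same checkpoint file picks up at the first unfinished step, rather than starting over with different assumptions - the orchestrator, not the model, is responsible for persistence.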
The Promise Remains
We’re in an interesting moment where AI has solved many of the hard problems (knowledge synthesis, natural language understanding, domain expertise) but still struggles with seemingly simple ones (maintaining context, reliable execution, basic workflow persistence).
Both the magic and the disappointment are real. Understanding this divide helps us use current AI more effectively while working toward systems that can bridge the gap between brilliant individual capabilities and reliable end-to-end execution.
The revolution isn’t stalled - it’s just more nuanced than either the pure enthusiasts or pure skeptics suggest.