
Can AI Play Chess?

A simple question that cuts through the AI hype.

By Evanuel Rodrigues · 2 min read


Short answer: No. Longer answer: Not like you think.

The Evidence

July 2025: Magnus Carlsen plays ChatGPT online and wins in 53 moves without losing a single piece. ChatGPT even pegs his strength at around 1800–2000, well below his real rating.

August 2025: Early matches at Kaggle Game Arena's AI chess exhibition show LLMs advancing or losing on illegal moves; several quarterfinals ended in 4–0 sweeps, and a model forfeits after four illegal attempts.

Illegal-move rates: In controlled tests, GPT-4o made illegal moves roughly 12.7% of the time. In one head-to-head test against ChatGPT, only about 50% of Gemini's picks were legal.

Yes, scaffolding (e.g., feeding the model the current list of legal moves) can rein in errors, but that's extra engineering, not magic.
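That scaffolding can be sketched in a few lines: validate each proposed move against the current legal-move list, and forfeit after four illegal attempts, mirroring the Kaggle exhibition's rule. The model call here is a hypothetical stub, not a real LLM API:

```python
import random

def scaffolded_move(model, board_state, legal_moves, max_attempts=4):
    """Ask a model for a move, rejecting anything not in legal_moves.

    After max_attempts illegal tries the game is forfeited (mirroring the
    four-illegal-attempt forfeit rule in the Kaggle exhibition).
    Returns a legal move string, or None on forfeit.
    """
    for _ in range(max_attempts):
        move = model(board_state, legal_moves)
        if move in legal_moves:
            return move
    return None  # forfeit

# Stand-in for an LLM call (hypothetical): picks from the legal list
# most of the time, but sometimes invents an impossible move.
def flaky_model(board_state, legal_moves):
    return random.choice(legal_moves + ["Qh9"])  # "h9" isn't a real square

legal_openings = ["e4", "d4", "Nf3", "c4"]
move = scaffolded_move(flaky_model, "startpos", legal_openings)
print(move)  # a legal opening, or None if the model forfeited
```

Note the point of the sketch: the legality check lives in ordinary code, outside the model. The model never becomes a chess engine; the harness just filters its guesses.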

The Revelation

Chess: 64 squares, fixed rules. Your business: infinite variables, shifting constraints, human behavior.

If LLMs struggle to track state and rules in chess, expect worse on messy, evolving business logic.

The Business Truth

AI excels at pattern matching: code completion, syntax checks, boilerplate. AI fails at judgment: context, strategy, trade-offs, "when to bend or break rules."

One accelerates developers. The other replaces them. We're nowhere near the latter.

The Market Reality

"AI-powered development" usually means skilled devs using tools like Copilot or advanced LLMs with guardrails. The winners hire experienced engineers who wield AI well. The losers pretend AI can replace human judgment.

Your Choice

A) Believe the hype. Hire cheap. Watch "perfect" code compile and fail in production.

B) Hire experienced developers who use AI as a force multiplier.

Chess makes it obvious. The evidence is public. Choose wisely.


Sources

• Carlsen vs ChatGPT (53 moves, underrating) — Time, Times of India, VnExpress

• Kaggle Game Arena AI Chess (event format & early results) — Chess.com, Times of India

• Illegal-move rates — The Register (GPT-4o ~12.7%); Chess.com (Gemini ~50% legal picks in test)

• Prompting to reduce illegal moves — Medium (Gemini 2.5 Pro)


Build with developers who understand both AI and its limits.

Tags: #AI #chess #reality #business #SoftwareEngineering #RealityCheck

Evanuel Rodrigues

Founder & Technical Director

Engineering background from Darlington. Building MVPs for founders who lack technical co-founders. Committed to delivering exceptional digital products with precision and speed.

Ready to Build Your MVP?

8 weeks. £10,000. Full code ownership.

Book Free Discovery Call