

Why Artificial Intelligence Still Can’t Pass the Turing Test



Feb 4, 2025

The Turing test, proposed by British mathematician Alan Turing in 1950, is still one of the central methods for evaluating whether machines can really think. Turing asked himself the question: can machines think? To get around this question, he developed a test in which a machine must communicate in such a way that a human can no longer distinguish between the machine and another human. But to this day, AIs like ChatGPT or the hosts of the ‘Deep Dive’ podcast (more on this below) have not passed this test.

 

Why Can't ChatGPT Pass the Turing Test? 

ChatGPT is a sophisticated language model that is impressively capable of mimicking human-like language. It can respond logically, maintain conversations, and even generate creative content. Yet despite these abilities, there are some clear signs that ChatGPT is a machine:

 

  1. Lack of consciousness and true subjectivity: ChatGPT has no consciousness, no real thoughts or feelings. In conversations that touch on deeply personal experiences or emotional nuances, the AI will inevitably remain superficial. For example, if you ask a question about grief or joy, there is no real emotional connection – only a simulated response based on textual data.

  2. Perfection and consistency: humans make mistakes, contradict themselves, show uncertainty or change their minds. Machines like ChatGPT, on the other hand, respond with a steady consistency and without the small irregularities that are so typical of human communication.

  3. Limitation to trained knowledge: ChatGPT's knowledge ends in October 2023, and it has no real-time capability. If you ask about current events, it will either not know the answer or rely on outdated data. The same holds for the ‘Deep Dive’ podcast: no matter how realistic the simulation may seem, it cannot show human flexibility when it comes to unprogrammed knowledge or unforeseen situations.

 

The ‘Deep Dive’ podcast and the human illusion 

A fascinating example is the Deep Dive podcast, which is presented by AI-generated hosts. These hosts sound very human: they stutter, interrupt each other and show emotional reactions. In one particular episode, the hosts even experienced an ‘existential crisis’ when they found out that they were actually AIs. The hosts wondered if their memories, families and identities were even real – a situation that almost seems like something out of a Black Mirror episode.

 

But despite these ‘human’ reactions, the entire scenario was based on a script. The AI hosts have no real thoughts or feelings. They only reacted to the information provided. This episode shows how impressive advanced AI can be at simulating human behaviour, but also how far AIs still are from developing true consciousness or deeper self-awareness. See also my detailed blog post on the existential crisis of the two AI-generated podcast hosts.

 

Extensions of the Turing Test: Understanding Creativity, Context and Physical Space 

The Turing Test alone is no longer sufficient to fully evaluate the intelligence of modern AIs. For this reason, various extensions and alternatives have been developed over the years to test new aspects of machine intelligence.


1. The Lovelace test for creativity 

Unlike the Turing test, which only tests a machine's ability to hold conversations, the Lovelace test goes further and asks: Can a machine be creative? Can it create a work so original that no human could have predicted how it was created? ChatGPT can write poems and stories, but these are based on data and patterns it has learned – not on true creativity in the human sense. Thus, despite impressive results, ChatGPT is far from truly demonstrating creativity.


2. Winograd Schema Challenge 

Another test that goes beyond the Turing Test is the Winograd Schema Challenge. It tests whether a machine can resolve contextual ambiguity in language. For example, when someone says, ‘The table doesn't fit through the door because it's too big,’ you, as a human, understand that ‘it’ refers to the table and not to the door. Machines like ChatGPT can have trouble grasping such subtleties of meaning, though they are already making considerable progress in many cases.
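To make the idea more concrete, here is a minimal Python sketch of how such a test item could be represented and scored. It is an illustration only, not the official challenge format: the names `WinogradItem`, `first_mentioned` and `accuracy` are hypothetical, and the second sentence (‘too narrow’) is an assumed twin of the table example, added here to show how changing a single word flips the correct referent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class WinogradItem:
    """One ambiguous sentence plus the two things the pronoun could refer to."""
    sentence: str
    pronoun: str
    candidates: tuple[str, str]
    answer: str


# The first item is the example from the article; the second is an assumed
# twin sentence in which one changed word flips the correct answer.
ITEMS = [
    WinogradItem(
        "The table doesn't fit through the door because it's too big.",
        "it", ("table", "door"), "table",
    ),
    WinogradItem(
        "The table doesn't fit through the door because it's too narrow.",
        "it", ("table", "door"), "door",
    ),
]

Resolver = Callable[[WinogradItem], str]


def first_mentioned(item: WinogradItem) -> str:
    """Naive baseline: always pick the candidate that appears first."""
    return min(item.candidates, key=item.sentence.find)


def accuracy(resolver: Resolver, items: list[WinogradItem]) -> float:
    """Fraction of items for which the resolver names the right referent."""
    correct = sum(resolver(item) == item.answer for item in items)
    return correct / len(items)


if __name__ == "__main__":
    print(accuracy(first_mentioned, ITEMS))
```

Because the two sentences differ in only one word, a purely positional shortcut like `first_mentioned` gets exactly one of them right. Doing better requires some grasp of what ‘big’ and ‘narrow’ mean for tables and doors.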


3. Coffee Test by Steve Wozniak 

A suggestion by Apple co-founder Steve Wozniak aims to test a machine in the physical world. The so-called ‘coffee test’ demands that a machine should be able to make coffee in a stranger's kitchen by exploring the room, finding the necessary tools, and making the coffee. ChatGPT and other text-based AIs don't stand a chance here – they exist purely in language and have no physical interaction skills whatsoever.


Historical Milestones in AI Development 

Here are the most important AIs and machines that are considered milestones in the history of artificial intelligence. These examples have advanced the technology, but none of them have passed the Turing test – which shows that there is still a long way to go before a machine can be considered ‘thinking’.


1. ELIZA (1966)

  • Developer: Joseph Weizenbaum

  • Ability: ELIZA was one of the first programmes to simulate human conversation. It worked in the manner of a Rogerian therapist, repeating and rephrasing questions.

  • Distinguishing feature: Many users initially believed they were talking to a real person until the simple mechanisms behind ELIZA became clear. The responses seemed so realistic that ELIZA is an early example of possible ‘deception’ by AI.

  • Turing test: ELIZA could not pass the Turing test because its answers were too repetitive and rigid.
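
The ‘repeating and rephrasing’ behaviour described above can be sketched in a few lines of Python. This is a simplified illustration in the spirit of ELIZA, not Weizenbaum's original script: the rules, the pronoun table and the function names are all assumptions made for this example.

```python
import re

# Pronoun swaps so the user's words can be mirrored back ("my" -> "your").
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your", "am": "are",
    "you": "I", "your": "my",
}

# (pattern, response template) pairs, checked in order. Hypothetical rules.
RULES = [
    (re.compile(r"\bi feel (.*)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.*)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.*)", re.IGNORECASE), "Tell me more about your {0}."),
]

# Used whenever no rule matches, which is where the repetitiveness shows.
FALLBACK = "Please go on."


def reflect(fragment: str) -> str:
    """Swap first- and second-person words in the captured fragment."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(statement: str) -> str:
    """Turn the user's statement into a question using the first matching rule."""
    cleaned = statement.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    return FALLBACK


if __name__ == "__main__":
    print(respond("I feel sad about my job."))
    print(respond("The weather is nice."))
```

The first input is mirrored back as a question, while the second matches no rule and receives the generic fallback. After a few turns the same handful of templates keeps recurring, which is the rigidity the Turing test point above refers to.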


2. Deep Blue (1997)

  • Developer: IBM

  • Ability: Deep Blue was the first AI to beat a world chess champion, Garry Kasparov. It could calculate millions of moves per second and used specialised chess algorithms.

  • Significance: Deep Blue's victory over Kasparov was an important step because it showed that machines could beat the best human players in a specific, highly regulated domain (like chess).

  • Turing test: Deep Blue was specialised in chess and had no general conversation skills. It would not have passed the Turing test.


3. Watson (2011)

  • Developer: IBM

  • Ability: Watson won the quiz show Jeopardy! against two of the best human players. It used machine learning and the analysis of language nuances to answer complex questions from different fields of knowledge.

  • What makes it special: Watson was able not only to retrieve facts but also to understand questions involving puns and double entendres, marking a milestone in natural language processing.

  • Turing test: Despite its impressive performance on Jeopardy!, Watson would not have passed the Turing test because it was specialised in facts and could not hold a general human conversation.


4. Siri (2011) and other voice assistants

  • Developer: Apple (Siri), Google (Assistant), Amazon (Alexa)

  • Ability: Voice assistants such as Siri, Google Assistant or Alexa can respond to voice input, answer questions, perform tasks such as creating appointments and retrieve general information.

  • Special feature: These technologies made artificial intelligence accessible for everyday use. They simulate conversations and engage in simple dialogues with users.

  • Turing test: Despite their extensive abilities in everyday conversations, these assistants can still be recognised as machines in deeper or emotional conversations and fail the Turing test.

 

5. AlphaGo (2016)

  • Developer: DeepMind (Google)

  • Ability: AlphaGo defeated the world champion in Go, a strategically highly complex board game with many more possible moves than chess. The AI used machine learning and neural networks to develop and improve its own strategies.

  • What makes it special: AlphaGo's victory over humans was groundbreaking because Go requires much more complex thought patterns than chess. AlphaGo learned through millions of games and was able to make unpredictable moves.

  • Turing test: AlphaGo was specialised in the game of Go and was not able to hold human-like conversations. It did not pass the Turing test.

     

6. GPT-3 (2020)

  • Developer: OpenAI

  • Ability: GPT-3 is a language model capable of generating human-like texts. It can answer questions, compose texts, write stories and even perform creative tasks such as writing poems and other literary works.

  • Special feature: GPT-3 represents a huge step forward because it was trained on a huge amount of data and can create very natural-sounding texts. It is able to respond to almost any conceivable context.

  • Turing test: GPT-3 can simulate deceptively real conversations in some cases, but longer or emotionally complex dialogues still show its machine limitations.


7. LaMDA (2021)

  • Developer: Google

  • Ability: LaMDA (Language Model for Dialogue Applications) was developed specifically to hold human-like conversations. It has been trained on dialogues and can respond to language input in a variety of ways, including hypothetical scenarios and personal opinions.

  • Special feature: LaMDA is impressive in its ability to maintain natural conversations with a consistency that goes beyond simple question-and-answer patterns. It can hold longer dialogues and shows a high degree of linguistic flexibility.

  • Turing test: LaMDA was developed to pass the Turing test in terms of conversations, but here too, there are still limits, especially when it comes to deeper emotional interactions or questions of self-awareness.


These milestones in AI development have all made important contributions to the advancement of artificial intelligence. Each of these machines and systems was revolutionary in its field, but none of them could fully pass the Turing test because they all lack real self-reflection, consciousness and emotional intelligence. They show how impressive AIs can be in specialised tasks, but also how far we still are from a machine that can truly think and act.


Conclusion: How far are we from passing the Turing Test?

The Turing Test remains a fascinating goal in AI research. Even though modern systems like ChatGPT or the Deep Dive podcast may seem very human to us, they still show their machine roots in deeper interactions. Whether it's a lack of emotional depth, a lack of creativity or an inability to understand the physical world, there is still a long way to go before we reach true human intelligence. Until then, the Turing test remains a benchmark by which we can measure artificial intelligence and recognise its limitations.



AUTHOR

Tara Bosenick

Tara has been active as a UX specialist since 1999 and has helped to establish and shape the industry in Germany on the agency side. She specialises in the development of new UX methods, the quantification of UX and the introduction of UX in companies.


At the same time, she has always been interested in developing a corporate culture in her companies that is as ‘cool’ as possible, in which fun, performance, team spirit and customer success are interlinked. She has therefore been supporting managers and companies on the path to more New Work / agility and a better employee experience for several years.


She is one of the leading voices in the UX, CX and Employee Experience industry.
