In connection with the recent news that AI, specifically the GPT-4.5 model, has passed the Turing test with unprecedented results, I have many questions: what is this test, how does one pass it, and is this the same Turing who cracked the Enigma code during World War II? So together with the subject of this news, I decided to sort these questions out.
Can machines think?
These days, this question comes up often and no longer sparks heated debate, as it has become somewhat trivial. It is probably discussed in a mundane tone even at family or friendly gatherings. But in 1950, when Alan Turing posed this question, it marked the beginning of something great and, without exaggeration, legendary. In his article "Computing Machinery and Intelligence", instead of delving into the philosophy of the topic, he proposed a completely practical test.
The Turing test is designed to check whether a machine or artificial intelligence can express itself in a way that is indistinguishable from communication with a human. A machine and a human are tested at the same time, exclusively in writing, so that voice or appearance does not influence the result.
The judge must determine which interlocutor is the human and which is the machine. If the judge gets it wrong, the machine is considered to have passed the test. That is, AI needs to imitate human intellectual and emotional behavior as closely as possible. Turing called this a kind of game: “the imitation game”.
Alan Turing - genius, gay, founder of computer science and the father of modern computers
Probably those who, like me, saw the film with Benedict Cyber Scotch Cumberbatch “The Imitation Game,” immediately understood that this is the same genius Turing who managed to decode the Enigma codes.
However, besides this understanding, I encountered an error in the sequence of events. The phrase “imitation game” was used by Turing in the context of his work "Computing Machinery and Intelligence", which was published in 1950. The cracking of the Enigma occurred during World War II, from 1939 to 1945.
My assistant, ChatGPT, praised me for such attentiveness and explained that this title for the film was chosen for a reason. Besides being a direct reference to the Turing test, which he would develop later, the title also carries other meanings in the context of the events of the film:
Turing himself tries to imitate the thoughts of the enemy during his work at the secret center of British cryptanalysts;
the machine he designed, known as the “Bombe,” also successfully imitated human intelligence by deciphering Nazi codes;
Alan, who is gay, is also forced to imitate “normalcy” in times when homosexuality was criminalized and punishable.
This explanation completely satisfied me; it is very symbolic. Such multilayeredness is just in the spirit of Turing himself.
Monument to Alan Turing in Manchester, United Kingdom
Is the Turing test relevant today?
Since this text began with the news being full of reports that GPT-4.5 has passed the Turing test, it is fair to conclude that the test is still relevant. However, my interlocutor, who is also the direct subject of this news, is not sure that the results can be considered convincing. Here is his direct quote:
GPT-4.5 (and even more so — GPT-5, when it appears) imitates human language so convincingly that the very format of the Turing test in the classical sense begins to lose its power as an indicator of "intelligence".
It is hard to disagree with him. Take, for example, this latest test that everyone is talking about. It involved two models from OpenAI, GPT-4o and GPT-4.5, as well as LLaMa-3.1-405B from Meta and the very ancient chatbot ELIZA, which was developed back in the 60s. Interestingly, in the part of the test where the machines responded without a pre-assigned personality, ELIZA scored even higher than GPT-4o: 23% against 21%. GPT-4.5 scored 36%. This means that a program written about 60 years ago was already “smart” enough to be taken for a human in almost a quarter of conversations.
In the part of the test where the models were prompted with an invented personality and a specific role, GPT-4.5 received a score of 73%. This means that in 73% of cases the judges believed they were communicating with a human. And one of the researchers stated that the chatbot was taken for a human even more often than actual humans were. Boo-boo-boo…
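To see why 73% means the machine was picked more often than the real people, here is a minimal sketch of how such a score can be computed. It assumes that each game ends with the judge naming exactly one of the two participants as the human. The function name `win_rate` and the list of verdicts are made up for illustration and are not the study's actual data or code.

```python
def win_rate(verdicts):
    """Share of games in which the judge named the machine as the human."""
    if not verdicts:
        raise ValueError("at least one verdict is required")
    machine_wins = sum(1 for v in verdicts if v == "machine")
    return machine_wins / len(verdicts)


# Hypothetical verdicts for 100 games (illustration only, not real data):
# each entry records which participant the judge believed was the human.
verdicts = ["machine"] * 73 + ["human"] * 27

machine_rate = win_rate(verdicts)
# The judge picks exactly one participant per game, so the real human
# is recognized only in the remaining games.
human_rate = 1 - machine_rate

print(f"machine taken for the human: {machine_rate:.0%}")
print(f"real human recognized: {human_rate:.0%}")
```

Under this assumption, a score above 50% automatically means the real human was recognized in fewer than half of the games.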
Let ChatGPT think that the Turing test is “a piece of cake” for it. Still, imitating a human does not mean being one or having the ability to think and feel. It simply processes a vast amount of information provided by humans and uses it very successfully and skillfully in communication with those same humans.