It’s time to retire the Turing Test
It was certainly an attention-grabbing headline: “Tech giant puts engineer on leave after he declares software to be sentient”. As the Washington Post reported, Google engineer Blake Lemoine was sent home after making the startling claim that an artificial intelligence project he was working on had come to life, and had told him so itself. Dramatic stuff.
Of course, it’s not the first time someone has mistaken machine output for human or human-like intelligence, and it is not terribly surprising either when you consider that conversational AIs are often designed precisely to seem as though they are responding as a human would. The first conversation simulator, Eliza, developed between 1964 and 1966 and modelled on a Rogerian psychotherapist, did a similar job, responding to typed sentences with requests for more information and the occasional blandishment. Eliza wasn’t intelligent at all; it merely mimicked a subset of how humans communicate and left us to fill in the gaps, which we did.
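The trick Eliza relied on can be illustrated in a few lines: match a keyword pattern, reflect first-person words back at the speaker, and slot the result into a canned template. The rules below are a minimal sketch for illustration, not Weizenbaum’s original script.

```python
import re

# Swap first-person words for second-person ones, so "my job" echoes
# back as "your job". These few entries are illustrative only.
REFLECTIONS = {"i": "you", "my": "your", "am": "are", "me": "you"}

# Each rule pairs a keyword pattern with a canned response template.
RULES = [
    (re.compile(r"i feel (.+)", re.I), "Why do you feel {0}?"),
    (re.compile(r"i am (.+)", re.I), "How long have you been {0}?"),
    (re.compile(r"my (.+)", re.I), "Tell me more about your {0}."),
]

def reflect(phrase: str) -> str:
    """Replace first-person words with their second-person counterparts."""
    return " ".join(REFLECTIONS.get(w.lower(), w) for w in phrase.split())

def respond(sentence: str) -> str:
    """Return a canned response for the first matching rule."""
    for pattern, template in RULES:
        match = pattern.search(sentence)
        if match:
            return template.format(reflect(match.group(1)))
    # No keyword matched: fall back to a contentless prompt.
    return "Please go on."

print(respond("I feel my work is pointless"))
# → Why do you feel your work is pointless?
```

There is no understanding anywhere in this loop, only string substitution; the apparent empathy is supplied entirely by the person typing.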
Arguably, Eliza says more about the emptiness of much psychotherapy than it does about machine intelligence, but that’s a question for another day. Suffice it to say that the ability to mimic human answers is no more proof of sentience than was Tay, the Twitter chatbot Microsoft had to shutter after internet users ‘taught’ it to be racist.
Doubtless, Google’s conversational AI LaMDA is a good deal more versatile than the mere conversation simulators of the past, but just because a machine tells you it is alive doesn’t mean that it is. The clue, in fact, is in LaMDA’s name: Language Model for Dialogue Applications.
The problem is that our model for assessing a machine’s ability to exhibit intelligence, the Turing Test, also known as the imitation game, has been stretched too thin.
Devised by British computer scientist Alan Turing, the test has a human converse, via a screen, with an unseen interlocutor that might be either another human or a machine. If the human cannot reliably tell which of the two they are talking to, then the machine is displaying behaviour indistinguishable from that of a human, and therefore from that of a thinking being.
The problem, though, is that the Turing test does not test for intelligence at all, nor was it intended to. In designing the test, Turing deliberately sidestepped the philosophical question of what thinking is, and while his intention was never that machines should be built to fool humans into believing they think, much AI today is designed to do something very close to just that.
Paul Sweeney, head of product at Webio, the Dublin-based conversational AI developer, said other AI projects had already run up against this, including another from Google: Duplex.
“If this test were run today, there is a good chance that it would take a few minutes for someone to discover that they were not conversing with a real person. The machine would give itself away by not understanding a crucial language game of some kind. The language disfluencies used by [the intelligent assistant application] Google Duplex mimic human pauses and are skeuomorphic, and in their way a form of language game. They mimic the idea that the computer is searching for ideas or for the right words, where clearly it is doing no such thing,” he said.
Sweeney said that one response to this might be to reassess our definition of thought.
“We know that a great imitator is not thinking like we think, but perhaps we need to open our own definitions of what thinking and cognition truly are before setting down sets of hard rules that set out to test for them,” he said.
Nonetheless, while AIs such as GPT-3 and DALL-E 2 are already capable of incredible feats, they are neither intelligent nor alive. But perhaps Lemoine, in his rush to declare AI sentient, has done us all a favour. Though the prospect is as far away as ever, it is overwhelmingly likely that if we ever did create a true AI, assuming such a thing is even possible, we would regret it.