The Imitation Game, the World War II biopic about mathematician Alan Turing and his attempt to crack the Enigma code, has garnered significant buzz, recently receiving an Oscar nomination for Best Picture.
The Turing Test, introduced by Turing in his 1950 paper “Computing Machinery and Intelligence,” is a test of a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.
The test as Turing described it is based on a parlor game called the Imitation Game, in which a man pretends to be a woman or a woman pretends to be a man. The point of the game was to cleverly fool the other players into believing you were someone else.
The film’s release has reminded me of the confusing and inaccurate ways in which people have tried to apply the Turing Test.
The Turing Test as applied today is fairly narrow in scope and has become a metric that people try to game rather than a genuine benchmark. From Kenneth Colby’s 1972 program PARRY, which simulated a paranoid schizophrenic, to the most recent attempt, Eugene, a chatbot posing as a 13-year-old Ukrainian boy for whom English was a second language, people game the rules by providing an “excuse” for the system’s inability to participate in a real conversation.
For example, with PARRY, the explanation for odd and inconsistent behavior was simply that the “person” you were talking with was genuinely paranoid. Bizarre and random answers were considered par for the course when talking with someone who had delusions about bookies, the mob and FBI informants.
Likewise, Eugene is excused from having knowledge of Western pop culture because “he” has had so little experience with it. And what of the fact that he couldn’t put together a coherent sentence in English? Well, that is explained away because English is Eugene’s second language.
But even if everyone stopped trying to game the Test and instead took it seriously, the reality is that it is at best a proxy for us (as it was for Turing at the time) to understand something we don’t understand. Even today, we simply don’t have a real way to understand, evaluate or measure intelligence. And neither did Turing.
In fact, Turing proposed the test as a way to avoid defining intelligence. His stance was that the question of what constituted intelligence was simply too hard to answer, so why not try this test instead? Like an HR person interviewing a potential employee, he was looking for a method that would let him measure something in the hope that it would correlate with being intelligent.
But it turns out that, just as giving a great answer to the question ‘What is your greatest weakness?’ has little to do with your ability to manage a quality-control team, fooling someone into believing you are human doesn’t make you intelligent. A great interview does not always make a great worker.
It is odd that the only measure we can imagine for intelligence is to be just like us.
The real test should focus on what these devices can do that, in turn, enhances our productivity and makes us smarter. Can they synthesize information and build characterizations of the world that make sense to us? Can they communicate with us? Can they explain the world of massive data and analysis that we cannot understand in our own terms? Can they act as a bridge between data and our world of ambiguity and messy language? That is, can they explain things to us even when we know full well that they are not 13-year-old Ukrainian kids?
If they can do this, then they will be intelligent.