In what Winston Churchill called the single biggest contribution to the Allied victory over Nazi Germany during the Second World War, the breaking of the German naval Enigma code was a feat of international importance and technological prowess. The man behind the code breaking, Alan Turing, is now known as one of the most significant mathematicians, logicians, cryptanalysts and computer scientists of modernity, credited both with enabling the defeat of the Nazis and with being “the Father of Theoretical Computer Science and Artificial Intelligence.” And while it is undoubtedly true that the former contribution holds a vital significance in our world, it is his second role, as the father of theoretical computer science and artificial intelligence (AI), that has resurfaced in relevance today.
For many, Turing’s name is now familiar because of the recent flurry of news articles regarding the Turing Test. The Turing Test, proposed by Alan Turing in 1950, essentially measures an artificially intelligent system by pitting a human against a computer program. In his 1950 paper “Computing Machinery and Intelligence,” Turing describes a sort of game to be played by a human, a machine, and an interrogator, ultimately meant to answer the pivotal question: Can machines think? If the program can convince the interrogator that it is human, the program has passed the Turing Test and is said to have reached the intellectual capacity of a human.
But among the plethora of articles declaring the passing of the Turing Test and the intellectual supremacy of Eugene Goostman, the computer program modeled after a 13-year-old boy rumored to have passed the test, a similarly loud contingent declares that the test has, in fact, not been passed. Many argue that the test was weakened too much in order to accommodate a program modeled on an adolescent boy, and that the program still relies heavily on trickery to fool the interrogator.
As the voices questioning the validity of Eugene’s achievement get louder and louder, research scientists from Nuance offer a robust and exciting alternative: the Winograd Schema Challenge, a test that measures artificial intelligence more accurately, developed by Hector Levesque of the University of Toronto, winner of the 2013 IJCAI Award for Research Excellence. While the Turing Test asks for short, free-form conversation, the Winograd Schema approach poses a set of multiple-choice questions. The answers to these questions may seem easy to a mere human, but for a computer they can be exceedingly difficult. These questions therefore offer a more accurate measure of genuine machine intelligence.
So, what exactly would this test look like? I’ll give you an example, and you can take the Winograd test yourself.
Question: The beach chair will not fit in the trunk because it was too big. What was too big?
Answer A: The beach chair
Answer B: The trunk
Now for humans, the answer seems obvious: Answer A must be correct, because the chair must be too big if it doesn’t fit in the trunk. It turns out, however, that most AI systems lack the everyday commonsense knowledge and reasoning needed to support such simple conclusions. A computer that is truly as smart as a human (or even close to it) must be able to answer questions such as these, and thus pass the Winograd Schema test.
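To make the format concrete, here is a minimal sketch of how such a question might be represented in code, along with a naive baseline. The field names and structure below are our own illustration, not an official format of the challenge; the point is that with two candidate referents per question, a program that guesses blindly scores only around 50%, so beating chance requires real commonsense reasoning.

```python
from dataclasses import dataclass

# Illustrative representation of one Winograd schema question.
# These field names are invented for this sketch, not part of the challenge.
@dataclass
class WinogradSchema:
    sentence: str              # statement containing an ambiguous pronoun
    pronoun: str               # the pronoun to resolve
    candidates: tuple          # the two possible referents (Answer A, Answer B)
    answer: int                # index of the correct referent

example = WinogradSchema(
    sentence="The beach chair will not fit in the trunk because it was too big.",
    pronoun="it",
    candidates=("the beach chair", "the trunk"),
    answer=0,  # humans resolve "it" to the chair using commonsense knowledge
)

def guess_first(schema: WinogradSchema) -> int:
    """Naive baseline: always pick Answer A.

    Over a balanced set of schemas this scores about 50%, since each
    question offers exactly two choices -- no better than a coin flip.
    """
    return 0

print(guess_first(example) == example.answer)
```

On this single example the naive baseline happens to be right, which is exactly why the challenge uses a large, balanced set of questions: only systems with genuine commonsense reasoning can be right consistently.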
There has been renewed interest in AI and Natural Language Processing (NLP) as a means of humanizing the complex technological landscape that we encounter in our day-to-day lives. The Winograd Schema Challenge provides us with a tool for concretely measuring research progress in commonsense reasoning, an essential element of our intelligent systems. Competitions such as the Winograd Schema Challenge can help guide more systematic research efforts that will, in the process, allow us to realize new systems that push the boundaries of current AI capabilities and lead to smarter personal assistants and intelligent systems.
Nuance is now hosting an annual competition to develop programs that can solve the Winograd Schema Challenge and advance the intelligence and technology of our artificially intelligent systems.
The test will be administered on a yearly basis by CommonsenseReasoning.org. The first submission deadline will be October 1, 2015. The 2015 Commonsense Reasoning Symposium, to be held at the AAAI Spring Symposium at Stanford from March 23-25, 2015, will include a special session for presentations and discussions on progress and issues related to the Winograd Schema Challenge. A winner that meets the baseline for human performance will receive a grand prize of $25,000. With a little more than a year to go, get started now on a new program to beat out 13-year-old Eugene and pass this new test of artificial intelligence: the Winograd Schema Challenge.