Innovating dialogue: How machines make sense of anaphora

This post is part of a series that explores the unique complexities of human speech and, consequently, how we create systems that appropriately take these complexities into account when interacting with users. In part three, we examine how machines interpret anaphora, a rhetorical device with Greek roots.


Rhetorical devices are commonly used in our speech, and while we naturally come to use, recognize, and understand them in our daily lives, machines must be taught to do the same. Last month I introduced you to how machines make sense of paraphrasing and adult language, and today we’re going to explore another example: anaphora.

By definition, anaphora is related to ‘pointing’ within a text. More specifically, it’s the use of a word – like a pronoun – that serves as a reference to another word used earlier in the text. Take the example below:

[Image: a sample dialogue in which the system’s answer mentions a charging station next to Luigi’s Pizza Palace, and the user’s follow-up refers back with “they.”]


(The opposite – pointing to a future part of the text – also exists, and is called cataphora. An example would be: “I won’t show it to you, but I bought a fantastic new shirt,” where “it” refers to the (yet to be mentioned) shirt. But this is a much less frequently observed phenomenon than anaphora, so let’s focus on the latter.)

Resolving anaphora means finding the so-called antecedent. In our example, this is “Luigi’s Pizza Palace;” the referring expression is “they,” and a challenge here is not to equate “they” with the set consisting of “charging station” and “Luigi’s Pizza Palace,” but to know instead that the speaker is referring to “Luigi’s Pizza Palace” only.

So, how can automatic systems do this? This is a topic several of our researchers explored in detail at the 2015 Nuance Research Conference. A first approach exploits the fact that certain syntactic rules and conditions apply to anaphor–antecedent pairs – in particular, that they agree in gender, number, and other features. So a common method is to search leftward through the text and pick the first entity that fulfills these conditions. See here:

User: Find me a place to eat in my vicinity.

System: I found a nice steakhouse but there are two traffic jams on the way.

User: Is it expensive? / How long are they?

In the first user follow-up, “it” is singular, and so is the antecedent (“a nice steakhouse”), which can therefore be identified; in the second, the plural “they” leads to the “traffic jams” (also plural). In this example the antecedent was found in the system response, but in other cases anaphora can also refer back to entities in the original user question:

User: How far is it to the next gas station?

System: 20 miles.

User: Does it have a shop?
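This leftward search with agreement checks can be sketched in a few lines of Python. This is an illustrative toy, not the algorithm inside Dragon Mobile Assistant; the feature names and the `resolve` function are invented for the example:

```python
# Toy agreement-based anaphora resolution: walk back through prior
# mentions and return the first one whose features match the pronoun's.

PRONOUN_FEATURES = {
    "it": {"number": "singular"},
    "they": {"number": "plural"},
    "he": {"number": "singular", "gender": "masculine"},
    "she": {"number": "singular", "gender": "feminine"},
}

def resolve(pronoun, mentions):
    """Search right-to-left (nearest mention first) for the first
    entity that agrees with the pronoun in all required features."""
    required = PRONOUN_FEATURES[pronoun]
    for mention in reversed(mentions):
        if all(mention["features"].get(k) == v for k, v in required.items()):
            return mention["text"]
    return None  # no compatible antecedent found

# Dialogue history from the restaurant example, oldest first.
history = [
    {"text": "a nice steakhouse", "features": {"number": "singular"}},
    {"text": "two traffic jams",  "features": {"number": "plural"}},
]

print(resolve("it", history))    # -> a nice steakhouse
print(resolve("they", history))  # -> two traffic jams
```

A real system would track more features (gender, animacy, recency weighting), but the basic shape – filter candidates by agreement, prefer the nearest survivor – is the same.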

Many use cases in Nuance’s Dragon Mobile Assistant today are based on variants of this algorithm. However, there are more complex examples that can’t be solved with just syntax. Instead, you need to build a ‘bridge’ from anaphor to antecedent involving knowledge or semantics. Look at the following example:

Mary stopped the car. She opened the door.

You know that “the door” refers back to “the car” (or rather to its door, not to some completely different door) because you know that cars have doors. If your system has a knowledge base that contains such “has part” relations, then it can also resolve such anaphora. For some cases you also need to go a little deeper in your syntactic analysis:

Find a radio station with pop music, not a Spanish one, and switch to it.

Neither the syntactic matching rule (the antecedent must be singular) nor the semantic analysis (the antecedent must be a possible object of ‘switching to’) can decide which of the two potential candidates (“radio station with pop music,” “a Spanish one”) is the right one. What’s more, as “a Spanish one” is closer to “it,” there is a high risk that a naïve algorithm would choose it. Only when you take into account that “a Spanish one” is hidden in an embedded phrase can you know that “radio station with pop music” is the correct antecedent. We’re working on building systems that can interpret and act upon these cases in our labs today.
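Layering these constraints – agreement, semantic compatibility, and structural position – can again be sketched as a toy filter. The feature names (`semantic_type`, `embedded`) and the `SWITCHABLE` set are invented for illustration:

```python
# Toy resolver for "switch to it": the antecedent must be singular,
# must be something one can switch to, and must not be buried in an
# embedded (here: negated) phrase.

SWITCHABLE = {"station", "channel", "mode"}  # assumed selectional restriction

def resolve_it(candidates):
    """Return the nearest candidate passing all three constraints."""
    for c in reversed(candidates):  # nearest mention first
        if (c["number"] == "singular"
                and c["semantic_type"] in SWITCHABLE
                and not c["embedded"]):
            return c["text"]
    return None

# Mentions from "Find a radio station with pop music, not a Spanish
# one, and switch to it", left to right.
candidates = [
    {"text": "a radio station with pop music",
     "number": "singular", "semantic_type": "station", "embedded": False},
    {"text": "a Spanish one",
     "number": "singular", "semantic_type": "station", "embedded": True},
]

print(resolve_it(candidates))  # -> a radio station with pop music
```

Note that agreement and semantics alone would let “a Spanish one” through; only the `embedded` flag – which a deeper syntactic parse has to supply – steers the naïve nearest-first search to the right antecedent.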

Stay tuned for my next post about how machines make sense of ellipsis.


Explore other posts in this series:

Innovating machine dialog: Brush up on your Greek and read Aristotle

Innovating dialog: How machines make sense of paraphrasing and adult language



Nils Lenke

About Nils Lenke

Nils joined Nuance in 2003, after holding various roles for Philips Speech Processing for nearly a decade. Nils oversees the coordination of various research initiatives and activities across many of Nuance’s business units. He also organizes Nuance’s internal research conferences and coordinates Nuance’s ties to Academia and other research partners, most notably IBM. Nils attended the Universities of Bonn, Koblenz, Duisburg and Hagen, where he earned an M.A. in Communication Research, a Diploma in Computer Science, a Ph.D. in Computational Linguistics, and an M.Sc. in Environmental Sciences. Nils can speak six languages, including his mother tongue German, and a little Russian and Mandarin. In his spare time, Nils enjoys hiking and hunting in archives for documents that shed some light on the history of science in the early modern period.