Multimodal interaction – How machines learn to understand pointing

Pointing at subjects and objects – be it with language, gestures, or the eyes alone – is a very human ability. Smart multimodal assistants, such as the one in your car, now account for these forms of pointing, making interaction more human-like than ever before. Made possible by image recognition and Deep Learning technologies, this will have significant implications for the autonomous vehicles of the future.

Smart multimodal assistants, such as Nuance Dragon Drive, now also include gaze detection based on eye-tracking

As we learn more about the biological world around us, the list of things only humans can do has dwindled – and that was before computers started to play chess and Go. Counting? Birds can deal with numbers up to twelve. Using tools? Dolphins in Shark Bay, Australia, use sponges as tools for hunting. Against this background, it may come as a surprise how specifically human pointing is: although it seems natural and easy to us, not even chimpanzees, our closest living relatives, can muster more than the most trivial forms of pointing. So how could we expect machines to understand it?

Three forms of pointing

In 1934, the linguist and psychologist Karl Bühler distinguished three forms of pointing, all connected to language. The first is pointing “ad oculos,” that is, within the field of visibility centered on the speaker (“here”) and also accessible to the listener. While we can point within this field with our fingers alone, languages offer a special set of pointing words to complement this (“here” vs. “there”; “this” vs. “that”; “left” and “right”; “before” and “behind”; etc.). The second form of pointing operates in a remembered or imagined world, brought about by language (“When you leave the Metropolitan Museum, Central Park is behind you and the Guggenheim Museum is to your left. We will meet in front of that”). The third form is pointing within language: as speech is embedded in time, we often need to point back to something we said a little earlier or point forward to something we will say later. In a past blog post, I described how the anaphoric use of pointing words (“How is the weather in Tokyo?” “Nice and sunny.” “Are there any good hotels there?”) can be supported in smart assistants (and how this capability distinguishes the smarter assistants from the not-so-smart). And the first mode, pointing at elements in the visible vicinity, is now also available in today’s smart assistants.

First automotive assistants to support “pointing”

At CES in Las Vegas this month, we demonstrated how drivers can point to buildings outside the car and ask questions like, “What are the opening hours of that shop?” But the “pointing” doesn’t need to be done with a finger. With the new technology, you can simply look at the object in question, something made possible by eye gaze detection based on camera tracking of the eyes. This technology imitates human behavior, as humans are very good at guessing where somebody is looking just by observing his or her eyes.
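
To make the idea concrete, here is a minimal Python sketch of how a gaze direction could be matched to a building. It is an illustration only, not the method used in the demonstrated system: the `Poi` class, the `pick_gaze_target` function, the flat local coordinates relative to the car, and the 10-degree tolerance are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Poi:
    """A hypothetical point of interest, positioned relative to the car."""
    name: str
    x: float  # metres east of the car
    y: float  # metres north of the car


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def pick_gaze_target(gaze_bearing_deg, pois, max_offset_deg=10.0):
    """Return the POI whose bearing is closest to the gaze bearing.

    Bearings are compass-style: 0 is north, angles grow clockwise.
    Returns None when nothing lies within max_offset_deg of the gaze.
    """
    best = None
    best_offset = max_offset_deg
    for poi in pois:
        # atan2(east, north) gives a clockwise bearing measured from north.
        bearing = math.degrees(math.atan2(poi.x, poi.y)) % 360.0
        offset = angular_difference(gaze_bearing_deg, bearing)
        if offset <= best_offset:
            best, best_offset = poi, offset
    return best


# Made-up surroundings: a bakery to the north-east, a bank to the north-west.
surroundings = [Poi("bakery", 10.0, 10.0), Poi("bank", -10.0, 10.0)]
target = pick_gaze_target(50.0, surroundings)
print(target.name if target else "nothing in view")
```

A real system would also have to deal with distance, occlusion, and the noise in the eye tracker; the tolerance parameter here is only a crude stand-in for that uncertainty.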

Biologists suggest that the distinct shape and appearance of the human eye (a dark iris against a contrasting white surrounding) is no accident, but a product of evolution that facilitates gaze detection. Artists have exploited this for many centuries: with just a few brush strokes, they can make figures in their paintings look at other figures or even outside the picture – including at the viewer of the painting. Have a look at Raphael’s Sistine Madonna, which is displayed in Dresden, and see how the figures’ viewing directions make them point at each other and how that guides our view.

Raphael, Sistine Madonna (Gemäldegalerie Alter Meister, Dresden, 1513-14. Oil on canvas, 265 x 196 cm)

By Raphael – Google Art Project: Public Domain

Multimodal interaction: When speech, gesture, and handwriting work hand in hand

Machines can also do this, based on image recognition and Deep Learning. These capabilities, which come out of our cooperation with DFKI, will bring us into the age of truly multimodal assistants. It is important to remember that “multimodal” does not just mean you have a choice between modalities (typing OR speaking OR handwriting on a pad to enter the destination into your navigation system), but that multiple modalities work together to accomplish one task. For example, when someone points to something in the vicinity (modality 1) and says, “tell me more about this” (modality 2), both modalities are needed to understand what that person wants to accomplish.
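
The point that neither modality is sufficient on its own can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions, not a description of any real assistant: the `fuse` function, the small `DEICTIC_WORDS` set, and the result dictionary are hypothetical names invented for this example, and treating every “that” as a pointing word is a crude shortcut.

```python
import re

# Assumption: a tiny list of pointing words; real systems use far richer models.
DEICTIC_WORDS = {"this", "that", "here", "there"}


def fuse(utterance, pointed_object=None):
    """Combine a spoken request (modality 2) with a pointing target (modality 1).

    If the utterance contains a pointing word but nothing was pointed at,
    the request cannot be resolved from speech alone.
    """
    words = re.findall(r"[a-z']+", utterance.lower())
    needs_target = any(word in DEICTIC_WORDS for word in words)
    if needs_target and pointed_object is None:
        return {"status": "needs_clarification", "utterance": utterance}
    return {
        "status": "resolved",
        "utterance": utterance,
        "target": pointed_object if needs_target else None,
    }


# Speech plus a pointing target resolves the request ...
print(fuse("Tell me more about this", "Guggenheim Museum"))
# ... while the same words without a target leave it open.
print(fuse("Tell me more about this"))
```

The design choice worth noting is that the pointing target is only consulted when the utterance calls for it, so a self-contained question such as “How is the weather in Tokyo?” is handled by speech alone.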

Multimodal interaction – a key feature for Level 4 and 5 autonomous vehicles?

While it is obvious why such a capability is attractive to today’s drivers, there are hints that it might become even more important as we enter the age of autonomous vehicles. Many people wonder what drivers will do when they no longer have to drive, as will be the case at Levels 4 and 5 of the autonomous driving scale. Some studies indicate that the answer may be: not that much, actually. For example, a 2016 German study asked people about the specific advantages they perceived in such vehicles, and “… that I can enjoy the landscape” came out as the top choice at all levels of autonomy. It’s not too difficult to imagine a future with gaze and gesture detection, combined with a “just talk” mode of speech recognition – one where you can ask “what is that building?” without having to press a button or say a keyword first. Such a future would give users of autonomous vehicles what that study suggests they want. And for today’s users of truly multimodal systems, machines just got a little more human-like again.
