Posted at: May 26, 2018, 12:53 AM; last updated: May 26, 2018, 12:53 AM (IST)

Your twin’s here

Google Duplex imitates a human while on the phone. It will soon roll out for testing

Vaibhav Sharma

A few years ago, when voice commands were a novelty rather than the norm, we waited for the day when we could have a conversation with a machine that would understand context and give intelligent replies to follow-up questions. There was a time when a device would respond only if the question was asked in a specific manner, and you had to speak slowly and deliberately so the machine could keep up. That changed quickly when digital assistants like Siri came along and the chat got more natural, but the machine still sounded artificial. This was because the assistant’s vocabulary was built by having a human record thousands of words and phrases, and it relied on that word bank to speak to you.

The assistant

Modern assistants like Alexa debuted when machines had become smart enough to phonetically cobble together words and sounds from a much smaller set of words spoken by a human, and so go beyond the recorded vocabulary. But at this year’s Google I/O conference, the company introduced Google Duplex, a system so advanced that it can mimic a human during a real-time telephone call with another human. Here’s how it works. You simply tell Google Assistant to book a haircut appointment at the nearby salon for, say, Sunday around noon. The Assistant then places an actual call to the salon, talks to the operator as if it were a real human, and books the appointment. Soon enough, you get a popup on your screen confirming the appointment.

That it can understand what you want and when you want it is not surprising. What is incredible is that it can actually hold a conversation with an unsuspecting human, negotiate an available time slot and end the call, all without the human realising that she was talking to a computer.

How it works

This astonishing feat is the result of Google’s efforts over the years in deep learning, natural language understanding, speech recognition, and text-to-speech. Google has long been able to transcribe your voice into text; that’s how voice dictation works. It has also been working on products like Google Translate, which can read text aloud should you choose. To a fair degree, it can also predict what you’re going to say next based on a phrase you’ve written; that’s how predictive text works on smartphone keyboards and in Gmail. However, no company has so far been able to bring all of this together so seamlessly. But that isn’t even the best part. While talking on the phone, the Assistant mimics the human tendency to say ‘umm’ and ‘aah’. Not only does this make it sound far more human, it also gives the computer that extra second to process.

The challenges

For Google to make such a complex system work, it first needs to concentrate on specific domains, like booking a table at a restaurant. The company states that the system isn’t at a stage where it could hold a conversation on any subject and still pass as a human. So, it has concentrated on getting the basics right: does “We’re okay for 4” mean that the restaurant is happy to book a table for 4 people, or at 4 pm? Add to that a patchy cellular connection, background noise, different accents and a busy operator, and you have the perfect recipe for disaster. But in the demos that were shown, the calls went through flawlessly.
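
The “We’re okay for 4” ambiguity can be illustrated with a toy sketch. This is not how Duplex works internally, which the article does not describe; it only assumes that the caller remembers which question it asked last. The names `Interpretation` and `interpret_number_reply` are hypothetical, made up for this illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interpretation:
    slot: str   # either "time" or "party_size"
    value: str  # the number as it was spoken, e.g. "4"


def interpret_number_reply(reply: str, last_question: str) -> Optional[Interpretation]:
    """Guess what a number in `reply` refers to (toy heuristic, an assumption)."""
    lowered = reply.lower()

    # An explicit cue inside the reply wins over conversational context.
    time_cue = re.search(r"\b(\d{1,2})\s*(?:am|pm|o'clock)\b", lowered)
    if time_cue:
        return Interpretation("time", time_cue.group(1))

    size_cue = re.search(r"\b(\d{1,2})\s*(?:people|guests|persons)\b", lowered)
    if size_cue:
        return Interpretation("party_size", size_cue.group(1))

    # With no cue, fall back on the question the caller asked last.
    bare = re.search(r"\b(\d{1,2})\b", lowered)
    if bare is None or last_question not in ("time", "party_size"):
        return None  # still ambiguous: a real system would ask again
    return Interpretation(last_question, bare.group(1))


# The same words resolve differently depending on what was just asked.
after_time_question = interpret_number_reply("We're okay for 4", "time")
after_size_question = interpret_number_reply("We're okay for 4", "party_size")
```

Here the identical reply is read as a time in the first call and as a party size in the second, which is the kind of context dependence the paragraph above describes.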

The system isn’t perfect yet, and Google will only begin testing it in a few areas this summer, but it is a fascinating look at how far technology has progressed.

Serious concerns

  • Is it ethically right for a human to be fooled into believing that he is talking to another person? 
  • In a world where children are exposed to digital assistants from a young age, with the proliferation of smart speakers and such, will this teach them to be rude? Machines don’t get affronted or angry, nor do they complain about abuse. So, will the children of tomorrow think it is okay to talk like that?

