An Intelligent Personal Assistant (IPA) is a software agent that can perform tasks and provide information, with or without an internet connection. IPA technologies build on the combination of mobile devices, application programming interfaces (APIs), and the proliferation of mobile apps. An IPA can draw on a variety of online sources (such as weather conditions, traffic congestion, news, stock prices, user schedules, and retail prices) and can also carry out offline tasks (such as setting an alarm, setting a reminder, or scheduling a meeting).

According to a study conducted by comScore, 200 billion searches per month will be done by voice by 2020, potentially creating a market opportunity of more than $50 billion per year around voice search. As demand for voice-based commands both intensifies and diversifies, human-like language understanding has become the primary focus of the leading IPA developers.

One advantage of an IPA is that the user speaks naturally instead of using memorized commands. The IPA uses natural language to speak back to the user, and even has a bit of personality and humor. The user does not need to point and click; a simple voice-activated command such as ‘Hey Siri!’ or ‘Ok Google’ is enough.

Intelligent personal assistants are the future for the new generation, and a new source of revenue for the biggest technology companies such as Apple, Amazon, Facebook, Google and Samsung. Soon, IPAs will be used for purposes such as:

Voice will be the primary interface for the connected home, providing a natural means to communicate with kitchen appliances, alarm systems, sound systems, lights and more, as users go about their day-to-day lives.
More and more cars on the market will adopt intelligent, voice-driven systems for entertainment and location-based searching, so that drivers and passengers can keep their eyes and hands free.
Voice-controlled devices will also dominate workplaces that require hands-free mobility, such as hospitals, warehouses, laboratories and production plants.

Introduction

An Intelligent Personal Assistant (IPA) is software that can run on any device and helps the user access information by voice and carry out everyday tasks such as sending messages, scheduling meetings, and finding places or information on the web. An IPA answers queries and performs actions via voice commands using a natural language user interface. It can organize events and maintain information, and it can deliver accurate and relevant information.

The combination of artificial intelligence, machine learning, and voice recognition technology has resulted in IPAs that understand the user’s input and become better at predicting the user’s needs as the interaction progresses.

Several assistants are available as smartphone applications and may also be enabled for Internet of Things (IoT) integration. IPA applications are interactive and can also perform concierge-type tasks (e.g., making dinner reservations, purchasing event tickets, making travel arrangements). Companies like Apple, Google, Microsoft, Amazon, Samsung, and LG now have their own IPAs embedded in their devices. Google and Microsoft have also made their IPAs available on the iOS App Store as stand-alone applications.

Technology Overview

An IPA can perform multiple tasks and reduce human effort, and innovations in this technology arrive almost daily. The latest updates to these IPAs also allow access to third-party applications such as WhatsApp, Uber, Lyft, and music players.

1. Siri (Apple)

Siri is an intelligent personal assistant from Apple that comes embedded in Apple’s iOS devices. Siri can send messages, schedule meetings, find places, browse information and make phone calls.

Originally, Siri was introduced by Siri Inc. as an iOS application available in Apple’s App Store. Siri Inc. also announced that its software would be made available for BlackBerry and Android. In 2010, Siri Inc. was acquired by Apple, and since then Siri has been available exclusively on Apple devices.

Siri formally launched on October 4, 2011 as a marquee feature of the iPhone 4S. Siri was added to the third-generation iPad with the release of iOS 6 in September 2012, and it has been included on all iOS devices released since October 2012.

To use it, press and hold the home button or say ‘Hey Siri’ and wait for Siri to appear. The user can then talk to Siri rather than typing into a box. Siri can help in getting things done, like sending messages, placing calls, and making dinner reservations. Siri typically responds quickly. Users can also choose the gender of Siri’s voice.

Patent US 8,086,604, titled ‘Universal interface for retrieval of information in a computer system’, describes a computer-human interface for quickly and easily retrieving desired information in a computer system. It uses a plurality of heuristic algorithms (the kind of approach needed to build an application like Siri) to identify an item of information (e.g., a document, application or Internet web page) in response to at least one information descriptor, entered either by voice through the microphone or manually through the keyboard and displayed in a text box.

Apple’s patent U.S. 9,318,108, titled ‘Intelligent automated assistant’ and granted on April 19, 2016, describes an automated assistant that receives user input conversationally in natural language, together with an output processor component that displays output based on the input received from the user.

Another Apple patent, U.S. 9,300,784, titled ‘System and method for emergency calls initiated by voice command’, explains a method for a digital assistant to provide emergency call functionality on receiving a speech input from a user. The device includes a processor and memory storing instructions for execution, and it determines a local emergency dispatcher telephone number based on the geographic location of the device.
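The idea of picking a dispatcher number from the device’s location can be illustrated with a minimal sketch. This is not Apple’s implementation: the phrase list, the function names, the use of a country code as a stand-in for geographic location, and the fallback number are all assumptions made for illustration.

```python
# Illustrative table of well-known emergency numbers keyed by
# ISO 3166-1 alpha-2 country code. A real device would rely on carrier
# and platform data, not a hard-coded table.
EMERGENCY_NUMBERS = {
    "US": "911",
    "GB": "999",
    "AU": "000",
    "DE": "112",
    "FR": "112",
}

# Assumption: fall back to 112 when the country is not in the table.
DEFAULT_NUMBER = "112"

# Hypothetical trigger phrases; a real assistant would use a trained
# intent classifier instead of substring matching.
EMERGENCY_PHRASES = ("call emergency", "call an ambulance", "call the police")


def is_emergency_request(utterance: str) -> bool:
    """Return True if the utterance contains one of the trigger phrases."""
    text = utterance.lower()
    return any(phrase in text for phrase in EMERGENCY_PHRASES)


def emergency_number_for(country_code: str) -> str:
    """Look up the dispatcher number for a country code (case-insensitive)."""
    return EMERGENCY_NUMBERS.get(country_code.upper(), DEFAULT_NUMBER)


def handle_utterance(utterance: str, country_code: str):
    """Return the number to dial, or None if this is not an emergency request."""
    if not is_emergency_request(utterance):
        return None
    return emergency_number_for(country_code)
```

For example, handle_utterance("Please call an ambulance", "gb") returns "999", while an unrelated request such as "play some music" returns None.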

To illustrate an application of Siri, Apple’s publication U.S. 20130328667, titled ‘Remote interaction with Siri’, describes an electronic device that receives a command from another electronic device over a connection, using an interface circuit on the device, passes the command from the interface circuit to a program module, and executes it. In simple words, a user can interact with home appliances with the help of Siri.

With iOS 10, Siri started supporting third-party apps, ending the frustration of users who could previously use Siri only with Apple’s own apps. It can now play Apple Music as well as songs from Spotify. Likewise, it can now be used to post on Twitter, which was not possible earlier.

2. Google Assistant (Alphabet)

Google Now is an intelligent personal assistant developed by Google. Just like Siri, Google Now can answer queries using a natural language user interface. Google Now includes ‘Now cards’, Voice Search and other commands. Google launched Google Now in 2012 by adding it to Android phones, clearly in response to Apple’s Siri. To access it, the user says ‘Ok Google!’ and the voice assistant pops up.

Google Assistant is an intelligent personal assistant that works like a user’s personal Google and can engage in two-way conversation. In May 2016, Google boosted its artificial intelligence offering further with Google Home, an Amazon Echo competitor, and a new messaging app called Allo, both featuring a voice-commanded helper known simply as Google Assistant.

Google Now and Google Assistant are still two separate things, although Google Assistant can be seen as an advanced version of Google Now. Google Now works from within the Google app on Android and iOS, while Google Assistant is found in Google’s Allo chat app and is integrated into Google Home and Google’s Pixel phones. Pixel is the first phone with Google Assistant built in. Google Assistant is Google’s artificial intelligence agent: it responds to spoken natural language and text queries to complete tasks on the phone within both native and third-party apps, and returns information that would normally require a manual internet search.

Google holds a good number of patents that explain technical aspects required by Google Assistant.

U.S. Patent 8,538,984, titled ‘Synonym identification based on co-occurring terms’, explains a computer-implemented method that first identifies a query term in an original query, then identifies a candidate synonym for that term (a relevant alternative wording) and, based on it, determines a confidence value (a score for how well the candidate fits the query).
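To make the notion of a co-occurrence-based confidence value concrete, here is a small sketch. It is not the patent’s formula: it assumes a toy query log and scores a candidate by the Jaccard overlap between the sets of terms that co-occur with the query term and with the candidate. The names QUERY_LOG, cooccurrence_contexts and synonym_confidence are hypothetical.

```python
from collections import defaultdict

# Hypothetical query log used only for illustration.
QUERY_LOG = [
    "cheap car rental",
    "cheap auto rental",
    "car insurance quote",
    "auto insurance quote",
    "car wash nearby",
]


def cooccurrence_contexts(queries):
    """Map each term to the set of other terms it appears with in a query."""
    contexts = defaultdict(set)
    for query in queries:
        terms = query.lower().split()
        for term in terms:
            contexts[term].update(t for t in terms if t != term)
    return contexts


def synonym_confidence(term, candidate, contexts):
    """Score a candidate synonym by the Jaccard overlap of co-occurring terms.

    Returns a value between 0.0 (no shared context) and 1.0 (identical context).
    """
    term_ctx = contexts.get(term, set())
    cand_ctx = contexts.get(candidate, set())
    if not term_ctx or not cand_ctx:
        return 0.0
    return len(term_ctx & cand_ctx) / len(term_ctx | cand_ctx)


contexts = cooccurrence_contexts(QUERY_LOG)
auto_score = synonym_confidence("car", "auto", contexts)
wash_score = synonym_confidence("car", "wash", contexts)
```

With this log, ‘auto’ shares most of its context with ‘car’ and scores much higher than ‘wash’, which merely appears next to ‘car’ in one query.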

Google, in a patent application published on December 1, 2016, explains how a personal assistant searches and provides information using enterprise content. U.S. publication No. 20160350134, titled ‘Personal assistant providing predictive intelligence using enterprise content’, describes technologies relating to access control for enterprise information, personal assistance based on enterprise information and personal information, and searches associated with the enterprise information.

Another patent, U.S. 8,271,413, titled ‘Providing digital content based on expected user behavior’, describes a step-by-step method for providing modified information to the user at a time related to an event. First, information is obtained. Next, a time-dependent event is identified and correlated with an interest observed from the user, and the information that pertains to the event is selected based on that observed interest and the user’s geographic location. Finally, the selected information is modified and provided to the user.

3. Cortana (Microsoft)

Cortana is an IPA and knowledge navigator from Microsoft. It can perform functions such as setting reminders and recognizing natural speech without requiring keyboard input. Cortana builds on voice technology from Tellme, a company Microsoft acquired in 2007. Microsoft first demonstrated Cortana in April 2014 at its Build developer conference in San Francisco. By default, Cortana uses Bing (also from Microsoft) as its search engine, and it depends on the underlying software platform.

Cortana takes over the search part of Windows 10. Users can long-press the search button to make Cortana start listening, or tap it quickly to type. Users can also make calls, send text messages, search the web, schedule calendar events, and ask for the weather and for information about places to visit. It is available in multiple countries and languages.

In January 2015, Microsoft announced that Cortana would also be available in Windows 10 on both mobile phones and desktops.

In July 2015, Microsoft announced that Cortana would also be available for Android phones. By December 2015, the Android version had been officially released, along with an iOS version.

Microsoft’s publication U.S. 20140059030, titled ‘Translating Natural Language Utterances to Keyword Search Queries’, represents the working part of Cortana: it explains that a user can speak in natural language, make multiple queries and obtain relevant results.
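A rough feel for turning a spoken sentence into a keyword query can be had from the sketch below. It uses plain stop-word filtering as a stand-in; the patent publication does not say this is how Cortana works, and the STOPWORDS set and the function name are assumptions made for the example.

```python
import re

# Hypothetical stop-word list; a production system would use a much richer
# language model than simple word filtering.
STOPWORDS = {
    "a", "an", "the", "is", "are", "what", "where", "can", "you", "me",
    "find", "show", "please", "in", "for", "of", "to", "i", "some", "near",
}


def utterance_to_keywords(utterance: str) -> str:
    """Reduce a natural language utterance to a space-separated keyword query.

    Lowercases the text, drops stop words and repeated words, and keeps the
    remaining words in their original order.
    """
    tokens = re.findall(r"[a-z0-9']+", utterance.lower())
    seen = set()
    keywords = []
    for token in tokens:
        if token in STOPWORDS or token in seen:
            continue
        seen.add(token)
        keywords.append(token)
    return " ".join(keywords)
```

Calling utterance_to_keywords("Can you find me some Italian restaurants in Seattle?") yields "italian restaurants seattle", which could then be passed to an ordinary keyword search engine.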

Another Microsoft publication, U.S. 20140201629, titled ‘Collaborative learning through user generated knowledge’, explains that a combination of natural language dialog and other non-verbal modalities of expressing intent (gestures, touch, gaze, images/videos, spoken prosody, etc.) may be used to interact with the personal assistant. The knowledge obtained from different personal assistants is combined to form a collective intelligence, which is then transferred back to each individual assistant. In this way, the knowledge of one personal assistant benefits the others through a feedback loop, in which a central knowledge manager obtains information from different users and delivers learned information to other users.

Another patent publication, U.S. 20150382147, titled ‘Leveraging user signals for improved interactions with digital personal assistant’, explains how the assistant decides when to interact with the user. Two signals are used: one for the availability of the user and one for the user’s mental or emotional state. Based on both signals, the assistant determines whether a particular time is appropriate for attempting to initiate a conversation with the user.
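The two-signal decision can be sketched in a few lines. This is only an illustration under stated assumptions: the UserSignals type, the mood labels and the rule that both signals must be favorable are hypothetical and are not taken from the publication.

```python
from dataclasses import dataclass


@dataclass
class UserSignals:
    available: bool  # e.g. not in a meeting and not driving
    mood: str        # hypothetical labels such as "calm", "happy", "stressed"


# Assumption: moods in which the user is likely to welcome a prompt.
RECEPTIVE_MOODS = {"calm", "happy"}


def should_initiate_conversation(signals: UserSignals) -> bool:
    """Start a conversation only if the user is available and in a receptive mood."""
    return signals.available and signals.mood in RECEPTIVE_MOODS
```

Under this rule, an available but stressed user is left alone, as is a calm user who is busy.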

Cortana Intelligence analyzes data and then delivers a relevant solution in the form of an action.

4. Alexa (Amazon)

Alexa is an intelligent personal assistant from Amazon, best known through the Amazon Echo device, that can interact with many third-party apps. This is a significant advantage, as even Siri got this feature only with iOS 10. Alexa works well with third-party apps, supporting roughly 130 applications. Users can request a car from Uber, play music on Spotify and even buy things online. Amazon’s Lab126 introduced Alexa, along with the Amazon Echo, in November 2014. Alexa is the name Amazon gave the digital assistant living inside its Echo home device, which started selling widely in June 2015.

To address Alexa, the user says one of the wake words ‘Alexa’, ‘Echo’, or ‘Amazon’. Alexa is tailor-made for powering a smart home. It is well suited to controlling home devices and buying things online, for example adding items to your Amazon shopping cart or ordering your weekly supermarket shopping list when you run out of food.

Using Alexa, the user can order a pizza, play a game, or arrange an Uber pickup. Echo has an ever-growing list of thousands of skills. The Echo device is one of the top-performing products for smart homes. It works well with third-party applications, and the Amazon Echo’s speaker and microphone technology is exceptional.

The role of Alexa is to bring voice functionality to the user. On mobile devices such as Amazon’s Kindle Fire tablet, Alexa works much like other IPAs. Rather than pressing a voice command button as on the Echo, the user long-presses the home button and the assistant pops up. The user can then ask Alexa a question; the request is handled by natural language processing algorithms and the results are returned quickly.

Amazon Technologies, Inc., through a patent granted on June 18, 2013, explains how a method for determining an endpoint during automatic speech recognition (ASR) is processed. U.S. Patent No. 9,324,322, titled ‘Automatic volume attenuation for speech enabled devices’, claims a method of modifying the operation of a device, which involves generating audio output, receiving audio input through a microphone, performing echo cancellation on the audio input, and determining that the audio input signal includes a sound that is not speech directed at the device. This technology is designed so that speech-enabled devices can respond intelligently to background noise, like a doorbell or ringing phone, in a way that naturally improves the listening experience by adjusting the volume or stopping playback without requiring manual input from users.
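As a simplified illustration of reacting to a background sound, the sketch below lowers the playback volume when a non-speech event such as a doorbell is detected, following the ‘attenuation’ in the patent title. The event labels, the attenuation factor and the function name are assumptions; the sound classifier itself is out of scope and is represented only by its output label.

```python
# Hypothetical labels that an upstream sound classifier might emit for
# sounds that are not speech directed at the device.
BACKGROUND_EVENTS = {"doorbell", "phone_ring"}


def adjust_playback(volume: float, event: str, factor: float = 0.3) -> float:
    """Return the playback volume to use after a detected sound event.

    volume is a level between 0.0 and 1.0. When the event is a known
    background sound, the volume is scaled down by `factor`; otherwise it
    is returned unchanged.
    """
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be between 0.0 and 1.0")
    if event in BACKGROUND_EVENTS:
        return round(volume * factor, 3)
    return volume
```

With the default factor, a doorbell heard while playing at 0.8 drops the volume to 0.24, so the user can notice the doorbell without touching the device.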

Another Amazon patent, US 9,351,073, titled ‘Enhanced stereo playback’, discloses a computer-implemented method that involves receiving position information for a listener, generating left and right cancellation signals based on the position information and on orientation information for the device, and modifying the left and right output audio signals with the cancellation signals. This innovation is designed to reduce sound distortion based on a listener’s location, right down to the person’s head position, to enhance stereo playback while the listener changes position.

5. S Voice (Samsung)

S Voice is an intelligent personal assistant for Samsung mobile phones, capable of running a large number of tasks through voice commands alone to save time and effort. From basics such as telling the time, providing weather updates or opening applications, to navigating to a desired location, adding an appointment to your schedule or updating social media accounts such as Facebook or Twitter, S Voice keeps your hands free to do other things.

On May 30, 2012, Samsung released S Voice, which helped Samsung grow in the field of mobile applications. S Voice also offered multitasking and automatic activation features. It remains a distant competitor to Apple’s Siri and Google Assistant, which compare favorably in response time, efficiency, and how quickly they understand what the user wants.

On October 5, 2016, Samsung Electronics acquired a company called Viv, and it was revealed that Viv would be available on the Samsung Galaxy S8. In November 2016, Samsung began developing a new voice assistant based on the technology gained through the acquisition. This assistant is reported to have two voices, Bixby (male) and Kestra (female), and is slated to replace S Voice and to be available on the Samsung Galaxy S8 by April 2017. A dedicated Bixby button will also be available on the Samsung Galaxy S8 near the power button.

Patent publication U.S. 20160284351, titled ‘Method and electronic device for providing content’, describes a method and an electronic device that receives a voice input, with a memory configured to store a voice recognition application and a processor configured to execute that application and determine an output for providing content in response to the voice input.

6. Voice Mate (LG)

Voice Mate is an IPA designed to work with LG devices. It allows the user to search for programs and content and to control an LG smartphone by voice. Voice Mate was introduced on the LG G2 and matured with the G3. Using this application, the user can open apps, turn off services, call contacts, search, add calendar events, and much more.

Patent publication U.S. 9460717, titled ‘Mobile terminal and controlling method thereof using voice recognition’, describes a controller configured to activate a voice recognition mode on the mobile terminal for receiving voice input via the microphone, and to execute a particular function related to the received voice input. If the voice recognition mode is interrupted and the microphone is deactivated while the function is being executed, the controller determines whether the function is in a complete or an incomplete state. If it is incomplete, the terminal displays an object indicating that the function was left incomplete by the deactivation of the microphone, and reactivates the microphone to receive additional voice input and complete the function when the user selects that object.
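The complete/incomplete handling described above can be modeled as a small state machine. The sketch below is an illustration only, not LG’s implementation: the class names, the idea of ‘slots’ that a task needs before it is complete, and the method names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceTask:
    """A function being executed by voice, e.g. sending a message."""
    name: str
    required_slots: tuple  # pieces of information the task still needs
    filled: dict = field(default_factory=dict)

    def is_complete(self) -> bool:
        return all(slot in self.filled for slot in self.required_slots)


class VoiceController:
    def __init__(self):
        self.mic_active = False
        self.resume_prompt_visible = False  # the on-screen "object"
        self.task = None

    def start(self, task: VoiceTask) -> None:
        """Enter voice recognition mode for a new task."""
        self.task = task
        self.mic_active = True
        self.resume_prompt_visible = False

    def hear(self, slot: str, value: str) -> None:
        """Record one piece of voice input for the current task."""
        if not self.mic_active or self.task is None:
            raise RuntimeError("microphone is not active")
        self.task.filled[slot] = value

    def interrupt(self) -> None:
        """Deactivate the microphone; show the prompt only if the task is incomplete."""
        self.mic_active = False
        self.resume_prompt_visible = (
            self.task is not None and not self.task.is_complete()
        )

    def select_resume_prompt(self) -> None:
        """User taps the prompt: reactivate the microphone to finish the task."""
        if self.resume_prompt_visible:
            self.mic_active = True
            self.resume_prompt_visible = False
```

In a typical run, a ‘send message’ task that has a recipient but no body when the microphone is interrupted leaves the prompt visible; selecting it reactivates the microphone so the body can be dictated.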

Launch Timeline for these IPAs

Comparison Card

Siri, Cortana, Google Assistant, Alexa, S Voice and Voice Mate are becoming our everyday go-to assistants. Through voice interactions, they allow us to place phone calls, send messages, check emails, schedule appointments, navigate to destinations, control smart appliances, and perform banking services. We’ve compared all six digital assistants:

Tools \ Features	Siri	Google Assistant	Cortana	Alexa	S Voice	Voice Mate
Release Date	October 4, 2011	May 18, 2016	April 2, 2014	November 20, 2014	May 30, 2012	June 20, 2012
OS	iOS, watchOS, tvOS, macOS	Android 4.1+ (“Jelly Bean”), iOS 6.0+ and Chrome OS; limited functionality on Microsoft Windows, OS X, Linux	Windows, iOS, Android, Xbox OS	iOS 8.0 or later and Android 4.4 or later	Android 4.0 or later	Android 4.4 or later
Platform	iPhone, iPad, iPod Touch, Apple Watch, Apple TV, Mac	Google Cloud Platform, Android, iPhones and PCs	Windows 10, Windows 10 Mobile, Windows Phone 8.1, Microsoft Band 2, Microsoft Band, Android, Xbox One, Skype, iOS	Android and iOS	Android	Android
Method	Heuristic Algorithm	Natural Language Processing Algorithm	Subgroup Discovery Algorithm	Natural Language Processing Algorithms	Wolfram Alpha Algorithm	Jack Detection Algorithm
Two-Way Communication	Yes	Yes	Yes	Yes	No	No
Uses	Send messages, schedule meetings, find places, browse information and make phone calls	Send messages, schedule meetings, find places, browse information and make phone calls	Order food, book trips, transcribe video messages and make calendar appointments	Control home appliances, control music, manage alarms, and view shopping lists	Send messages, set a wake-up alarm, play music, web search (Samsung devices only)	Search for programs and content and control the LG Smart TV by voice (LG devices only)
Limitations	Only available on Apple devices; can interact with only a few third-party apps	Not available on Windows Phone	Only one person can train ‘Hey Cortana!’ at a time; when triggered from the lock screen, it will not perform all possible commands	Amazon Echo runs on the Bing search engine; it cannot detect human emotions	Only built into Samsung products such as the Samsung Galaxy S4 to S7	Only built into LG products

Conclusion

Advancements in the Internet of Things (IoT), Artificial Intelligence (AI), and robotics mean that new transfers of knowledge and skills will increasingly happen machine to machine (M2M) rather than through humans.

M2M refers to any technology that enables networked devices to exchange information and perform actions without human intervention, such as ordering an Uber through Amazon Echo or placing a grocery order by adding items to a shopping list.

With increasing dependence on voice search and organizational platforms, the rise of machine-to-machine communication has been prodigious, and it will have a serious impact on the future of home devices, payment and commerce.

Voice commands, natural language voice recognition and audio engagement will become the de facto way to interact with technology.



War of Artificially Intelligent Personal Assistants