[tabi] Google is preparing for screenless computers. This could be interesting from an accessibility perspective

  • From: K4NKZ Jim <k4nkz@xxxxxxxxxxx>
  • To: tabi <tabi@xxxxxxxxxxxxx>
  • Date: Thu, 15 Aug 2013 18:49:56 -0400

----- Original Message -----
Hi All,

For your information. Appended is today's article in Quartz.
This could be interesting from an accessibility perspective
as well.
Best wishes,
Peter Meijer
Seeing with Sound - The vOICe

Google is preparing for screenless computers.
By Christopher Mims.
The spread of computing to every corner of our physical world doesn't just mean a proliferation of screens large and small-it also means we'll soon come to rely on mobile computers with no screens at all. "It's now so inexpensive to have a powerful computing device in my car or lapel, that if you think about form factors, they won't all have keyboards or screens," says Scott Huffman, head of the Conversation Search group at Google.

Google is already moving rapidly to enable voice commands in all of its products. On mobile phones, Google Now for Android and Google's search app on the iPhone allow users to search the web via voice, or carry out other basic functions like sending emails. Similarly, Google Glass would be almost unusable without voice interaction. At Google's conference for developers, it unveiled voice control for its Chrome web browser. And Motorola's new Moto X phone has a specialized microchip that allows the phone to listen at all times, even when it's asleep, for the magic word that begins every voice conversation
with a Google product: "OK."

There's nothing new about voice interaction with computers per se. What's
different about Google's work on the technology is that the company wants to
make it as fluid and easy as keyboards and touch screens are now. That's a
challenge big enough that, thus far, it has kept voice-based interfaces from
going mainstream in our personal computing devices. And in cases when they are
in use, such as interactive voice response systems designed to handle
service calls, they can be frustrating.

Interacting with a computer like it's a friend

"What we're really trying to do is enable a new kind of interaction with
where it's more like how you interact with a normal person," says Huffman.
To illustrate, he picks up his smartphone and says "How far is it from here to
Hearst Castle?"

Normally, getting an answer to such a seemingly simple question would require
googling "Hearst Castle," clicking on a map, and typing in your own address.
Huffman's phone gets the answer right on the first try-a neat illustration of
how voice commands can save time and effort. In a way, it's part of the
progression of convenience in computer interfaces: 10 years ago writing an email
required walking over to a computer, five years ago we could whip out our
phones, and in the near future we'll simply start talking.

Leveraging what Google already knows about reality

To achieve this kind of apparent simplicity, the Conversation Search group has
to muster everything that Google already knows about the real world. That's
because, as anyone who has discovered that half the battle of learning a
language is absorbing the culture in which it's embedded can attest, the
meaning behind language is always dependent on context.

"One thing that really helps us is the base of all the core relevance and
ranking work that the Google search engine is famous for," says Huffman.
Part of that "relevance" is the Google Knowledge Graph, a database of people, places and
things that allows Google to know, for example, that when you ask it for
movies" you are probably asking for the films of Tom Cruise, rather than
movies" or any of a number of other possibilities.

Beating humans at understanding meaning

This context doesn't just make Google's voice interfaces usable-some day, it
could make them even better than humans. "Today, automatic speech recognition is
not as good as people, but our ambition is, we should be able to be better than
people," says Huffman. In order to achieve that, Google will leverage the
intimate knowledge it has of its users.

"In some sense Google has a lot of context that [a human transcriptionist]
doesn't have," says Huffman. "We know where you are based on your phone's
location and there is some context around what you've been talking about
Therefore that should help us understand what kinds of things you might be

Computers that talk back

The future of Google's voice interfaces isn't just accurate interpretation of
commands, but real interaction-hence the "conversation" part of Huffman's
Conversation Search group. One trick Google's voice interface can already do is
understand pronouns like he, she and it. "You can ask yourself why in
do things like pronouns exist-well, they exist because it lets us
faster than we do without them," says Huffman.

To demonstrate, Huffman follows up his question about how far it is to Hearst
Castle with the sentence "give me directions," which doesn't even include the
pronoun "it," but his phone begins rattling off directions in its tinny
computerized voice, anyway.

All of this is, of course, a demonstration laid out in advance for my benefit.
And like any other nascent technology it doesn't always work perfectly. At
points in Huffman's demo, his smartphone fails to understand the pronouns he's
using. One reason for that, he notes, is that Google's voice interface forgets
the subject of any conversation with it after a certain amount of time. Just as
in natural conversation, it has a limited attention span.

In conversation, a human being who has forgotten the referent for a pronoun like
"it" might ask his or her companion what he or she is talking about. Google's
conversation search can't do that yet, but his team is working on it, says
Huffman. Already, Google's regular search results perform a version of this
you clarify?" task by suggesting search terms and providing other
links at the top of search results. Eventually, Google's voice search will do
the same: "Did you mean the movies of Tom Cruise?" or, given your search
"were you referring to the movies of Penelope Cruz?"

Fundamentally re-thinking the nature of computer interfaces

At this point, voice commands are a little-used feature of most people's
everyday interactions with computers, if we're using them at all. Between the
present and a future in which we are reliably interacting with computers by
voice alone, there are a number of challenges, some of them fundamental to what
we think of as a computer interface.

One challenge to voice control is simply reliability and error correction. For
example, as Google Glass transcribes your words for an email, text or social
media update, you can actually see the ghostly words hovering in your field of
view, but how does an interface that relies solely on our ears accomplish the
same? Does it read our messages back to us?

Another issue is that current visual computer interfaces limit our options in
ways that can make them easier to use. For example, in graphical user interfaces,
we can find out what a program can do by clicking on all of its buttons and
looking under its menus. But commanding a computer by voice is more like the
model of interaction with a computer-the command line. It's a potentially
powerful interface-Huffman imagines a future in which we might even
with our computers via a verbal short-hand-but it would require that humans
learn a whole new way to control computers, and learn anew the capabilities
of all the software that might be used in this way.

This restaurant recommendation brought to you by vast, distributed neural networks

Ultimately, none of these issues may prove as insurmountable as the ones
that Google has already overcome by virtue of its enormous search database,
of the real world, cloud computing infrastructure and army of Ph.D.s who work on
voice recognition and natural language processing. Currently, the everyday
of understanding voice commands is carried out almost entirely in the cloud,
because processing human speech is difficult enough that even a
smartphone doesn't have the processing power to do it at a high enough level

That means voice commands issued to Google's hardware and software are
shot into the cloud and parsed into next steps, rather than being handled by the
device itself. "For speech recognition, it's a very data intensive thing," says
Huffman. "We use giant neural network things that are spread across many
servers." Which means that when we talk to our phones, there really is
listening to our every command-just not an intelligence we'd recognize as


Have A Nice Day, From, K4NKZ Jim B.D.T.B.

Check out the TABI resource web page at http://acorange.home.comcast.net/TABI
and please make suggestions for new material.

If you'd like to unsubscribe you can do so through the freelists.org web interface, or by 
sending an email to the address tabi-request@xxxxxxxxxxxxx with the word 
"unsubscribe" in the subject.
