RE: Auditory interface ideas, what would help?

  • From: Dónal Fitzpatrick <dfitzpat@xxxxxxxxxxxxxxxx>
  • To: <programmingblind@xxxxxxxxxxxxx>
  • Date: Sat, 29 Aug 2009 11:00:15 +0100

Wow this is a truly fascinating thread.  Ok to those of you not particularly
interested, you might want to skip this mail because, in the true traditions
of academia, it will wander and witter along a bit.

Let's just take a step back for a second and consider the nature of visual
reading, and auditory perusal of structured information.  The visual sense
is what I'd term a parallel, or inclusive sense while the auditory is a
serial, temporal view of the information being examined.  Ok, I know I
probably don't need to mention this, but it will serve as a starting point.  My
own thoughts on this are that what is needed in any kind of auditory
interface is some way of filtering things out.  I've long held the view that
too many auditory interfaces try to replicate the visual domain, rather than
putting a slightly different spin on it.

Let me give an example to illustrate what I mean.  Let's assume you're in a
supermarket, and going down the biscuit (sorry, cookie, for the Americans
among you) aisle.  Now if you're using the visual sense to examine the
shelves, you can sweep over the various brands very quickly to find the one
you're looking for.  If, on the other hand, you are blind and are being
guided by a shop employee, you have several alternatives, two of which are:
1.  you ask them to read absolutely every brand on the shelves (which takes
an eternity) or
2.  You filter, through dialogue or possibly their knowledge of your
previous purchases, what you want.  You still get what you want (hopefully!)
but it takes less time.

So where am I taking this?  Let's look at the constituents of programming
languages; in this instance specifically Java.  There are certain constructs
which _should_ be present in any reasonably sized program.  These
constructs are scope-based.  For example, you have classes, which contain
methods, which contain control-flow statements.  The scopes can also contain
variable declarations, assignments and the like.  I think we need to
separate out the two constituents of the auditory interface - namely the
navigation of the content, and its auditory display.

So what is needed in terms of the navigation, is the ability to move from
element to element within a certain scope, or indeed between related code
elements.  The analogy I'd use is very much that which has been devised by
Apple in their VoiceOver screen reader - where one interacts with certain
screen elements in order to "zoom in".  In their new Snow Leopard, they have
also introduced rotors which facilitate (they claim; I've not used SL yet so
can't comment) very rapid navigation of web pages.  From what I can glean,
the rotors can switch web page navigation between, say, form elements,
visited links, headings or lists.  I believe they can also be highly
customised.  My thought is that a simple rotor like this could be applied
very easily to the domain of auditory navigation of the program.  You could,
for example, use the rotor to say "ok I want to jump from method to method".
You hit the method you wish, then "interact" with it.  The system could then
ascertain, either through user interaction or indeed through intelligent
algorithms, which code elements should be navigated next.  The same
key press could then move you, say, from declaration to declaration, or
indeed from variable declaration to its next occurrence in the code.  All
that would have to be done is to change the rotor which controls the jumping
mechanism.  I probably haven't explained this well, but for a far better
explanation take a look at Apple's overview of VoiceOver in Snow Leopard.
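
To make the rotor idea a little more concrete, here is a minimal sketch in
Python.  It is only an illustration under stated assumptions, not how
VoiceOver or any existing tool works: the Element and Rotor classes, the
kind names ("method", "declaration", "loop") and the tiny hand-built code
tree are all hypothetical, and a real tool would get its elements from a
parser.  One "next" key moves between elements of whichever kind the rotor
is set to, and "interact" zooms into the element you have landed on.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """A code element; kinds and fields are assumptions for illustration."""
    kind: str                      # e.g. "class", "method", "declaration"
    name: str
    line: int
    children: List["Element"] = field(default_factory=list)


class Rotor:
    """Hypothetical rotor: one 'next' key whose jump target can be switched."""

    def __init__(self, root: Element, kinds: List[str]):
        self.scope = root                      # element currently zoomed into
        self.stack: List[Element] = []         # enclosing scopes, for zooming out
        self.kinds = kinds
        self.setting = 0                       # index of the active kind
        self.position: Optional[Element] = None

    def turn(self) -> str:
        """Switch the rotor to the next kind and return its name."""
        self.setting = (self.setting + 1) % len(self.kinds)
        return self.kinds[self.setting]

    def next(self) -> Optional[Element]:
        """Jump to the next element of the active kind in the current scope."""
        kind = self.kinds[self.setting]
        line = self.position.line if self.position else self.scope.line
        for element in self.scope.children:
            if element.kind == kind and element.line > line:
                self.position = element
                return element
        return None                            # nothing further in this scope

    def interact(self) -> None:
        """Zoom into the element we are on, making it the current scope."""
        if self.position is not None:
            self.stack.append(self.scope)
            self.scope, self.position = self.position, None

    def stop_interacting(self) -> None:
        """Zoom back out to the enclosing scope."""
        if self.stack:
            self.position, self.scope = self.scope, self.stack.pop()


# A tiny hand-built tree standing in for a parsed Java class.
compute = Element("method", "compute", 3, [
    Element("declaration", "int total", 4),
    Element("loop", "for", 5),
    Element("declaration", "int i", 6),
])
main = Element("method", "main", 10, [Element("declaration", "String[] names", 11)])
account = Element("class", "Account", 1,
                  [Element("declaration", "int balance", 2), compute, main])

rotor = Rotor(account, ["method", "declaration", "loop"])
visited = [rotor.next().name]        # rotor on "method": first method in the class
rotor.interact()                     # zoom into that method
rotor.turn()                         # rotor now on "declaration"
visited.append(rotor.next().name)    # same key, now moving between declarations
visited.append(rotor.next().name)
print(visited)
```

The point of the design is that the key binding never changes; only the
rotor setting and the current scope decide where the jump lands.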

Ok now to the display.  Again we must go back to how sighted people can
determine errors.  Andreas is absolutely correct when he says that auditory
cues should be used in a very carefully thought-out manner.  Badly placed
or badly implemented audio cues can not only mask other content, but can
significantly add to the cognitive load (which is already higher) of the
person using the auditory domain to browse the information.  So what I'd
consider looking at here is "externalising" the sound source.  One can argue
and argue as to what constitutes an appropriate sound to indicate errors.
For person X, it will be the sound of a dissonant chord; while for person Y
it will be the word "error" spoken in a lower pitch.  There has been some
interesting work done in the domain of "spearcons" (I may have spelled that
wrong, sorry).  In essence, these are akin to earcons as defined by Blattner
years ago, but are based on very, very rapid speech.  We're actually playing
with them at the minute in our work on the auditory display of mathematics,
and they look useful.  If some kind of HRTF processing were applied to the
sound to make it seem like it was coming from outside the person's head, it
might indicate that the system was telling the user that there was an error.
Localising such externalised sound to a point directly in front of the user
can be tricky, but even having it slightly off-centre might do the trick.

Ok I'm afraid that did go on a bit, and most of it was probably incoherent
ramblings.  However it is Saturday morning and I need a nice strong cup of
tea - just like Arthur Dent.

Good luck with the interface; it will be fascinating to see the results.


-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Andreas Stefik
Sent: 28 August 2009 22:34
To: programmingblind@xxxxxxxxxxxxx
Subject: Re: Auditory interface ideas, what would help?

When a sighted individual navigates code, they most often navigate it by
scrolling, often very quickly, up and down the source looking for something
they are interested in.

Right now, we're working to build some tools that we hope will make it
easier to scan the code looking for items of interest only using audio.
While failure is always an option, I'm really hoping we can make scanning
just as fast for the blind. The most obvious example I can think of is a
"navigator window" that jumps to the beginning of a method. This solution,
while fine for the sighted, requires one to change focus to a new window,
find what you want by browsing (not searching), then type a key to jump
focus back and find what you want.

Here are a couple of possible ideas. None of them is perfect, just brainstorms:

1. Press a key combination to jump to the "next point of interest." This
might be the end of the current scope, the beginning of the next one, or
whatever. A cue would indicate where you jumped.

2. Have a series of hotkeys that jumps you to various places, like the
"next" or "previous" method, the end or beginning of a loop, if, or other
construct. Requiring someone to remember lots of hotkeys seems like a bad
idea to me, but it's just a thought ...
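
As a rough illustration of the first idea, the sketch below treats every
line that opens or closes a scope as a "point of interest" and returns a
cue along with the line jumped to.  This is only a sketch under loud
assumptions: the function names and cue strings are made up for the
example, and scopes are detected by looking for braces, which ignores
braces inside strings and comments.  A real tool would ask a parser.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

Point = Tuple[int, str]  # (line number, cue to play or speak)


def points_of_interest(source: str) -> List[Point]:
    """Collect lines that open or close a scope.

    Assumption: braces mark scopes and never appear in strings or comments.
    """
    points = []
    for number, text in enumerate(source.splitlines(), start=1):
        if "{" in text:
            points.append((number, "scope start"))
        elif "}" in text:
            points.append((number, "scope end"))
    return points


def jump_next(points: List[Point], current_line: int) -> Optional[Point]:
    """Return the first point of interest after current_line, if any."""
    lines = [line for line, _ in points]
    index = bisect_right(lines, current_line)
    return points[index] if index < len(points) else None


SOURCE = "\n".join([
    "class Counter {",
    "    int count;",
    "    void add(int n) {",
    "        for (int i = 0; i < n; i++) {",
    "            count++;",
    "        }",
    "    }",
    "}",
])

points = points_of_interest(SOURCE)
print(jump_next(points, 1))   # from the class header to the next scope
print(jump_next(points, 4))   # from the loop header to where it closes
```

The cue is returned next to the line number so that whatever plays the
sound can tell the listener what kind of place they have landed on.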

So yaa, that's two ideas. I know Sina has told me in the past that
navigation amongst various files can be excruciating. Ideas related to that
would be good as well. Search can obviously help, but we want an improved
"browsing" experience as well.

Hope that helps give you an idea of what I mean. Really, we're open to
pretty much any wacky idea people can come up with, that folks think might
help everyone program more effectively. 

Andreas Stefik, Ph.D.
Department of Computer Science
Southern Illinois University Edwardsville
