Re: Auditory interface ideas, what would help?

  • From: Chris Hofstader <cdh@xxxxxxxxxxxxx>
  • To: programmingblind@xxxxxxxxxxxxx
  • Date: Sat, 29 Aug 2009 08:44:11 -0400

I don't think that auditory interfaces need to be linear. Assuming you have a Windows computer, go to audiogames.com and download David Greenwood's "Shades of Doom" (there's a free demo which, for this purpose, is all we need). Install it and start playing around with the game, which has no visual interface whatsoever.


The game uses 32 simultaneous audio tracks, and at any point the player may need to be paying attention to about half of them: 16 dimensions of information versus the one-dimensional paradigm virtually all screen readers use today. The whole process could become profoundly more interesting if we could find a way to give productivity tools a multi-dimensional interface, even without the prior knowledge of the situation that a game developer has in his toolkit.
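For a flavour of what multiple simultaneous tracks involve at the code level, here is a minimal Java sketch using the standard javax.sound.sampled API (the cue file names are invented for illustration). Each Clip gets its own line from the mixer, so the tracks play over one another rather than queueing:

    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.Clip;
    import java.io.File;

    // Layers several looping audio cues; each Clip plays on its own line,
    // so all the tracks sound at once.
    public class LayeredCues {
        public static void main(String[] args) throws Exception {
            String[] cueFiles = { "footsteps.wav", "ambience.wav", "radar.wav" };
            for (String name : cueFiles) {
                AudioInputStream in = AudioSystem.getAudioInputStream(new File(name));
                Clip clip = AudioSystem.getClip();
                clip.open(in);
                clip.loop(Clip.LOOP_CONTINUOUSLY);  // keep each layer running
            }
            Thread.sleep(10_000);  // let the layers play for ten seconds
        }
    }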

These are notions that Will Pearson and I have been discussing for years. Way back in 2004, FS actually gave us funding to work on a 3D JAWS but the project died on the vine shortly after I left the company.

Happy Hacking,
cdh

On Aug 29, 2009, at 6:00 AM, Dónal Fitzpatrick wrote:

Wow, this is a truly fascinating thread. OK, those of you not particularly interested might want to skip this mail because, in the true traditions of academia, it will wander and witter along a bit.

Let's just take a step back for a second and consider the nature of visual reading versus auditory perusal of structured information. The visual sense is what I'd term a parallel, or inclusive, sense, while the auditory sense gives a serial, temporal view of the information being examined. OK, I probably don't need to mention this, but it will serve as a starting point. My own thoughts are that what is needed in any kind of auditory interface is some way of filtering things out. I've long held the view that too many auditory interfaces try to replicate the visual domain rather than putting a slightly different spin on it.

Let me give an example to illustrate what I mean. Let's assume you're in a supermarket, going down the biscuit (sorry, "cookie" for the Americans among you) aisle. Now if you're using the visual sense to examine the shelves, you can sweep over the various brands very quickly to find the one you're looking for. If, on the other hand, you are blind and are being guided by a shop employee, you have several alternatives, two of which are:

1. You ask them to read absolutely every brand on the shelves (which takes an eternity), or
2. You filter, through dialogue or possibly their knowledge of your previous purchases, to what you want.

You still get what you want (hopefully!) but it takes less time.

So where am I taking this? Let's look at the constituents of programming languages, in this instance specifically Java. There are certain constructs which _should_ be present in any reasonably sized program. These constructs are scope-based. For example, you have classes, which contain methods, which contain control-flow statements. The scopes can also contain variable declarations, assignments and the like. I think we need to separate out the two constituents of the auditory interface, namely the navigation of the content and its auditory display.
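As a toy illustration of that nesting (all names invented), consider:

    // A class scope containing a method scope, which in turn contains a
    // loop scope with its own declarations and assignments.
    public class Invoice {
        private double total;                       // class-scope declaration

        public double applyDiscount(double rate) {  // method scope
            double discounted = total;              // method-scope declaration
            for (int i = 0; i < 3; i++) {           // control-flow scope
                discounted = discounted * (1 - rate);  // assignment in the loop
            }
            return discounted;
        }
    }

Each of those nested scopes is a natural unit for the navigation discussed next.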

So what is needed in terms of navigation is the ability to move from element to element within a certain scope, or indeed between related code elements. The analogy I'd use is very much the one devised by Apple in their VoiceOver screen reader, where one interacts with certain screen elements in order to "zoom in". In the new Snow Leopard they have also introduced rotors, which facilitate (they claim; I've not used Snow Leopard yet so can't comment) very rapid navigation of web pages. From what I can glean, a rotor can switch web-page navigation between, say, form elements, visited links, headings or lists, and I believe the rotors can also be highly customised.

My thought is that a simple rotor like this could be applied very easily to the auditory navigation of a program. You could, for example, use the rotor to say "OK, I want to jump from method to method". You hit the method you wish, then "interact" with it. Then, either through user interaction with the system or through intelligent algorithms, the system could ascertain which code elements should be navigated next. The same key press could then move you, say, from declaration to declaration, or indeed from a variable's declaration to its next occurrence in the code. All that would have to be done is to change the rotor which controls the jumping mechanism. I probably haven't explained this well, but for a far better explanation take a look at Apple's overview of VoiceOver in Snow Leopard.
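To make the rotor idea concrete, here is a minimal Java sketch (all names are invented, and the per-granularity position sets are assumed to come from some parser). The rotor holds one granularity at a time, such as "method" or "declaration", and the same next/previous keys jump between whatever positions the selected granularity indexes:

    import java.util.List;
    import java.util.Map;
    import java.util.TreeSet;

    // Turning the rotor changes what the next/previous keys jump between.
    public class NavigationRotor {
        private final List<String> granularities;              // e.g. "method", "declaration"
        private final Map<String, TreeSet<Integer>> positions; // line numbers per granularity
        private int selected = 0;

        public NavigationRotor(List<String> granularities,
                               Map<String, TreeSet<Integer>> positions) {
            this.granularities = granularities;
            this.positions = positions;
        }

        public String turn() {                 // rotate to the next granularity
            selected = (selected + 1) % granularities.size();
            return granularities.get(selected);
        }

        public Integer next(int caretLine) {   // jump forward at the current granularity
            TreeSet<Integer> lines = positions.get(granularities.get(selected));
            return lines == null ? null : lines.higher(caretLine);
        }

        public Integer previous(int caretLine) {  // jump backward at the current granularity
            TreeSet<Integer> lines = positions.get(granularities.get(selected));
            return lines == null ? null : lines.lower(caretLine);
        }
    }

An editor plugin would populate the position sets from the parse tree and play a cue each time the caret jumps.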

OK, now to the display. Again we must go back to how sighted people determine errors. Andreas is absolutely correct when he says that auditory cues should be used in a very carefully thought-out manner. Badly placed or badly implemented audio cues can not only mask other content, but can significantly increase the cognitive load (which is already higher) of the person using the auditory domain to browse the information.

So what I'd consider looking at here is "externalising" the sound source. One can argue endlessly about what constitutes an appropriate sound to indicate errors: for person X it will be the sound of a dissonant chord, while for person Y it will be the word "error" spoken at a lower pitch. There has been some interesting work done in the domain of "spearcons". In essence, these are akin to earcons as defined by Blattner years ago, but are based on extremely rapid speech. We're actually playing with them at the minute in our work on the auditory display of mathematics, and they look useful.

If some kind of HRTF processing were applied to the sound to make it seem as though it were coming from outside the person's head, it might indicate to the user that the system was reporting an error. Localising such externalised sound to a point directly in front of the user can be tricky, but even having it slightly off-centre might do the trick.
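As a very rough illustration of the off-centre idea, here is a Java sketch using the standard javax.sound.sampled API (the cue file name is hypothetical, and a simple stereo pan is only a crude stand-in for genuine HRTF externalisation):

    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.Clip;
    import javax.sound.sampled.FloatControl;
    import java.io.File;

    // Plays an error cue panned slightly right of centre, moving the
    // status sound away from the main speech stream.
    public class OffCentreCue {
        public static void main(String[] args) throws Exception {
            AudioInputStream in =
                    AudioSystem.getAudioInputStream(new File("error-spearcon.wav"));
            Clip clip = AudioSystem.getClip();
            clip.open(in);
            if (clip.isControlSupported(FloatControl.Type.PAN)) {  // pan support varies by mixer
                FloatControl pan = (FloatControl) clip.getControl(FloatControl.Type.PAN);
                pan.setValue(0.2f);  // -1.0 = hard left, 0 = centre, +1.0 = hard right
            }
            clip.start();
            clip.drain();  // block until the cue finishes playing
        }
    }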

OK, I'm afraid that did go on a bit, and most of it was probably incoherent rambling. However, it is Saturday morning and I need a nice strong cup of tea, just like Arthur Dent.

Good luck with the interface; it will be fascinating to see the results.

Dónal

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Andreas Stefik
Sent: 28 August 2009 22:34
To: programmingblind@xxxxxxxxxxxxx
Subject: Re: Auditory interface ideas, what would help?

When sighted individuals navigate code, they most often do so by scrolling, often very quickly, up and down the source looking for something they are interested in.

Right now, we're working to build some tools that we hope will make it easier to scan code for items of interest using only audio. While failure is always an option, I'm really hoping we can make scanning just as fast for the blind. The most obvious existing example I can think of is a "navigator window" that jumps to the beginning of a method. This solution, while fine for the sighted, requires you to change focus to a new window, find what you want by browsing (not searching), and then press a key to jump focus back to the code.

Here are a couple of possible ideas. Neither is perfect; they're just brainstorms:

1. Press a key combination to jump to the "next point of interest." This might be the end of the current scope, the beginning of the next one, or whatever. A cue would indicate where you jumped. (A rough sketch of this appears after these ideas.)

2. Have a series of hotkeys that jump you to various places, like the "next" or "previous" method, or the end or beginning of a loop, if statement, or other construct. Requiring someone to remember lots of hotkeys seems like a bad idea to me, but it's just a thought ...
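As a rough sketch of idea 1 (the names here are hypothetical, and a real tool would walk the parse tree rather than pattern-match raw text), a hotkey handler might scan forward from the caret for the next scope boundary or control-flow keyword:

    import java.util.List;
    import java.util.regex.Pattern;

    // A naive "next point of interest" scan: from the caret line, find the
    // next line that closes a scope, opens one, or starts a loop or branch.
    public class PointOfInterest {
        private static final Pattern INTERESTING =
                Pattern.compile("^\\s*(\\}.*|.*\\{\\s*|(if|for|while|switch)\\b.*)$");

        public static int nextInterestingLine(List<String> sourceLines, int caretLine) {
            for (int i = caretLine + 1; i < sourceLines.size(); i++) {
                if (INTERESTING.matcher(sourceLines.get(i)).matches()) {
                    return i;   // the caller moves the caret here and plays a cue
                }
            }
            return -1;          // nothing of interest below the caret
        }
    }

The editor would bind this to a key, move the caret to the returned line, and play whatever cue the user has chosen for "jumped".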

So yeah, that's two ideas. I know Sina has told me in the past that navigation amongst various files can be excruciating, so ideas related to that would be good as well. Search can obviously help, but we want an improved "browsing" experience too.

Hope that helps give you an idea of what I mean. Really, we're open to pretty much any wacky idea people can come up with that folks think might help everyone program more effectively.

--
Andreas Stefik, Ph.D.
Department of Computer Science
Southern Illinois University Edwardsville


__________
View the list's information and change your settings at
//www.freelists.org/list/programmingblind


