RE: Sonified Debugger vs. Screenreader Question

  • From: Dónal Fitzpatrick <dfitzpat@xxxxxxxxxxxxxxxx>
  • To: <programmingblind@xxxxxxxxxxxxx>
  • Date: Fri, 23 Nov 2007 00:22:03 -0000

Hi there,

I'm afraid this is a fairly long mail, so those not particularly interested
in this thread might want to skip it.  Most of the material in it is waffle
anyway. *smile*

I only joined this list on Monday, and I have been fascinated by this
debate.  My own research focussed for many years on providing access to
mathematics.  I am particularly interested in how complex data can be
conveyed in a non-visual way; and how prosody can be used to facilitate
this.  I must confess that I've moved away from mathematics to a large
extent.

Both T.V. Raman and, subsequently, Robert Stevens devised solutions
incorporating both spoken and non-speech audio to produce their renderings of
mathematical material.  Raman went a stage further in his AsTeR system by
adding facilities to translate LaTeX documents into synthetic speech.  This
speech (insofar as the good old DECtalk permitted) was prosodically enhanced
for clarity.

However, in my opinion (and this is purely a personal opinion) the system
fell down in that the interface was extremely difficult to use.  Having read
the publications from that era, and indeed Raman's thesis, there did not seem
to be much in the way of usability analysis carried out.  So we can sum it
up as a system designed by Raman for Raman.  This statement is in no way
intended to detract from the work; one must remember that his Ph.D. was
completed in 1994, and there had been little or no effort in this highly
specific area before.  If you haven't done so, his Ph.D. thesis is well
worth a read.  Particularly noteworthy is the excellent bibliography at the
back.

Ok, now let's jump to Robert's work.  This was a much more scientific (I use
this word here because I can't think of an alternative) approach.  He
analysed humans speaking equations and derived a prosodic model for them.
Ahem, now I dare to disagree with the guru, because I'm afraid I don't
agree with the phonological or prosodic model he defined.  I won't bore
people on this list by waffling on about it, but anyone interested might
want to take a look at the published works of Alex Monaghan and Bob Ladd.
(I was fortunate enough to do my Ph.D. under the supervision of Alex
Monaghan, who was involved in the initial design of the Festival speech
synthesiser, and Bob is one of the best pitch-perception people in the
world.)

So this is where I came onto the scene.  I decided to try to produce
renderings of technical documents without using any non-speech audio;
relying instead on the prosodic alterations in speech.  The emphasis was
very much on keeping the utterances as brief as possible.  To facilitate
this, we carried out two pilot studies:
1.  We ran a test to see what sighted people saw when they looked at
equations for various prescribed periods of time.  The objective here was to
establish what kind of information we should provide to simulate a "quick
glance", a "quick flick" and an in-depth study of the equation.
2.  We then ran a pilot study to see if the prosodic model we had defined
actually worked.  Looking back at the experiment, there are one hell of a
lot of holes in it, but eight years after graduating I have learned to live
with it...
*smile*
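For anyone who'd like a concrete feel for what "prosodic alterations" buy
you, here is a tiny toy sketch.  It is my own simplification for this mail,
not the model from the thesis: it places longer pauses at shallower
structural boundaries, so that "(a + b) / c" and "a + (b / c)" come out with
the same words but audibly different pause patterns.

```python
# Toy sketch: use pause placement (prosody) to disambiguate spoken
# arithmetic.  The tree format and the pause rule are my own
# simplified assumptions, not the published model.

def speak(expr, depth=0):
    """Render a nested (op, left, right) tuple as words, marking
    pauses with commas; shallower boundaries get longer pauses."""
    if not isinstance(expr, tuple):
        return str(expr)
    op, left, right = expr
    word = {"+": "plus", "/": "over"}[op]
    pause = "," * max(1, 2 - depth)  # depth 0 -> ",,", deeper -> ","
    return (f"{speak(left, depth + 1)} {pause} "
            f"{word} {pause} {speak(right, depth + 1)}")

# (a + b) / c  versus  a + (b / c): same words, different pauses
print(speak(("/", ("+", "a", "b"), "c")))  # a , plus , b ,, over ,, c
print(speak(("+", "a", ("/", "b", "c"))))  # a ,, plus ,, b , over , c
```

A synthesiser fed this (with commas mapped to pause lengths) would let a
listener hear where the fraction bar falls without any extra words.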

Again, I won't bore people with the details, but anyone interested can read
various papers I wrote, and I'm more than happy to pass on either LaTeX
sources (or PDF) if anyone wants them.  

Sorry for the long-winded description of the recent research into
translating complex entities into audio, but I just wanted to set the scene
for people who might not have been familiar with this stuff.  The relevance
of the material just described might not be obvious, but what is being
proposed in the "sonified debugger" is in essence conveying fairly complex
data in an auditory way.  Therefore I think the work in the mathematical
sphere is relevant.  If I might recommend a few people's research that might
be of interest:

1.  Douglas Gillan.  He has carried out research into conveying mathematics
to blind people from the perspective of cognitive psychology.  His
methodologies might be of some use to you.
2.  Arthur Karshmer has produced a system called Mavis, which allows blind
people to navigate equations.  In essence (working with Doug and others),
the system translates MathML into synthetic speech and then allows the user
to move backwards and forwards through the structure and content.  The
interesting aspects of this research are in the way the system "chunks" the
material.  Again, it might be useful to you in terms of how to present and
browse material.  One thing I'll warn you about, though, is that Arthur (or
Art, as he's known) does not agree with me: he feels that using prosody in
synthetic speech really does not help in disambiguating the utterance.
3.  Some of the work carried out by R. D. Jacobson (now at Calgary).  Dan
was involved with both haptic and auditory displays of geographic data, but
there might be something in it for you.  I seem to remember he had used
some audio cues (combined with haptic effects).  He was also doing some work
with Reg Golledge at UCSB.
4.  You could also take a look at some of the work carried out at UCSB by
the likes of Mary Hegarty and Jack Loomis.  The work of Roberta Klatzky
might yield something.  These last three are all psychologists and are
collaborators with the researchers mentioned in (3).
5.  Some of the research carried out at the Sonic Arts Research Centre
(SARC) at Queen's University Belfast would appear to be relevant for you.  In
particular, the Ph.D. work of Emma Murphy.
6.  Finally, you might find something in the work carried out by Ian Pitt
and his team at University College Cork of interest.
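As an aside, the "chunking" idea in (2) is easy to sketch.  The following
toy is my own invention for illustration, not the real Mavis interface: it
treats the equation as a tree of chunks and lets the listener move into, out
of, and across them, hearing a brief summary of each rather than the whole
expression at once.

```python
# Toy sketch of chunked equation browsing, in the spirit of systems
# like Mavis.  The data model and commands are my own invention,
# not the real interface.

class Chunk:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for c in self.children:
            c.parent = self

class Browser:
    def __init__(self, root):
        self.current = root

    def say(self):
        """Summarise the current chunk without reading sub-chunks."""
        if self.current.children:
            return f"{self.current.label} with {len(self.current.children)} parts"
        return self.current.label

    def into(self):
        """Descend to the first sub-chunk, if any."""
        if self.current.children:
            self.current = self.current.children[0]
        return self.say()

    def out(self):
        """Climb back to the enclosing chunk, if any."""
        if self.current.parent:
            self.current = self.current.parent
        return self.say()

    def next(self):
        """Move to the next sibling chunk, if any."""
        p = self.current.parent
        if p:
            i = p.children.index(self.current)
            if i + 1 < len(p.children):
                self.current = p.children[i + 1]
        return self.say()

# (a + b) / c as a chunk tree
eq = Chunk("fraction", [Chunk("sum", [Chunk("a"), Chunk("b")]), Chunk("c")])
b = Browser(eq)
print(b.say())   # fraction with 2 parts
print(b.into())  # sum with 2 parts
print(b.next())  # c
```

The listener starts with a one-line overview and only drills down where
they choose to, which is exactly what makes the approach bearable for large
expressions.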

Sorry I've wittered on a bit, but hopefully in all this
stream-of-consciousness I've been spewing there will be something useful for
you.  Good luck, and if I can be of any help, feel free to contact me
off-list.

Dónal

__________
View the list's information and change your settings at
//www.freelists.org/list/programmingblind
