[asvs] The Concept & Development Road Map

  • From: "Will Pearson" <will-pearson@xxxxxxxxxxxxx>
  • To: <asvs@xxxxxxxxxxxxx>
  • Date: Fri, 8 Oct 2004 19:18:59 +0100

Below, I've outlined the concept for the auditory synthetic vision system as 
it stands at the moment.  I intend to use user-centered design to make the 
software fit the user, not the user fit the software.  Because of this 
approach the concept may change, but any changes that are made should result 
in a better system for its intended users.

I've also given a very brief development road map.  This outlines the key 
stages in the development of the system.  Each one is based around a different 
use for synthetic vision, and each will involve testing, although some stages 
may require no development beyond that done for the preceding stage.

* The concept

Most people assume that we see shapes as whole shapes.  This is in fact not 
the case: we see small points of light arranged in a grid, similar to the 
pixel grid on which computer and television displays are based.  It is our 
brains that then perceive these points as shapes, through a series of rules, 
and we then cognitively associate a name and a meaning with each shape.

Sound comes from all around us.  We can perceive sounds from the front, the 
rear, above, below, and all other directions.  So it is quite conceivable that 
we can take advantage of this spatial ability of hearing to replicate the 
spatial and parallel abilities of sight.

The proposed auditory synthetic vision system would use a grid of sound 
pixels.  In doing so, it would closely approximate the spatial nature of 
sight.  This spatiality is an important aspect of the system.  Where semantics 
are concerned, spatial relationships are one important way of conveying 
meaning.  An object may take on an entirely different meaning if it is located 
above another object than if it is below it.  With a parallel, spatially 
positioned display, the spatial relationships between shapes, and between 
points within a shape, can be determined, thus conveying the encoded meaning 
to the user.
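
As a rough illustration of the grid idea, the sketch below places the centre of 
each sound pixel at a direction relative to the listener.  The function name 
pixel_direction and the 90 by 60 degree extent of the display are assumptions 
made purely for the example; nothing has been decided about the real layout.

```python
def pixel_direction(row, col, rows, cols, h_span=90.0, v_span=60.0):
    """Return (azimuth, elevation) in degrees for the centre of a grid cell.

    Row 0 is the top of the grid and column 0 is its left edge.
    h_span and v_span are assumed angular extents of the whole display;
    they are illustrative values, not part of the concept itself.
    """
    if not (0 <= row < rows and 0 <= col < cols):
        raise ValueError("pixel lies outside the grid")
    # Negative azimuth is to the listener's left, positive to the right.
    azimuth = ((col + 0.5) / cols - 0.5) * h_span
    # Positive elevation is above ear level, negative below it.
    elevation = (0.5 - (row + 0.5) / rows) * v_span
    return azimuth, elevation
```

With this layout a pixel in a higher row always gets a greater elevation than 
one in a lower row, so "above" and "below" survive the move from image to 
sound.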

Auditory definition is poor in comparison to visual definition.  Whilst 
visually we can determine the position of something to a fraction of a degree, 
by ear we can only determine its position to approximately a degree.  This 
results in the loss of fine definition and a coarser display.  There is a 
parallel in low vision, where the long-standing solution is to magnify 
something in order to perceive the fine detail.  This has the potential to 
work equally well for sound, and so a means of magnifying a portion of the 
display will be a core component of the auditory synthetic vision system.
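
One possible way to magnify is to stretch a small region of the image over the 
whole sound grid, so that each source pixel covers a wider angle.  The sketch 
below does this with nearest-neighbour sampling on a list of rows.  The name 
magnify and its parameters are hypothetical, and nearest-neighbour sampling is 
simply the easiest choice to show, not a decision about the final system.

```python
def magnify(image, top, left, height, width, out_rows, out_cols):
    """Stretch a rectangular region of `image` over a larger output grid.

    `image` is a list of equal-length rows.  The region starts at
    (top, left) and is `height` rows by `width` columns.  Each output
    cell copies the source pixel it falls on (nearest-neighbour).
    """
    if height <= 0 or width <= 0 or out_rows <= 0 or out_cols <= 0:
        raise ValueError("sizes must be positive")
    if top < 0 or left < 0:
        raise ValueError("region starts outside the image")
    if top + height > len(image) or left + width > len(image[0]):
        raise ValueError("region extends outside the image")
    return [
        [
            image[top + (r * height) // out_rows][left + (c * width) // out_cols]
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]
```

For example, magnifying the whole of a 2 by 2 image onto a 4 by 4 grid turns 
every source pixel into a 2 by 2 block of identical sound pixels.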

To convey color, a mapping will be made between the color of a pixel and the 
pitch of the sound at that pixel's location.  Different sounds will therefore 
convey different colors, with dark colors represented by low frequency sounds 
and bright colors by high frequency sounds.  This distinction between colors 
is important.  Neighbouring pixels of the same color will sound at the same 
frequency, so they group together under the Gestalt laws of similarity and 
proximity, which will help the user to perceive shapes.  Equally, 
differentiating between groups of one color and groups of another color will 
aid the determination of shapes, again utilising the Gestalt law of 
similarity.  Should all the pixels forming a shape move at once, then 
according to the Gestalt law of common fate the user should be able to 
perceive motion.
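
A minimal sketch of one such mapping follows.  It reduces a color to its 
brightness using the common Rec. 601 luma weights and places that brightness on 
a logarithmic frequency scale.  The 200 Hz to 4000 Hz range, the logarithmic 
spacing and the function names are all assumptions for illustration, and the 
sketch only separates dark from bright; it does not try to distinguish hues of 
equal brightness.

```python
def brightness(r, g, b):
    """Approximate brightness of an 8-bit RGB color, from 0.0 to 1.0.

    Uses the Rec. 601 luma weights for red, green and blue.
    """
    return (0.299 * r + 0.587 * g + 0.114 * b) / 255.0


def pitch_for_color(r, g, b, low_hz=200.0, high_hz=4000.0):
    """Map a color to a frequency in Hz: dark is low, bright is high.

    low_hz and high_hz are assumed limits chosen for the example.
    Frequencies are spaced logarithmically, so equal steps in
    brightness give equal musical intervals.
    """
    level = brightness(r, g, b)
    return low_hz * (high_hz / low_hz) ** level
```

Pixels of the same color always receive the same frequency, which is what the 
grouping by similarity described above relies on.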

So far, one piece of contextual design has been incorporated into the concept: 
the ability to move the display.  Sounds emitted from a given direction will 
effectively block other sounds that come from that direction.  This may be 
undesirable in many contexts, for example if the user is in a meeting, or is a 
student in a lecture room viewing a set of slides who also wishes to hear the 
lecturer at times.  Therefore, the ability to place the displayed grid in a 
position of the user's choosing will add to the usability and usefulness of 
the system.
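
Moving the display could be as simple as adding a user-chosen offset to the 
direction of every sound pixel.  The sketch below assumes directions are held 
as azimuth and elevation in degrees; the name reposition is hypothetical, as is 
the choice to wrap azimuth around the listener and clamp elevation.

```python
def reposition(azimuth, elevation, az_offset, el_offset):
    """Shift a sound pixel's direction by a user-chosen offset, in degrees.

    Azimuth wraps around the listener into the range [-180, 180).
    Elevation is clamped to [-90, 90], i.e. straight down to straight up.
    """
    new_az = (azimuth + az_offset + 180.0) % 360.0 - 180.0
    new_el = max(-90.0, min(90.0, elevation + el_offset))
    return new_az, new_el
```

A student could, for instance, shift the whole grid 90 degrees to one side so 
that the direction of the lecturer's voice is left clear.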

* Proposed Goals
The proposed goals of this system are simple and twofold.  Firstly, there is 
the research aspect: exploring whether this approach to auditory synthetic 
vision actually works in reality and, if it does, what its limitations are.  
The second goal is to deliver a piece of access technology that is affordable 
for the blind community, useful to it, and meets its needs.

* Key Stages In Development
1. Creation of a basic system to determine whether the concept works for still 
images.
2. Determination of the system's ability to deal with moving images.
3. Testing as an actual "synthetic eye" using input from a web cam.
4. Evaluating the potential for a PDA version to increase system mobility.

If anyone has anything to add, then please feel free to contribute.


