[asvs] Re: The Concept & Development Road Map

  • From: "david poehlman" <david.poehlman@xxxxxxxxxxxxxxxxxxxxxxxx>
  • To: <asvs@xxxxxxxxxxxxx>
  • Date: Fri, 8 Oct 2004 16:58:44 -0400

will,

Is there a way to simulate up in a flat stereo environment?  The only way I
can think of to do it would be to use pitch as height and then use loudness
as color.  This mapping works quite well with the vOICe and does not require
extensive multiphonic mapping.
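
As a very rough sketch of that mapping (pitch as height, loudness as
brightness), here is how it might look in code.  The frequency range, column
timing, and image size are only placeholders, not anything taken from the
vOICe or from Will's design:

import numpy as np

SAMPLE_RATE = 44100            # samples per second
COLUMN_DURATION = 0.05         # seconds spent on each image column (placeholder)
F_LOW, F_HIGH = 200.0, 4000.0  # pitch range mapped to image rows (placeholder)

def column_to_audio(column):
    """Render one image column: row position -> pitch, brightness -> loudness."""
    n_rows = len(column)
    t = np.arange(int(SAMPLE_RATE * COLUMN_DURATION)) / SAMPLE_RATE
    freqs = np.linspace(F_HIGH, F_LOW, n_rows)     # top rows high, bottom rows low
    audio = np.zeros_like(t)
    for brightness, freq in zip(column, freqs):
        audio += brightness * np.sin(2 * np.pi * freq * t)
    return audio / max(n_rows, 1)                  # keep the mix from clipping

def image_to_audio(image):
    """Sweep the image left to right, one column at a time (flat mono output)."""
    return np.concatenate([column_to_audio(col) for col in image.T])

# Example: a 16x16 test image containing a bright diagonal line.
img = np.zeros((16, 16))
np.fill_diagonal(img, 1.0)
signal = image_to_audio(img)   # 1-D float array, ready to be written out as audio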

Johnnie Apple Seed

----- Original Message ----- 
From: "Will Pearson" <will-pearson@xxxxxxxxxxxxx>
To: <asvs@xxxxxxxxxxxxx>
Sent: Friday, October 08, 2004 2:18 PM
Subject: [asvs] The Concept & Development Road Map


Hi,
Below, I've outlined the concept for the auditory synthetic vision system as
it stands at the moment.  I'm intending to use user-centered design to make
the software fit the user, not the user fit the software.  Because of this
approach, the concept may change, but any changes made will result in a
better system for its intended users.

I've also given a very brief development road map.  This outlines the key
stages in the development of the system.  Each stage is based around a
different use for synthetic vision, and each will involve testing, although
some stages may require no additional development beyond that associated
with the preceding stage.

* The concept

Most people assume that we see shapes as whole shapes.  This is in fact not
the case: we see small points of light, arranged in a grid similar to the one
on which computer and television displays are based.  It is then our brains
that perceive these points as shapes, through a series of rules, to which we
then cognitively attach name and meaning.

Sound comes from all around us.  We can perceive sounds from the front, the
rear, above, below, and all other directions.  So it is quite conceivable
that we can take advantage of this spatial ability of hearing to replicate
the spatial and parallel abilities of sight.

The proposed auditory synthetic vision system would use a grid of sound
pixels.  In doing so, it would closely approximate the spatial nature of
sight.  This spatiality is an important aspect of the system.  If semantics
are considered, then spatial relationships are one important way of conveying
meaning.  An object may take on an entirely different meaning if it is
located above another object than if it is below it, and with a parallel,
spatially positioned display, spatial relationships between shapes, and
between points within a shape, can be determined, thus conveying the encoded
meaning to the user.
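
As a very rough sketch of what a grid of simultaneously sounding pixels might
look like in software, here is one possibility.  Plain left/right panning
stands in for proper spatial positioning, and all names and values are
placeholders rather than design decisions:

import numpy as np

SAMPLE_RATE = 44100
DURATION = 0.5      # seconds of simultaneous output for one frame (placeholder)

def render_frame(image, freqs):
    """Sound every non-black pixel at once.  A pixel's column sets its
    left/right position via constant-power panning (a crude stand-in for
    full spatial audio); freqs holds the pitch chosen for each pixel."""
    rows, cols = image.shape
    t = np.arange(int(SAMPLE_RATE * DURATION)) / SAMPLE_RATE
    left = np.zeros_like(t)
    right = np.zeros_like(t)
    for r in range(rows):
        for c in range(cols):
            if image[r, c] == 0:
                continue
            tone = image[r, c] * np.sin(2 * np.pi * freqs[r, c] * t)
            pan = c / max(cols - 1, 1)              # 0 = far left, 1 = far right
            left += np.cos(pan * np.pi / 2) * tone
            right += np.sin(pan * np.pi / 2) * tone
    peak = max(np.abs(left).max(), np.abs(right).max(), 1e-9)
    return np.stack([left, right], axis=1) / peak   # stereo, shape (samples, 2)

Because every pixel keeps its own position in the output rather than being
read out one at a time, relationships such as above/below and left/right are
preserved in the display itself.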

Auditory definition is poor in comparison to visual definition.  Whilst
visually we can determine the position of something to a fraction of a
degree, auditorially we can only determine its position to approximately a
degree.  This results in the loss of fine definition and a coarser display.
The same problem exists in low vision, where the long-standing solution is to
magnify something in order to perceive the fine detail.  This has the
potential to work equally well auditorially, and so a means of magnifying a
portion of the display will be a core component of the auditory synthetic
vision system.
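
To make the magnification idea concrete, here is one possible sketch.  The
names, the nearest-neighbour resampling, and the zoom factor are purely
illustrative choices: a window around a point of interest is cropped out and
stretched back over the full grid, so the same number of sound pixels covers
a smaller part of the scene.

import numpy as np

def magnify(image, center_row, center_col, zoom):
    """Crop a window around (center_row, center_col) and stretch it back
    over the full grid, so each sound pixel covers a smaller part of the
    scene and finer detail becomes resolvable."""
    rows, cols = image.shape
    win_r = max(1, int(rows / zoom))
    win_c = max(1, int(cols / zoom))
    top = int(np.clip(center_row - win_r // 2, 0, rows - win_r))
    left = int(np.clip(center_col - win_c // 2, 0, cols - win_c))
    window = image[top:top + win_r, left:left + win_c]
    # Nearest-neighbour resample of the window back onto the full grid.
    row_idx = np.arange(rows) * win_r // rows
    col_idx = np.arange(cols) * win_c // cols
    return window[np.ix_(row_idx, col_idx)]

# Example: zoom in 4x around the centre of a 64x64 frame.
frame = np.random.rand(64, 64)
detail = magnify(frame, 32, 32, zoom=4)   # still 64x64, but showing a 16x16 region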

To convey color, a mapping will be made between a pixel's color and the pitch
of the sound at that pixel's location.  Different sounds will therefore be
used to convey different colors, with dark colors represented by low
frequency sounds and bright colors by high frequency sounds.  This
distinction between colors is important: by grouping pixels of the same
frequency together, the display abides by the Gestalt laws of similarity and
proximity, which will help the user to perceive shapes.  Equally,
differentiating between groups of one color and groups of another will aid
the determination of shapes, again utilising the Gestalt law of similarity.
Should all the pixels forming a shape move at once, then according to the
Gestalt law of common fate, the system should allow a user to perceive
motion.
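
A small sketch of that color-to-pitch mapping follows.  The frequency range
and the logarithmic spacing are assumptions for illustration, not fixed
design decisions:

import numpy as np

F_DARK, F_BRIGHT = 220.0, 3520.0   # placeholder range: dark colors low, bright colors high

def color_to_pitch(brightness):
    """Map a 0..1 brightness value to a frequency in hertz.  A logarithmic
    mapping keeps equal steps in brightness sounding like equal musical
    intervals, so pixels of the same color share one pitch and group
    together (Gestalt similarity)."""
    brightness = np.clip(brightness, 0.0, 1.0)
    return F_DARK * (F_BRIGHT / F_DARK) ** brightness

# Equal brightness steps give octave steps with these placeholder values:
print(color_to_pitch(np.array([0.0, 0.25, 0.5, 0.75, 1.0])))
# -> approximately [220, 440, 880, 1760, 3520] Hz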

So far, one piece of contextual design has been incorporated into the
concept: the ability to move the display.  Having sounds emitted from a
direction will effectively mask other sounds that come from that direction.
This may be undesirable in many contexts, for example a user in a meeting, or
a student in a lecture room viewing a set of slides who also wishes to hear
the lecturer at times.  Therefore, the ability to place the displayed grid in
a position of the user's choosing will aid the usability and usefulness of
the system.
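
One way this could be expressed in software, again only as a sketch with
made-up angles: every sound pixel carries an azimuth and elevation, and
repositioning the display is just a constant offset applied to all of them.

import numpy as np

def reposition(azimuth, elevation, az_offset_deg, el_offset_deg):
    """Shift every sound pixel's direction by a user-chosen offset, e.g.
    pushing the whole display off to one side so that speech from straight
    ahead (a lecturer, a meeting) is not masked by the display."""
    return azimuth + az_offset_deg, elevation + el_offset_deg

# Example: a 3x3 grid of directions, moved 45 degrees to the user's right.
az = np.array([[-10.0, 0.0, 10.0]] * 3)
el = np.array([[10.0], [0.0], [-10.0]]) * np.ones((1, 3))
az_shifted, el_shifted = reposition(az, el, 45.0, 0.0)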

* Proposed Goals
The proposed goals of this system are simple and twofold.  Firstly, there is
the research aspect: exploring whether this approach to auditory synthetic
vision actually works in practice, and if it does, what its limitations are.
The second goal is to deliver a piece of access technology that is affordable
to, useful to, and meets the needs of the blind community.

* Key Stages In Development
1. Creation of a basic system to determine whether the concept works for 
still images.
2. Determination of the system's ability to deal with moving images.
3. Testing as an actual "synthetic eye" using input from a web cam.
4. Evaluating the potential for a PDA version to increase system mobility.

If anyone has anything to add, then please feel free to contribute.

Thanks,

Will


