[nvda-addons] Re: AudioScreen addon questions and suggestions

From: "Pranav Lal" <pranav.lal@xxxxxxxxx>
To: <nvda-addons@xxxxxxxxxxxxx>
Date: Tue, 12 Apr 2016 05:46:04 +0530

Hi Greg,

All good ideas. AudioScreen is based on the vOICe, which you can download for
free from http://www.seeingwithsound.com. You can zoom an image and manually
walk through it using the arrow keys.

In what program are you viewing the picture? You can sonify the entire picture
under the navigator object by pressing NVDA+Alt+A.

Pranav
-----Original Message-----
From: nvda-addons-bounce@xxxxxxxxxxxxx
[mailto:nvda-addons-bounce@xxxxxxxxxxxxx] On Behalf Of Grzegorz Zlotowicz
Sent: Monday, April 11, 2016 6:12 PM
To: nvda-addons@xxxxxxxxxxxxx
Subject: [nvda-addons] AudioScreen addon questions and suggestions

Hello.
I just listened to the AudioScreen [1] demonstration, and did some basic
testing myself.
First of all, thanks to Mick for trying to solve the image-viewing issue; as we
all know, it's an important problem, but hard to solve.
My question: is it possible, after longer training, to view electronic
diagrams? After one hour of testing I had no success, but maybe I should be
more patient.
For example, have a look at a simple SR flip-flop diagram [2].
This diagram consists of two NAND gates.
Each gate has two inputs and one output.
Each gate's output is connected to an input of the other one.

Looking at this image as a whole is impossible for me in AudioScreen at the
moment, but, as I said, maybe it's lack of training.
I looked at it using an Optacon, which I just succeeded in configuring to work
with my laptop's screen in a nice way. Due to my lack of training in using the
Optacon too, it took me a long time to view the image, but it was doable.
But I thought about alternative approaches to image viewing, which I'd like to
discuss here; maybe they could be developed as alternative modes of
AudioScreen or as a separate add-on.
Mode 1.
The image in question (e.g. the navigator object) is loaded into an add-on
buffer, and in the view mode, the user can use the arrow keys to move inside it
by tiles (up/down/left/right).
One tile is initially some part of the image, let's say 10% of it.
This initial look around can be helpful to identify the colouring of the image:
whether it's light on dark or dark on light.
Tile size can be increased (up to the whole image) and decreased (down to
one-pixel accuracy).
The tile value, as a gray-scale colour, can be read by the synthesizer or played
by the "seeingwithsound.com" algorithm.
If the tile consists of many pixels, the average of the colour is read as a
number.
The user can label parts of the image, e.g. after checking that the letter "r"
is somewhere, they can label the area of this letter as "r".
When viewing a labelled part of the image, the given label is read aloud
independent of the tile size (so on small tiles the user can identify a letter,
and then have it read aloud when viewing the image with larger tiles).
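
The tile idea in Mode 1 can be sketched in a few lines of Python. This is only
an illustration, not AudioScreen code: the image is assumed to be a list of
rows of gray values (0-255), and the names tile_average and move are made up
for this example.

```python
# Hypothetical sketch of Mode 1; none of these names come from AudioScreen.

def tile_average(image, left, top, size):
    """Return the mean gray value of a square tile, clipped to the image."""
    height, width = len(image), len(image[0])
    rows = range(max(top, 0), min(top + size, height))
    cols = range(max(left, 0), min(left + size, width))
    values = [image[y][x] for y in rows for x in cols]
    return sum(values) / len(values) if values else None


def move(position, direction, size, width, height):
    """Move the tile by one tile size, keeping its origin inside the image."""
    steps = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}
    dx, dy = steps[direction]
    x = min(max(position[0] + dx * size, 0), max(width - size, 0))
    y = min(max(position[1] + dy * size, 0), max(height - size, 0))
    return (x, y)


# A 4x4 test image: dark left half, light right half.
image = [[0, 0, 255, 255] for _ in range(4)]
```

An arrow key press would call move() and then speak or sonify the result of
tile_average() for the new position; changing the tile size only changes the
size argument.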

Mode 2. Using a braille display:
When we select the smallest representation of the above SR flip-flop image, it
has 200x125 pixels.
This means that on a 32-cell braille display, it would occupy 3 lines (63
characters) horizontally and 32 lines vertically at full accuracy.
As in the previous mode, one tile, represented here by one braille dot, could be
resized.
After increasing it to 3 times the original, everything fits in 21 characters
and 11 lines, which seems enough to read the image in a pretty short time.
A 4-times increase would give a size of 16 characters in 8 lines: possibly
too coarse to catch small details, but nice for having a quick preview...
The same goes for higher levels.
Routing keys could be used to move a virtual "cursor" to an interesting part of
the image, e.g. when at high zoom I find an unidentified object, possibly a
letter, I can route the cursor to its upper left corner, then switch to a lower
zoom, view the fragment in more detail, and then label it (cursor routing keys
and rewinding the display could be helpful to mark the label area).
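
The cell counts above can be checked with a short calculation. This is a sketch
under two assumptions: an 8-dot braille cell (2 dots wide, 4 dots high), and
partial cells rounded up. The function name is made up for illustration.

```python
import math


def braille_cells(pixels, zoom, dots_per_cell):
    """Cells needed along one axis when one braille dot covers `zoom` pixels."""
    dots = math.ceil(pixels / zoom)
    return math.ceil(dots / dots_per_cell)


# 2 dots per cell horizontally, 4 dots per cell vertically (assumed 8-dot cell).
for zoom in (1, 3, 4):
    columns = braille_cells(125, zoom, 2)
    lines = braille_cells(125, zoom, 4)
    print(f"zoom {zoom}: {columns} characters wide, {lines} lines high")
```

With a 125-pixel side this gives the counts quoted above (63 and 32 at full
accuracy, 21 and 11 at 3 times, 16 and 8 at 4 times). The 200-pixel side would
need braille_cells(200, 1, 2) = 100 characters across at full accuracy.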

Histogram analysis could be used to determine which value should be used as
the threshold for raising a braille dot. The histogram statistics could be
presented to the user in both modes after the image is loaded.
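
One possible way to pick such a threshold from the histogram is Otsu's method,
which chooses the gray level that best separates the dark and light pixels.
That choice is an assumption of this sketch, not something the proposal
prescribes, and the rule "lighter than the threshold raises a dot" is likewise
arbitrary (it would be inverted for dark-on-light images).

```python
def gray_histogram(pixels):
    """Count how many pixels have each gray value from 0 to 255."""
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1
    return hist


def otsu_threshold(hist):
    """Return the level t that best separates pixels <= t from pixels > t."""
    total = sum(hist)
    total_sum = sum(level * count for level, count in enumerate(hist))
    best_level, best_score = 0, -1.0
    below_count = below_sum = 0
    for level, count in enumerate(hist):
        below_count += count
        below_sum += level * count
        above_count = total - below_count
        if below_count == 0 or above_count == 0:
            continue  # one class is empty, nothing to separate
        mean_below = below_sum / below_count
        mean_above = (total_sum - below_sum) / above_count
        # Between-class variance, up to a constant factor.
        score = below_count * above_count * (mean_below - mean_above) ** 2
        if score > best_score:
            best_level, best_score = level, score
    return best_level


# Eight sample pixels: four dark, four light.
pixels = [10, 12, 11, 240, 250, 245, 9, 248]
threshold = otsu_threshold(gray_histogram(pixels))
dots = [value > threshold for value in pixels]  # True means a raised dot
```

The same histogram could also be read out to the user as statistics, for
example how many pixels fall on each side of the threshold.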

Labelling rectangular fragments is simple, but what about labelling lines
which aren't horizontal or vertical, and are intersected by other objects?
So at the beginning, maybe labels for points would be sufficient.
And as an additional helper, a bookmark list function, where the user can
easily switch to a previously labelled point.
I used this approach in my Grmapa program [3], and can say that the combination
of zooming and bookmarking allows for pretty efficient viewing of complicated
objects.
Of course, OpenStreetMap objects are themselves described in a machine-readable
way and a raw image is not, so it would be the user's work to do it.
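
Point labels with a bookmark list could look roughly like this. It is a
hypothetical sketch, not how Grmapa stores its bookmarks; the class and method
names are invented for the example, and math.dist needs Python 3.8 or later.

```python
import math


class Bookmarks:
    """Named points on an image, kept in the order they were added."""

    def __init__(self):
        self._points = {}

    def add(self, label, x, y):
        """Remember the point (x, y) under the given label."""
        self._points[label] = (x, y)

    def labels(self):
        """Return all labels, e.g. to present them as a list to the user."""
        return list(self._points)

    def jump(self, label):
        """Return the coordinates the cursor should move to."""
        return self._points[label]

    def nearest(self, x, y):
        """Return the label closest to (x, y), or None if there are none."""
        if not self._points:
            return None
        return min(self._points,
                   key=lambda name: math.dist(self._points[name], (x, y)))


marks = Bookmarks()
marks.add("R input", 10, 20)
marks.add("Q output", 180, 30)
```

Here marks.jump("Q output") returns the stored point, and marks.nearest() can
announce the closest label while the user moves around the image.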

Mode 3. The SVG view.
The SVG format seems the most accessible one when talking about images, because
everything there is described as a concept, not as a group of pixels.
So, we have areas, paths, and text.
When we open the keypad or house images (included in the AudioScreen
demo images archive) in Notepad, we can easily read the images.
For more complicated ones (the map of Australia or the above-mentioned SR
flip-flop), reading in Notepad is not enough.
This is probably an idea for a separate program, not an NVDA add-on, but SVG
viewing could become more accessible using the Grmapa approach.
Each object in SVG can be easily identified, so the machine knows that some
fragment of the image is a line going from point a to point b.
If there is some text, no OCR is required to recognize it, because the text is
placed as text.
Some fragments of the image can have labels (Australia's regions), some may
require user interaction after investigating the line (SR flip-flop).
Some paths can encircle an area without declaring it (NAND gates on a
schematic), so the user should be able to manually convert a line to an area
and label it.
The main problem is rendering the SVG into the program's internal buffer, but
that problem is well solved in other applications.
I'm seriously thinking about doing something like this as my engineering
thesis, but I recall reading about some tool developed by NASA, Microsoft,
or some other entity, which was meant to help blind people "see"
scientific diagrams, but I cannot find it.
Maybe it was the Earth+ and MathTrax programs [4], which I just found.
Has anybody used them and can share some thoughts?
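
To illustrate the Mode 3 point that SVG objects can be identified without any
image analysis, here is a minimal sketch using Python's standard
xml.etree.ElementTree module. The sample SVG is made up for the example (it is
not the Wikipedia flip-flop file), and describe_svg is a hypothetical name.
Only line, path and text elements are handled.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def describe_svg(svg_text):
    """Return plain-text descriptions of the lines, paths and text in an SVG."""
    root = ET.fromstring(svg_text)
    descriptions = []
    for element in root.iter():
        tag = element.tag.replace(SVG_NS, "")
        if tag == "line":
            descriptions.append("line from ({}, {}) to ({}, {})".format(
                element.get("x1", "0"), element.get("y1", "0"),
                element.get("x2", "0"), element.get("y2", "0")))
        elif tag == "path":
            descriptions.append("path: " + element.get("d", ""))
        elif tag == "text":
            descriptions.append("text: " + "".join(element.itertext()).strip())
    return descriptions


# Made-up sample: one wire, one label, one path.
sample = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="125">
  <line x1="10" y1="20" x2="60" y2="20"/>
  <text x="5" y="24">R</text>
  <path d="M 60 10 L 90 10"/>
</svg>"""

for description in describe_svg(sample):
    print(description)
```

Each description could be spoken, shown in braille, or used as a ready-made
bookmark. Turning a closed path into a labelled area would still need the user
interaction described above.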

Oops, I somehow went off-topic, sorry for that, but on the other
hand, the things mentioned are closely related to the main subject, and can be
an inspiration for further add-on development.

[1] https://github.com/nvaccess/audioScreen
[2] https://en.wikipedia.org/wiki/File:SR_Flip-flop_Diagram.svg
[3] http://grmapa.zlotowicz.pl
[4] http://www.nasa.gov/audience/foreducators/helping-blind-see-math-science.html
Greetings, Greg.

----------------------------------------------------------------
NVDA add-ons: A list to discuss add-on code enhancements and for reporting
bugs.

Community addons are available from: http://addons.nvda-project.org
To send a message to the list: nvda-addons@xxxxxxxxxxxxx
To change your list settings/unsubscribe: //www.freelists.org/list/nvda-addons
To contact list moderators: nvda-addons-moderators@xxxxxxxxxxxxx
