From someone who's in the trenches and working with toolkits like MSAA on Windows and AT-SPI on Linux, both on the client- and server-side, let me just say that it is almost a requirement today that a screen reader be scriptable.

It's not that these toolkits aren't doing their jobs, it's that many applications do not use them correctly or to their fullest potential. Introducing a better accessibility toolkit (e.g. UIA) can certainly improve the richness of information available, but it will not automatically solve the problem of getting that information exposed in a standard format to the screen reader.

Until 1) an explicit standard is formalized stating how information should be exposed by an app through an accessibility toolkit (think: W3C web guidelines but for using MSAA, UIA, etc.) and 2) software developers follow this standard, scripts are necessary to work around variations in toolkit use across programs. You can certainly design a screen reader that assumes both of these conditions are already met, but I'm pretty sure it would not work well today.

All of this says nothing about using scripts to improve the usability of programs, only accessibility.

Pete

Quoting Sina Bahram <sbahram@xxxxxxxxx>:

> Hi Jamal,
>
> That's true, but who is to decide which one of those pieces of information
> is important?
>
> I thought the O in this project stood for Open?
>
> Why not let the user decide that ...
>
> I'm not being abrasive, honestly.
>
> But even after my long phone conversation with Will ... I just really don't
> like the idea of not having a programmatic way of scripting this thing.
>
> Take care,
> Sina
>
> ________________________________
>
> From: ossrp-control-bounce@xxxxxxxxxxxxx
> [mailto:ossrp-control-bounce@xxxxxxxxxxxxx] On Behalf Of Jamal Mazrui
> Sent: Tuesday, May 31, 2005 4:18 PM
> To: ossrp-control@xxxxxxxxxxxxx
> Subject: [ossrp-control] Re: What Is A Screen Reader?
>
> Hi Will,
> I agree that a screen reader will often have to guess probabilistically,
> rather than be able to know deterministically, all the semantic meaning
> intended by visual aspects of layout, including spacing, font choices, etc.
> My hope though is that we develop the heuristic analysis capability as much
> as possible, separating what is functionally significant from visually
> decorative or at least redundant.
>
> What is the purpose of the dialog? What task does it enable the user to
> accomplish? These are more important questions, in my opinion, than what are
> all the visual effects presented to a sighted user?
>
> Jamal
> -----Original Message-----
> From: ossrp-control-bounce@xxxxxxxxxxxxx
> [mailto:ossrp-control-bounce@xxxxxxxxxxxxx] On Behalf Of Will Pearson
> Sent: Sunday, May 29, 2005 2:39 PM
> To: ossrp-control@xxxxxxxxxxxxx
> Subject: [ossrp-control] Re: What Is A Screen Reader?
>
> Hi Jamal,
>
> It's difficult to quantify the semantics conveyed by a particular physical
> encoding scheme in a general sort of way. There's multiple characteristics
> of space, namely position, proximity and size. These can convey semantics,
> or they may not: in one scenario, size may be used to convey relative
> importance, whilst in another scenario it may be used to differentiate
> between two or more groups of items by having the items in each group a
> different size compared to those in another group. This is also true for
> other mechanisms by which semantics can be encoded visually, such as color,
> font, and font attributes such as bold, italic, etc. So, it's very
> difficult to determine what semantics, if any, are encoded using a
> particular technique in a scenario devoid of context.
>
> There's a couple of ways that this can be presented to a user.
> Firstly, you can have some form of intelligent system that will extract the
> semantics from the visual presentation and convey the semantics to the user,
> either in raw form using additional spoken words, or by altering the
> attributes of the spoken text associated with the item for which the
> semantics are being conveyed. The major drawback to this is that it's very
> hard to create the intelligence to do this in an autonomous manner, and so
> the system would have to be taught the relationships between encoding
> techniques and the semantics they convey in the various different contexts.
>
> The second method of conveying the content is to take the encoded semantics,
> e.g. spatial position, color, etc. and convey this to the user by altering
> its physical presentation. So, instead of conveying spatial relationships
> through parallel presentation of different elements, you could convey the
> spatial positioning of each element, and thus the relationship, through a
> series of spoken co-ordinates, which would still leave space as the encoding
> technique for conveying semantics, but would modify the physical
> representation of that encoding to speech. A similar thing could be done
> for color, where instead of altering the wavelength of the displayed element,
> you could just use speech to say the color of the item, which would still
> leave color conveying the semantics. Alternatively, you could have parallel
> auditory displays that use different frequencies/wavelengths to present the
> information.
>
> That's just some of the ways in which it could be done, and they're by no
> means designs. One point that I think needs to be borne in mind when
> thinking about this sort of thing is the limitations of speech.
> Firstly, it's serial in nature, and so the more you produce in terms of
> speech the more time it takes someone to receive that semantic content, and
> secondly, people have a short term memory limit of between five and nine
> chunks of information, at least according to George Miller's 'Magical number
> seven, plus or minus two' theory. Speech being serial in nature doesn't
> allow a user to easily and quickly jump back to a position to review the
> content at that position, so people tend to have to remember the content as
> they go, and this is stored in their limited short term memory.
>
> I think it's something that needs some careful consideration to come up with
> the optimal design.
>
> Will
>
> ----- Original Message -----
> From: Jamal Mazrui <mailto:Jamal.Mazrui@xxxxxxx>
> To: ossrp-control@xxxxxxxxxxxxx
> Sent: Thursday, May 05, 2005 9:22 PM
> Subject: [ossrp-control] Re: What Is A Screen Reader?
>
> Good point. That is a different, more complex example, but
> certainly such scenarios are also common. I think the scenario that I
> described often occurs with order forms on the web, typically asking for
> contact and credit card information in a familiar pattern.
>
> In evaluating this issue, at least two questions seem relevant: (1)
> what does spatial information convey about the function of the dialog? and
> (2) to the extent that functional rather than aesthetic information is being
> conveyed, what is the best way to achieve an equivalent result nonvisually?
>
> If there are optional subgroups of fields, then tabbing through all
> of them is, indeed, inefficient. To achieve productive data entry, let us
> separate function from presentation. The blind person probably does not
> need to know, for example, how many pixels separate controls in order to
> judge which ones are part of the same subgroup and which are part of
> another. The fact that the border of group boxes uses a 3D rather than
> simple style is inconsequential.
> The objective is to enable the blind person to identify and navigate to the
> different subgroups. For the screen reader user, a multi-page tab dialog
> might be the most efficient solution rather than a single page dialog where
> subgroups are indicated by spatial proximity.
>
> Jamal
>
> -----Original Message-----
> From: ossrp-control-bounce@xxxxxxxxxxxxx
> [mailto:ossrp-control-bounce@xxxxxxxxxxxxx] On Behalf Of David Lant
> Sent: Thursday, May 05, 2005 3:46 PM
> To: ossrp-control@xxxxxxxxxxxxx
> Subject: [ossrp-control] Re: What Is A Screen Reader?
>
> Hi Jamal,
>
> Broadly I understand your point. However, there are situations
> where simply going sequentially through the items in a dialog is not the
> process required for daily use of a facility. If, for example, your job is
> processing pay claims, and allocating charge codes to the relevant portions
> of hourly rates, overtime rates, expenses and so on, it would, and does,
> become extremely tedious having to tab through all the fields that may have
> to be displayed to inform you what needs to be done.
>
> It may be much quicker, if the pieces of information are all grouped
> in one control group, and the fields you need to fill in are in an adjacent
> one. There may very likely be other data on the screen at the same time,
> which don't relate directly to the job in hand. Sighted people visually
> skip over that stuff, such as the box at the top giving the identification
> summary of the person and their pay reference etc. They see that the boxes
> they need to work with are all in two rectangles on the right of the screen,
> one above the other, and so visually concentrate on those. They will glance
> through the information in the first box, to identify the hours being
> claimed, and will then click in the second box to place an insertion pointer
> so they can type in the relevant charge codes.
>
> For a blind person to do this, they would need to have a quick way
> to rapidly get to the information in the upper right box, and read it.
> Then, they would need to move equally rapidly to the lower right box, in
> order to start filling in the information. It is true that the fact that
> these boxes are on the right of the screen may be of no significance
> whatsoever as far as both the blind and sighted person are concerned. But
> the significance is that they separate out the information that has to be
> dealt with, so that the details on the left of the screen can be largely
> ignored unless something special turns up.
>
> This, I think, is the kind of scenario that Will is talking about.
> Not just the fact that address fields are grouped together, but that you may
> need to perform specific, and isolated tasks on that group, separate from
> the rest of the data on the screen.
>
> All the best,
>
> David
>
> -----Original Message-----
> From: ossrp-control-bounce@xxxxxxxxxxxxx
> [mailto:ossrp-control-bounce@xxxxxxxxxxxxx] On Behalf Of Jamal Mazrui
> Sent: 05 May 2005 08:21
> To: ossrp-control@xxxxxxxxxxxxx
> Subject: [ossrp-control] Re: What Is A Screen Reader?
>
> Hi Will,
> I agree that spatial relationships can and do convey a lot
> of information to sighted users. I was not arguing that visual placement is
> generally irrelevant, but maintain that it can be so for blind users where
> it does not affect the interface we experience and the functionality of the
> task at hand.
>
> For example, if the purpose of a dialog is to retrieve
> typical contact information (name, address, phone, etc.) through a
> well-understood set of fields, then it may be irrelevant to the blind user
> where the controls are placed, as long as they speak properly. The layout
> of the dialog is not an end in itself, but a means to an end, that of
> gathering the data for a contact record.
> The database does not care, and does not track, where the controls were
> placed in the input dialog that gathered the data for the record that was
> saved.
>
> To elaborate, I might press tab successively hearing fields
> like "First name", "last name," etc., filling each one in, including
> reviewing the data in each edit box. If the tab order is logical and the
> field name and current value speak as expected, then it does not matter to
> me how the fields are aligned, what fonts are used for field names, what
> point size the entered characters are, etc.
>
> Sighted users, on the other hand, are affected by such
> characteristics. Even if the tab order is the same, logical sequence, they
> will be confused if the "City" field is placed above the "First Name" field.
> If a few fields are cramped together in one corner of the dialog in an
> unpleasing manner aesthetically, their productivity will be reduced because
> of the disorientation they experience, etc.
>
> Jamal
>
> -----Original Message-----
> From: ossrp-control-bounce@xxxxxxxxxxxxx
> [mailto:ossrp-control-bounce@xxxxxxxxxxxxx] On Behalf Of Will Pearson
> Sent: Wednesday, May 04, 2005 4:25 PM
> To: ossrp-control@xxxxxxxxxxxxx
> Subject: [ossrp-control] Re: What Is A Screen Reader?
>
> Hi Jamal,
>
> "As a blind user, placement
> can actually be irrelevant, having no effect on
> functionality."
>
> Based on psychology, semiotics and communications theory, I
> would have to disagree with that statement. A control's relationship to
> other controls and its absolute positioning can be sources of semantic
> information about that control's functionality. For example, buttons
> grouped together may have similar functionality, buttons placed next to a
> list box may perform an action on that list box or its selected index. On
> the web, a row of links placed in vertical alignment at the top of a page
> are often used as a quick navigational group of links.
>
> So,
> spatial relationships and absolute positioning can add a
> lot of meaning regarding functionality beyond that conveyed by a simple text
> label. Users can, and often do, work out the full semantic nature of a
> control, but this is often through trying out the control and seeing what it
> does, which is inefficient at best, and possibly disastrous at worst:
> imagine deleting something that you didn't actually mean to delete.
>
> Will
>
> ----- Original Message -----
> From: Jamal Mazrui <mailto:Jamal.Mazrui@xxxxxxx>
> To: ossrp-control@xxxxxxxxxxxxx
> Sent: Monday, May 02, 2005 2:41 PM
> Subject: [ossrp-control] Re: What Is A Screen Reader?
>
> Just an observation to share.
>
> In trying to program dialog boxes under Windows, I
> have experienced the situation where something I developed worked well with
> a screen reader, yet I subsequently discovered that it was almost unusable
> for a sighted person. A screen reader can tab from one control to another,
> and as long as each control is properly labeled and otherwise voicing as one
> would expect at the time it has focus, then the controls in the dialog serve
> their purpose. It may be the case, however, that the controls are placed in
> visually peculiar, unbalanced, or overlapping places on the screen, thus
> making the dialog difficult for a sighted user.
>
> As a blind developer, I need to know the location of
> controls so that I can meet the needs of both sighted and blind users. As a
> blind user, placement can actually be irrelevant, having no effect on
> functionality.
>
> Regards,
> Jamal
> -----Original Message-----
> From: ossrp-control-bounce@xxxxxxxxxxxxx
> [mailto:ossrp-control-bounce@xxxxxxxxxxxxx] On Behalf Of Lyn Eagers
> Sent: Saturday, April 30, 2005 11:19 PM
> To: ossrp-control@xxxxxxxxxxxxx
> Subject: [ossrp-control] Re: What Is A Screen Reader?
>
> Hi Will and Others,
>
> Will, I found your description of what a screen
> reader is quite interesting.
>
> I train people to use screen readers and, from my
> experience, some blind folk are interested in where things are on the screen
> (spatial perception) and others are not. In particular, those who have had
> sight and were extremely visual people find it important to know where
> things are. Some, and I say some, so therefore not all, long term blind
> people don't seem to be interested in the spatial factor.
>
> I am a long term blind person and have always tried
> to grasp a mental picture of what is on the screen and where - probably
> because I teach both kinds of blind people and sometimes assist sighted
> folk.
>
> Anyhow, I thought I'd share my experiences with you.
>
> Cheers,
> Lyn
>
> ----- Original Message -----
> From: Will Pearson <mailto:will-pearson@xxxxxxxxxxxxx>
> To: ossrp-control@xxxxxxxxxxxxx ; uvip@xxxxxxxxxxxxxxx
> Sent: Sunday, May 01, 2005 11:58 AM
> Subject: [ossrp-control] What Is A Screen Reader?
>
> Hi,
>
> I thought I'd share my, rather academic, view of
> what a screen reader is. It offers a little glimmer into what screen
> readers could potentially do, and some of the pitfalls that the current crop
> of screen readers have fallen into. All this is from the viewpoint of human
> computer interaction, psychology and communications theory.
>
> OK. So, what is a screen reader? Well, it's
> actually a lot more than people often assume it is. It's not just something
> that grabs the text from the screen and reads it to you, well, at least it
> shouldn't be, it is in fact the interface by which user and machine
> communicate semantic meaning, relating to thoughts, concepts, actions and
> states.
>
> So, how did I arrive at this view? As some of you
> may know, I've been researching into semantics and their role in software
> interfaces for a while now.
> During this time, it's become apparent that
> software interfaces are just intended to communicate semantic meaning, but
> as we're not capable of extrasensory perception and telepathy with the
> computer, we need some way to indicate our thoughts, concepts, actions, etc.
> to the computer, and vice versa. The way this is visually done is by
> placing elements on the screen, such as icons, buttons, etc. and having
> their shape, colour, position on screen and relationships to one another act
> as encoding channels by which the semantic meaning is conveyed. Users then
> just point to an object, conveying the semantics of which element they would
> like to interact with, and either click it or select an action to perform on
> it from a menu. All this is just a form of physical encoding of the
> semantic meaning between user and machine and vice versa.
>
> So, as a screen reader is a replacement for the
> visual interface, its role is simply to act as an interface between user
> and machine and convey the semantic meaning generated by the machine.
> However, there's a nasty twist, and that is that a screen reader has to get
> the semantic meaning that it is to communicate to the user from somewhere.
> As the screen reader has no access to the internals of the machine, its
> only available source of semantics that the machine wishes to convey is the
> visual interface, which uses encoding techniques such as colour, shape,
> position and spatial relationship to convey its semantics. So, a screen
> reader should really be about extracting the semantics from the visual
> display and encoding them in a non-visual form suitable for a blind user,
> and this is where current screen readers fall down. To maintain accurate
> and efficient communication with the user, all the semantics that are
> conveyed visually need to be conveyed to the user.
> This includes things
> like spatial positioning and spatial relationships between interface
> elements, things that are currently lost to the user when they are using one
> of the current screen readers. If this were to happen, then the number of
> errors, and corresponding back-tracks and reissuing of commands that go along
> with errors, would decrease, and screen reader users would be more efficient
> beasts.
>
> I haven't gone into design specifics, as they're for
> another day, and these can dramatically affect efficiency as well, but
> that's my thoughts of what a screen reader should be doing. By focusing on
> the semantics, it's likely that, through the use of semantic translation,
> access to all those difficult accessibility problems could be increased.
>
> Will
>
> To post to the list, send a message to:
> ossrp-control@xxxxxxxxxxxxx
> To unsubscribe, send a message to:
> ossrp-control-request@xxxxxxxxxxxxx
> and set the subject field of the message to "unsubscribe" (without the quotes