AT UI (22nd August 2008)

Aaron Leventhal and I were in Mozilla’s #accessibility channel this morning. We got to talking about how screen readers expose their features to end users. Since their users normally cannot use a pointing device, their user interface is based on keyboard shortcuts.


There are typically 50+ commands in modern screen readers, several modes and other complexities. Consider a GUI with 50+ controls and all these complexities. I suggest this is too much.

Aaron has seen these features aid power users. He agrees it can be too much for new users. Horses for courses; there’s space for both models to compete.


Aaron described how having more commands reduces keystrokes. He gave specific examples of where modes in the screen reader disambiguate commands. Let’s say T is used as a shortcut to move to the next textbox. When focus is in a text input, should T move to the next textbox or type the letter T? By letting users switch in and out of a forms mode, what T does is always predictable.
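The moded model can be sketched as a tiny key dispatcher with a mode flag. This is purely illustrative: the toggle shortcut and function names are invented here, not taken from any real screen reader.

```javascript
// Hypothetical sketch of moded key dispatch, as in the T example above.
// The toggle shortcut "Insert+Z" is invented for illustration; real screen
// readers use their own bindings.
function makeModedDispatcher() {
  let formsMode = false; // browse mode by default
  return function dispatch(key) {
    if (key === "Insert+Z") {
      formsMode = !formsMode;
      return formsMode ? "enter forms mode" : "leave forms mode";
    }
    if (key === "T") {
      // The same keystroke means two different things; only the
      // current mode disambiguates it.
      return formsMode ? "type the letter T" : "move to next textbox";
    }
    return "unhandled";
  };
}

const dispatch = makeModedDispatcher();
dispatch("T");        // moves to the next textbox
dispatch("Insert+Z"); // switch to forms mode
dispatch("T");        // now types the letter T
```

The cost we were debating is visible in that flag: it is the user, not the code, who has to remember its current value.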

You must know the shortcut to switch modes, have been trained on how each mode differs and have months or years of experience to understand when each mode is of practical use. I suggest this is a false economy.


I suggest the model sighted keyboard users already have would work for screen reader users too. So you’d use Tab and Shift+Tab to move between form controls, rather than having T and Shift+T for moving specifically between textboxes.

Aaron asks how I’d feel if I could only navigate with Tab and Shift+Tab. “My keyboard would get worn out,” I quip. “I wasn’t suggesting they go from 50+ commands to just 2.”

We go back and forth about more keystrokes versus more commands for a while.

Smart Minds

Screen readers have been around for a long time. The people who design their interfaces don’t do it in a vacuum. They have years or even decades of experience, observing user tests, following mailing lists and taking direct feedback from users over their careers. Aaron touches on this several times.

My objection is not to the overall design of screen readers. I am not questioning the expertise of the people who make them, either.

Winning Designs

The read-through toggle (aka pass-through mode) seems eminently sensible to me. A simple one-touch shortcut, such as Ctrl, starts reading. Pressing it again pauses reading. Reading stops upon reaching the end of that window. This works in all modes, all contexts.

Convenient and dependable features like this are what I think enrich an interface.

A sighted keyboard user need not use a shortcut to change the input mode of their keyboard before they can put focus into a form control and interact with it. The application, the operating system and the specific control work together to make the interaction seamless.

Once in the control, a few keys change to operate within it. For example, End now moves the input caret to the end of a textbox rather than scrolling to the bottom of a viewport. When interacting with forms, Enter submits the form. When focus is on a link, Enter activates the link. In a <textarea>, it types a new line.

All this Just Works; the interface adapts seamlessly and intuitively to the user’s context.


Modeless Interaction

Working with tables, lists and forms seamlessly should be the goal for screen readers, just as sighted keyboard access disambiguates commands intuitively based on the user’s context. With enough imagination and sophistication, surely that’s possible?
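The modeless alternative can be sketched the same way: here the focused element, rather than a remembered mode, decides each key’s meaning. The context names are illustrative, not any real AT’s API.

```javascript
// Hypothetical sketch of modeless dispatch: the element with focus, not a
// user-selected mode, decides what a keystroke means.
function modelessDispatch(key, focusContext) {
  if (key === "Enter") {
    switch (focusContext) {
      case "link":     return "activate the link";
      case "textarea": return "type a new line";
      case "form":     return "submit the form";
    }
  }
  if (key === "End") {
    return focusContext === "textbox"
      ? "move caret to end of textbox"
      : "scroll to bottom of viewport";
  }
  return "unhandled";
}
```

There is no state for the user to track; the cost moves into the dispatcher, which must know every context well enough to pick the right meaning.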

List of Things

There are less dramatic ways in which I think usability could be improved. I’ve seen ATs where several windows are available, each listing a different type of interesting structure. Each has its own shortcut and may work radically differently from the others.

I would provide just one window. A List of Things window, if you will. This immediately reduces the number of shortcuts which must be crammed into the ever-narrowing palette of combinations available to ATs.

The window would provide one tab for each type of list. Each list would automatically select the item at or just before where the user’s focus was. Each list would work as similarly as possible. There would be a “find as you type” box for each one, so users could search within each list. Preferences for searching, sorting and filtering would be remembered.


An often overlooked but frequent problem is features which don’t work quite perfectly in every possible scenario. Noticing when the off-screen buffer has got out of synchronisation with the screen and taking steps to compensate is something I might expect a tactical officer to notice during a battle on Star Trek. It’s not something a casual user should need to understand.

ATs are often hampered in this respect by other people’s mistakes. We’re all human but if an application is doing something wrong, an AT gets put in a difficult situation. Nobody wins and users lose out.

Imagine a concerted effort by the ICT industry to fix accessibility bugs wherever they are found. Imagine the productivity and happiness of users for whom everything works properly all of the time.

Of course, this is a monumental and ongoing task which has been underway in one form or another for decades. But there are a finite number of variables, each with a finite range. The task is not quite endless, afaict.

Old News

No doubt, Aaron and AT vendors have heard this all before. It’s probably been tried before, too.

Aaron points out that the design each product uses has evolved through feedback from its users. They continue to make changes but for the most part, what they’ve got is what works best. Hanging around in #accessibility, I see first-hand that identifying and fixing accessibility bugs is happening all the time.

I appreciate it’s a massive area. I appreciate there are many balance points. But my feeling is there remains room for great improvements. Things I can’t even imagine.

26th August 2008

Another aspect is what features an author makes available from their documents and in what way they do this. For example, a discussion about linking <video> to its transcripts revealed different ways (some hypothetical, for now) of doing this:

My preference would be an <a href> immediately after the <video>. Another possibility would be having the transcript on the same page the <video> is on.
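As a sketch, my preferred option might be marked up like this (the filenames are invented for illustration):

```html
<video src="talk.ogv" controls>
  Sorry, your browser does not support the video element.
</video>
<a href="talk-transcript.html">Read the transcript of this talk</a>
```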

Each technique would have a different user experience. Simple and reliable authoring is the key, imho. That’s the only way a feature can work for anyone. That’s the starting point to make it work for everyone.