The team and timeline
This research took approximately 6 weeks in total, including presenting results and writing a research report.
Context
Microsoft Speech Service is part of Azure Cognitive Services (now Azure AI services), which offer AI solutions to developers without requiring machine learning experience.
Speech Service documentation is dense with the information developers need to deploy the service. It has to be functional, easy to navigate, and able to lead users to the right information.
Goals
Set the benchmark
Make it easier for developers to find the information they need when using the Speech documentation
Method - Iterative tree testing
Tree tests evaluate a hierarchical category structure. This method allowed us to evaluate how easy it is to navigate the table of contents for given tasks, and to measure the impact of the improvements made before release.
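To make the mechanics concrete, the sketch below (in Python, with hypothetical labels standing in for the real Speech table of contents) shows how a tree test scores a participant: they click down a text-only version of the hierarchy, and the task succeeds if their final click lands on a node designated as correct.

```python
# Minimal tree-test scoring sketch. The tree and labels are hypothetical
# stand-ins, not the actual Speech Service table of contents.
TOC = {
    "Overview": {"What is the Speech service?": {}, "Pricing": {}},
    "How-to guides": {"Speech to text": {}, "Text to speech": {}},
    "Reference": {"Regions": {}, "Speech SDK": {}},
}

def is_success(clicked_path: list[str], correct_nodes: set[str]) -> bool:
    """A task succeeds if the participant's final click lands on one of
    the nodes designated as correct for that task."""
    return clicked_path[-1] in correct_nodes

# One participant attempting a hypothetical "find pricing" task:
print(is_success(["Overview", "Pricing"], {"Pricing"}))      # True
print(is_success(["Reference", "Speech SDK"], {"Pricing"}))  # False
```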
Research plan
Assess the table of contents and set the benchmark
Discover insights
Iterate and test again to examine the impact of the improvements we made
Outcomes
Moved pricing, which was very difficult to find, reducing time on task by 63%
Removed redundant labels and standardized wording for region support, reducing time on task by 65%
Significantly improved task success on 8 out of 12 tasks
Identified potential content changes
Setting the benchmark
The Speech Service table of contents had not been previously assessed, so we first needed to set a benchmark.
To do this, we developed 12 tasks that mimic typical use of the Speech Service. Participants were given a short scenario and asked to indicate where they would expect to find the information needed to complete the task.
Examples of tasks:
Converting speech to text
Recognizing speakers
Choosing programming language
Using voice assistants
Finding a relevant region
For each task we recorded task success (the proportion of users who clicked the relevant navigation label) and time on task, along with confidence intervals for each measure. Users were also asked to indicate which labels were confusing and why. This produced several insights.
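As background on these metrics (the figures below are hypothetical, not our study data): task success is a binomial proportion, so a confidence interval for it can be computed with, for example, the Wilson score method. A minimal sketch in Python:

```python
import math

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a binomial success rate,
    e.g. the share of participants who clicked the correct label."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre - half, centre + half

# Hypothetical task: 18 of 25 participants found the right label.
low, high = wilson_ci(18, 25)
print(f"success rate 72%, 95% CI [{low:.0%}, {high:.0%}]")
```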
Key insights and impact
Navigation could be condensed
Many users commented that the table of contents could be condensed by removing or combining labels. Streamlining the navigation allowed us to improve task success rates on 8 out of 12 tasks and to lower time on task across almost all of the tasks. Users can now find the relevant information more quickly.
Users struggled to find pricing for the services they needed
Pricing was difficult to find, with many users taking a long time to locate it, resulting in frustration. Moving pricing from Resources to Overview resulted in a 63% reduction in time on task, improving the overall user experience.
Inconsistent wording for region support caused confusion
The same region information appeared in the table of contents under two labels, ‘Region support’ and ‘Regions’, which confused users.
Removing the redundant label and standardizing the wording led to a 65% reduction in the time it takes to locate information about regions.
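To illustrate how such a reduction is calculated (with hypothetical numbers, not our study data): if the median time on task fell from 40 seconds before the change to 14 seconds after, the reduction would be (40 − 14) / 40 = 65%.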
These insights and fixes can be applied to other Cognitive Services
Some tasks performed poorly across both studies.
After seeing the same tasks underperform in both rounds of testing, we began brainstorming possible explanations.
We considered two questions:
Are scenarios clear to users?
Since these were unmoderated tests, users may not have fully understood what they needed to do. Are the tasks worded in a way that is easy to understand?
Are category labels doing a good job of leading users to the right information?
Perhaps users understood the task but were not sure where they would find the relevant information.
Next steps - Iterate, test, repeat
There is more work to be done. Although we improved task success on 8 out of 12 tasks, several tasks still underperform.
Reword the poorly performing tasks to see whether poor performance comes down to wording and comprehension
Implement recommendations on rewording category labels to make it easier for users to understand where to go for relevant information
Reflections
Although the method allowed us to explore what needed to be improved across the table of contents and to check whether those improvements had a positive impact on potential users, a few things need to be considered going forward:
Changes to the table of contents can have knock-on effects, not only on labels but also on where content is located, so restructuring will need careful planning
Figuring out why some tasks perform badly is not easy; we may need to switch to moderated tests to explore the reasons for poor performance
And finally, collaboration with the team on this project led to deeper insights. We also noted possible content changes that could help users navigate the documentation.