Nuance’s Document Imaging Division is now part of Kofax.
Approximately 6 percent of Swedes have dyslexia, including Swedish software developer Tor Ghai. For most of his life, he struggled with reading comprehension as he battled his way through required textbooks. As an adult, he searched for a way to make reading easier, and even enjoyable, so that he could stay on top of the latest in software development and maybe even read for fun. After developing a program that read text aloud to sight-impaired people and becoming familiar with text-to-speech engines, Tor began to develop a version for dyslexics like himself.
Accuracy is paramount for reading
However, finding an OCR engine to work with images of text was harder than it sounded, especially when it came to pairing the OCR engine with a voice engine that would speak the text to the user. While plenty of solutions exist for both, most of the vocalizer solutions mangled approximately 20 percent of the words on a page: not terrible if you’re only reading one page, but cumbersome if you’re trying to read along with the audio in a multi-page document or a book. A single misspoken word can derail the user’s train of thought, and sometimes that missing 20 percent of information can mean the difference between understanding what you just read and being utterly confused. Additionally, most OCR products require high-quality images, which aren’t always available in practice. Building an OCR engine from scratch was out of the question because it would have taken too long.
Of all the products tested, Nuance OmniPage Capture SDK and Nuance Vocalizer had the highest rates of accuracy for recognizing, processing, and speaking text. Combining the two would make it possible for TorTalk users to get the most accurate readings. While the product will sometimes miss a word, the flow is smoother for those reading along with the text.
Additionally, the Nuance OmniPage Capture SDK provides real-time OCR capabilities. In one to two seconds, TorTalk activates the SDK to process the text, then reads it to the user. The user simply clicks a button, and the program starts reading the text, whether it’s a single page or an e-book.
Nuance support eases development
Tor wanted to ensure TorTalk would work exactly as planned. He had a lot of questions for Nuance support, probably more than the average developer, he admitted. However, Nuance, and our London representative in particular, was very responsive and provided him with the information he needed. Tor attended the OmniPage Capture SDK Developer’s Forum in Palma de Mallorca this past May, where he learned more about the SDK and how to develop with it, networked with other developers, and shared ideas.
Capturing the university market
With Nuance OmniPage Capture SDK as its OCR backbone and Nuance Vocalizer as its text-to-speech engine, TorTalk quickly attracted interest from Swedish universities that wanted it for their students. Tor had originally developed TorTalk for Windows operating systems, but with universities clamoring at his door for a Mac version, he found that the Nuance OmniPage Capture SDK let him move most of the original code over to a Mac solution.
Today, 75 percent of Swedish universities, including Goteborgs Universitet and Uppsala Universitet, use TorTalk for their students. In the most common case, a student opens a PDF or e-book on screen, then places the TorTalk window around the portion to be read. If it’s electronic text, the student highlights the text and presses play. The text-to-speech engine then reads the text to the student. The biggest value lies in the accuracy and speed, which help students regain confidence in their ability to learn, something Tor himself felt as he used TorTalk.