Speech Recognition for Learning

Speech recognition, also referred to as speech-to-text or voice recognition, is technology that recognizes speech, allowing voice to serve as the “main interface between the human and the computer”1. This Info Brief discusses how current speech recognition technology facilitates student learning, as well as how the technology can develop to advance learning in the future.

Although speech recognition has a potential benefit for students with physical disabilities and severe learning disabilities, the technology has been inconsistently implemented in the classroom over the years. As the technology continues to improve, however, many of the issues are being addressed. If you haven’t used speech recognition with your students lately, it may be time to take another look. Both Microsoft and Apple have built speech recognition capabilities into their operating systems, so you can easily try out these features with your students to find out whether speech recognition might be right for them.

Speech recognition vs. speech-to-text: what's the difference?

When researching speech recognition tools for your child or your classroom, you may see technologies referred to as “speech-to-text,” “voice recognition,” or “speech recognition,” sometimes all within the same product description. Though the terms can be confusing, they all refer to technologies that can translate spoken language into digitized text or turn spoken commands into actions (e.g., “open Microsoft Word”). Voice recognition can refer to products that need to be trained to recognize a specific voice (such as Dragon NaturallySpeaking), or to products used in applications like automated call centers that can recognize a limited vocabulary from any user. Quite frequently, as in this article, the terms speech recognition and voice recognition are used interchangeably.

Speech recognition technology in everyday life

Speech recognition and speech-to-text programs have a number of applications for users with and without disabilities. Speech-to-text has been used to help struggling writers boost their writing production2 and to provide alternate access to a computer for individuals with physical impairments3. Other applications include speech recognition for foreign language learning4, voice-activated products for the blind5, and many familiar mainstream technologies.

New developments in the technology have driven innovation in many familiar customer service applications. We have all used voice recognition technologies in our daily lives, many times without even thinking about it: automated phone menus and directories, voice-activated dialing on our cell phones, and integrated voice commands on smartphones are just a few examples6. Medical and legal professionals use voice recognition every day to dictate notes and transcribe important information. Newer uses of the technology include military applications, navigation systems, automotive speech recognition (Ford SYNC), “smart” homes designed with voice command devices, and video games such as EndWar, which allows players to give orders to their troops using only their voice.

Benefits of speech recognition for struggling writers

Populations that may benefit from speech recognition technologies for learning include users with:

  • Learning disabilities, including dyslexia and dysgraphia
  • Repetitive strain injuries, such as carpal tunnel syndrome
  • Poor or limited motor skills
  • Vision impairments
  • Physical disabilities
  • Limited English language proficiency7

Benefits for students with disabilities may include improved access to the computer, increases in writing production, improvements in writing mechanics, increased independence, decreased anxiety around writing, and improvements in core reading and writing abilities.

Improved access

For students with motor skill limitations, physical disabilities, blindness/low vision, or other difficulties accessing a standard keyboard and mouse, hands-free computing through the use of speech recognition technologies may be beneficial. By removing the physical barriers to writing and navigation of the computer, you can increase student access to technology and classroom activities.

Writing production

For students with learning disabilities, speech recognition technology can encourage writing that is more thoughtful and deliberate8. Studies with middle and high school students with learning disabilities have shown that input via speech is less challenging, and that students frequently produce papers that are longer and of better quality when using speech recognition technologies9.

Mechanics of writing

Speech recognition technologies, in conjunction with the features of word processors, can help reduce some of the difficulties that students may face with writing mechanics. Because students can often write more quickly with speech recognition tools, these tools can remove potential obstacles such as difficulty with handwriting or the need to transcribe thoughts while brainstorming. Often, writers with learning disabilities will skip over words when they are unsure of the correct spelling, leading to pieces of writing that are short, missing key elements, or not reflective of the student’s true abilities10. Speech recognition and word processors can potentially alleviate some of these concerns by allowing students to get their thoughts out on paper without worrying about spelling or other technical components of writing11.

Increased independence

For students with physical disabilities, poor motor skills or learning disabilities, a human transcriber is a low-tech solution for the classroom that allows the focus to shift from the physical act of writing to expressing thoughts and knowledge. However, a transcriber makes the student dependent upon a teacher or aide for writing tasks. Students who use transcribers for writing often report “spending less time planning and organizing because they felt they were keeping the transcriber waiting, or felt embarrassment about making mistakes or asking for multiple readings of what was written12.” Using speech-to-text tools can allow the student to be more independent in their writing and other academic activities. If the speech-to-text program also includes text-to-speech features, the student may hear their text read aloud to them multiple times, and correct their errors more independently.

Decreased anxiety

In addition to allowing the student to work in a more independent manner, speech recognition can allow students to write without fear of spelling errors, helping them avoid the anxieties associated with mechanics, organization, and editing13; many struggling writers feel embarrassment about “the appearance of their writing due to brevity of sentence or paragraph length, illegibility of handwriting, and/or misspelled words14.”

For students who are English Language Learners, or are learning a second language, speech recognition programs can allow them to practice pronunciation in a safe, low-stress environment. Students can engage in multiple repetitions of an unfamiliar word without worrying about feeling embarrassed15. Some popular foreign language software programs now include speech recognition features for just this purpose.

Improvements in core reading and writing abilities

Research has shown that speech recognition tools can also serve a remedial function for students with learning disabilities in the areas of reading and writing. By letting students see the words on screen as they dictate, these tools can give them insight into important elements of phonemic awareness, such as sound-symbol correspondence. As students speak and see their words appear on the screen, the speech-to-text tool directly demonstrates the relationship between how a word looks and how it sounds16. This bimodal presentation of text can be especially helpful for students with learning disabilities, and is thought to be why speech recognition has been found effective in remediating reading and spelling deficits.

Another key benefit of speech recognition technologies is the error correction process. Because no speech recognition product is completely accurate, “it requires users to check the accuracy of each word uttered as sentences are being dictated. When an error is made, the child must then find the correct word among a list of similar words and choose it”17. This process necessitates that the user examine the word list closely, compare words that look or sound alike, and make decisions about the best word for the specific situation. This can give kids with LD a boost in reading and spelling as they learn to discriminate between similar words18.

Challenges

Despite advances over the past 20 years, speech recognition technology as it is today still presents challenges for students with disabilities. As with any new technology tool, students must initially become comfortable with using speech-to-text, including training it to recognize their voices, gaining experience with a new way of writing, understanding the differences between writing and speaking, and correcting errors within the text. For students with learning disabilities, struggling readers and writers, or very young students, this may induce additional frustrations with the writing process. Though the software has improved, speech-to-text programs are not always capable of recognizing the voices of young children, so students must adjust to speaking more slowly so that the technology can more accurately transcribe their thoughts19.

Because speaking to write is an activity that requires different skills than speaking in conversation, students must be aware of the differences between the two. This may be challenging for early writers who have not yet made that distinction. Using speech recognition technology may make it more difficult for younger students to begin differentiating between writing and speaking. Thus, it is critical that use of speech recognition technology be paired with instruction on writing strategies, brainstorming, drafting and organization20.

Another key element involved in using speech recognition programs is the need for error correction and monitoring of misrecognized words. Newer programs never make a spelling mistake, and they improve when users correct misrecognized words. Because every word is spelled correctly, however, errors can be hard to spot, so students must be alert for mistakes that go unrecognized by the program (e.g., incorrect word choices, or words misunderstood by the software). While this process can be taxing for struggling readers, a program that is also capable of reading text back to the user can help them with editing and revising.

Another implementation challenge is that the software requires a good deal of memory, and each user’s voice files must be saved in a single folder. These voice files improve in accuracy with use, so it is important that students work in their own saved file. As a result, this assistive technology is not always portable. Schools have overcome this challenge by assigning students laptops with the software installed or by storing files on a networked server that can be accessed from anywhere on campus.

As with any assistive technology solution, finding funding for speech recognition solutions may also present a challenge for schools. The first step in obtaining any assistive technology for your students is to conduct a thorough assessment to determine what would best meet the student’s learning needs. It may be that because of the various implementation challenges listed above, speech recognition software would not be the best fit for certain students. Once a potentially beneficial solution is agreed upon, there are many options for schools looking for funding for AT, from grant programs, to used AT marketplaces, to loan programs from vendors and assistive technology centers.

Improving student success with speech recognition

Speech recognition isn’t perfect and may not be the best choice for all students with disabilities, but it does have some significant benefits for certain students that make it worth the time investment. If speech recognition tools are right for your student, here are several tips for improving student success:

  • Be sure that your computer has a good quality microphone and sound card, and meets the minimum memory capacity and processing speed requirements listed for the software you purchase. Many speech recognition companies will recommend specific microphones that have been shown to work well with their software.
  • For students with LD using speech recognition, explicit instruction in reading skills, phonological awareness, writing strategies and organizational strategies may be helpful.
  • For students who struggle with reading, picking out software that includes a read-back or text-to-speech feature can help with error correction and editing.

The future of speech recognition

More research is still needed on the efficacy of speech recognition for children with LD and other types of disabilities. However, the technology is continuing to move forward and address many of the problems encountered before. For example, many newer versions of speech recognition software now include voice profiles for children, meaning that they are becoming more accurate at distinguishing words spoken by younger users.

As industries begin to use some elements of voice recognition technology in their day-to-day work (military, medical, legal)21, it makes sense for students to gain some familiarity with speech recognition. Some technologies initially designed for users with disabilities have made the transition into mainstream technology, becoming something that we all come to rely on in our daily lives22. Because of this, technology industry leaders are beginning to believe that all students should receive a technology education that reflects the future of human-computer interactions, which they predict will be primarily through voice and touch.

A “Tech Works” brief from the National Center for Technology Innovation (NCTI). August 2010.