p. 133-134
Clark and Mayer advise against providing redundant on-screen text while graphics (video) and narration are being presented. They base the recommendation on both research and theory, and they give two reasons before getting into the details: 1) learners reading on-screen text might not attend to the graphics, and 2) learners may try to reconcile the on-screen text with the audio narration and so engage in extraneous processing, which they define as follows (p. 459) [emphasis added]:
Irrelevant mental work during learning that results from ineffective instructional design of the lesson. For example, a graphic appears at the top of a scrolling screen and text explaining the graphic appears at the bottom so that contiguity is violated.
But what if recognizing words and phrases accurately becomes a key component of comprehending a graphic or a video-recorded presentation? And what if the combination of audio narration and on-screen text can be used to support that comprehension?
Some interesting studies in second language learning seem to show benefits of this kind:
Gass, S., Winke, P., Isbell, D. R., & Ahn, J. (2019). How captions help people learn languages: A working-memory, eye-tracking study. Language Learning & Technology, 23(2), 84–104. https://doi.org/10125/44684
Mayer, R. E., Lee, H., & Peebles, A. (2014). Multimedia learning in a second language: A cognitive load perspective. Applied Cognitive Psychology, 28(5), 653–660. https://doi.org/10.1002/acp.3050
Winke, P., Gass, S., & Sydorenko, T. (2010). The effects of captioning videos used for foreign language listening activities. Language Learning & Technology, 14(1), 65–86. http://dx.doi.org/10125/44203