Title: Learning outcomes with text-to-speech synthesis of native and non-native English varieties
Source document: Brno studies in English. 2024, vol. 50, iss. 2, pp. 7-33
Extent: 7-33
ISSN: 0524-6881 (print), 1805-0867 (online)
Persistent identifier (DOI): https://doi.org/10.5817/BSE2024-2-1
Stable URL (handle): https://hdl.handle.net/11222.digilib/digilib.82125
Type: Article
Language: English
License: CC BY-NC-ND 4.0 International
Rights access: open access
Abstract(s)
English is a common medium of instruction in learning tools that use Text-to-Speech (TTS), yet most solutions incorporate only L1 varieties such as Standard American English (AmE). At the same time, some research suggests that educational content personalized to the learner's own variety is beneficial. We tested this hypothesis with students from Masaryk University in Brno, who listened to a lecture synthesized in Czech English (CzE) and AmE, rated the TTS speaker on the Robotic Social Attributes Scale, and answered questions about the contents of the lecture. The learners improved their knowledge similarly with both TTS varieties. Characteristics from the intelligence cluster were rated higher than anthropomorphism and likability, and the AmE voice was rated more competent than the CzE voice. While the results indicate that the narration may need to be made more engaging, the present study provides first insights into L2 students' perceptions of their own variety and recommendations for customizing credible, learning-facilitating TTS.
Note
We thank the team of the D03 subproject of the Collaborative Research Center "Hybrid Societies" at the Chemnitz University of Technology for their contributions to the development of the TTS system and their input on the learning experiment. This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – SFB 1410, TP D03. This work was also supported by a PhD scholarship awarded to MI by the State of Saxony.
References
[1] Abdulrahman, Amal and Richards, Deborah (2022) Is natural necessary? Human voice versus synthetic voice for intelligent virtual agents. Multimodal Technologies and Interaction 6 (7), 51. https://doi.org/10.3390/mti6070051
[2] Ahn, Jeahyeon and Moore, David (2011) The relationship between students' accent perception and accented voice instructions and its effect on students' achievement in an interactive multimedia environment. Journal of Educational Multimedia and Hypermedia 20 (4), 319–335.
[3] Andrist, Sean, Ziadee, Micheline, Boukaram, Halim, Mutlu, Bilge and Sakr, Majd (2015) Effects of culture on the credibility of robot speech: A comparison between English and Arabic. In Proceedings of the tenth annual ACM/IEEE international conference on human-robot interaction. https://doi.org/10.1145/2696454.2696464
[4] Atkinson, Robert K., Mayer, Richard E. and Merrill, Mary M. (2005) Fostering social agency in multimedia learning: Examining the impact of an animated agent's voice. Contemporary Educational Psychology, 30 (1), 117–139. https://doi.org/10.1016/j.cedpsych.2004.07.001
[5] Bansal, Shivam and Aggarwal, Chaitanya (2022) textstat (Version 0.7.3) [Computer software]. https://pypi.org/project/textstat/
[6] Bartneck, Christoph, Kulić, Dana, Croft, Elizabeth and Zoghbi, Susana (2009) Measurement instruments for the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots. International Journal of Social Robotics 1 (1), 71–81. https://doi.org/10.1007/s12369-008-0001-3
[7] Beege, Maik, Schneider, Sascha, Nebel, Steve, Mittangk, Jessica and Rey, Günter Daniel (2017) Ageism – Age coherence within learning material fosters learning. Computers in Human Behavior, 75, 510–519. https://doi.org/10.1016/j.chb.2017.05.042
[8] Beege, Maik, Schneider, Sascha, Nebel, Steve and Rey, Günter Daniel (2020) Does the effect of enthusiasm in a pedagogical Agent's voice depend on mental load in the Learner's working memory? Computers in Human Behavior, 112, 106483. https://doi.org/10.1016/j.chb.2020.106483
[9] Bent, Tessa and Bradlow, Ann R. (2003) The interlanguage speech intelligibility benefit. Journal of the Acoustical Society of America 114 (3), 1600–1610. https://doi.org/10.1121/1.1603234
[10] Bione, Tiago and Cardoso, Walcir (2020) Synthetic voices in the foreign language context. Language Learning and Technology 24 (1), 169–186. https://doi.org/10125/44715
[11] Boduch-Grabka, Katarzyna and Lev-Ari, Shiri (2021) Exposing individuals to foreign accent increases their trust in what nonnative speakers say. Cognitive Science 45 (11), e13064. https://doi.org/10.1111/cogs.13064
[12] Brabcová, Kateřina and Skarnitzl, Radek (2018) Foreign or native-like? The attitudes of Czech EFL learners towards accents of English and their use as pronunciation models. Studie Z Aplikované Lingvistiky 1, 38–50.
[13] Brom, Cyril, Hannemann, Tereza, Stárková, Tereza, Bromová, Edita and Děchtěrenko, Filip (2017) The role of cultural background in the personalization principle: Five experiments with Czech learners. Computers & Education, 112, 37–68. https://doi.org/10.1016/j.compedu.2017.01.001
[14] Brown, Penelope and Levinson, Stephen C. (1987) Politeness: Some universals in language usage. Cambridge: Cambridge University Press.
[15] Buck, Gary (2007) Assessing listening (7. print). The Cambridge language assessment series. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511732959
[16] Carpinella, Colleen M., Wyman, Alisa B., Perez, Michael A. and Stroessner, Steven J. (2017, March). The robotic social attributes scale (RoSAS) development and validation. In Proceedings of the 2017 ACM/IEEE International Conference on human-robot interaction (pp. 254–262).
[17] Castro-Alonso, Juan C., Wong, Rachel M., Adesope, Olusola O. and Paas, Fred (2021) Effectiveness of multimedia pedagogical agents predicted by diverse theories: A meta-analysis. Educational Psychology Review 33 (3), 989–1015. https://doi.org/10.1007/s10648-020-09587-1
[18] Council of Europe (2001). Common European Framework of Reference for Languages: Learning, teaching, assessment. Cambridge University Press.
[19] Council of Europe (2020) Common European Framework of Reference for Languages: Learning, teaching, assessment – Companion volume. Council of Europe Publishing.
[20] Craig, Scotty D. and Schroeder, Noah L. (2017) Reconsidering the voice effect when learning from a virtual human. Computers & Education, 114, 193–205. https://doi.org/10.1016/j.compedu.2017.07.003
[21] Craig, Scotty D. and Schroeder, Noah L. (2019) Text-to-Speech software and learning: Investigating the relevancy of the voice effect. Journal of Educational Computing Research 57 (6), 1534–1548. https://doi.org/10.1177/0735633118802877
[22] Cristia, Alejandrina, Seidl, Amanda, Vaughn, Charlotte, Schmale, Rachel, Bradlow, Ann and Floccia, Caroline (2012) Linguistic processing of accented speech across the lifespan. Frontiers in Psychology, 3, Article 479, 1–15. https://doi.org/10.3389/fpsyg.2012.00479
[23] Dahlbäck, Nils, Swamy, Seema, Nass, Clifford, Arvidsson, Fredrik and Skågeby, Jörgen (2001) Spoken interaction with computers in a native or non-native language - same or different? In Proceedings of INTERACT. https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=58a34aa5fd752e29e434cbf9e9d72891715a2d8e
[24] Dahlbäck, Nils, Wang, QuanYing, Nass, Clifford and Alwin, Jenny (2007) Similarity is more important than expertise: Accent effects in speech interfaces. In CHI 2007 Proceedings, San Jose, CA, USA. https://doi.org/10.1145/1240624.1240859
[25] Dai, Laduona, Jung, Merel M., Postma, Marie and Louwerse, Max M. (2022) A systematic review of pedagogical agent research: Similarities, differences and unexplored aspects. Computers & Education, 190, Article 104607, 1–28. https://doi.org/10.1016/j.compedu.2022.104607
[26] Dalsgaard, Christian (2005) Pedagogical quality in e-learning. Eleed(1). https://www.eleed.de/archive/1/78
[27] Davis, Robert O., Vincent, Joseph and Park, Taejung (2019) Reconsidering the voice principle with non-native language speakers. Computers & Education, 140, Article 103605, 1–12. https://doi.org/10.1016/j.compedu.2019.103605
[28] Défossez, Alexander, Synnaeve, Gabriel and Adi, Yossi (2022) denoiser (Version 0.1.4) [Computer software]. facebookresearch. https://github.com/facebookresearch/denoiser
[29] Delibegović Džanić, Nihada and Berberović, Sanja (2021) Lemons and watermelons: Visual advertising and conceptual blending. In Larissa D'Angelo, Anna Mauranen and Stefania Maci (Eds.), Metadiscourse in digital communication (pp. 115–132). Springer. https://doi.org/10.1007/978-3-030-85814-8_6
[30] Do, Tiffany D., Akter, Mamtaj, Choudhary, Zubin, Azevedo, Roger and McMahan, Ryan P. (2022) The effects of an embodied Pedagogical Agent's synthetic speech accent on learning outcomes. In International Conference on Multimodal Interaction (pp. 198–206). ACM. https://doi.org/10.1145/3536221.3556587
[31] Dokovova, Marie, Scobbie, James M. and Lickley, Robin (2022) Matched-accent processing: Bulgarian-English bilinguals do not have a processing advantage with Bulgarian-accented English over native English speech. Laboratory Phonology 13 (1), Article 12, 1–40. https://doi.org/10.16995/labphon.6423
[32] Drager, Katie (2010) Sociophonetic variation in speech perception. Language and Linguistics Compass 4 (7), 473–480. https://doi.org/10.1111/j.1749-818X.2010.00210.x
[33] Ehret, Jonathan, Bönsch, Andrea, Aspöck, Lukas, Röhr, Christine T., Baumann, Stefan, Grice, Martine, Fels, Janina and Kuhlen, Torsten W. (2021) Do prosody and embodiment influence the perceived naturalness of conversational agents' speech? ACM Transactions on Applied Perception 18 (4), 1–15. https://doi.org/10.1145/3486580
[34] Fishero, Sheyenne, Sereno, Joan A. and Jongman, Allard (2023) Perception and production of Mandarin-Accented English: The effect of degree of Accentedness on the Interlanguage Speech Intelligibility Benefit for Listeners (ISIB-L) and Talkers (ISIB-T). Journal of Phonetics, 99, Article 101255. https://doi.org/10.1016/j.wocn.2023.101255
[35] Gill, Mary M. (1994) Accent and stereotypes: Their effect on perceptions of teachers and lecture comprehension. Journal of Applied Communication Research, 22, 348–361.
[36] Gosselin, Leah, Martin, Clara D., Martín, Ana González and Caffarra, Sendy (2022) When a nonnative accent lets you spot all the errors: Examining the syntactic interlanguage benefit. Journal of Cognitive Neuroscience 34 (9), 1650–1669. https://doi.org/10.1162/jocn_a_01886
[37] Hanzlíková, Dagmar and Skarnitzl, Radek (2017) Credibility of native and non-native speakers of English revisited: Do non-native listeners feel the same? Research in Language 15 (3), 285–298. https://doi.org/10.1515/rela-2017-0016
[38] Hayes-Harb, Rachel, Smith, Bruce L., Bent, Tessa and Bradlow, Ann R. (2008) The interlanguage speech intelligibility benefit for native speakers of Mandarin: Production and perception of English word-final voicing contrasts. Journal of Phonetics 36 (4), 664–679. https://doi.org/10.1016/j.wocn.2008.04.002
[39] Holliday, Nicole (2023) Siri, you've changed! acoustic properties and racialized judgments of voice assistants. Frontiers in Communication, 8. https://doi.org/10.3389/fcomm.2023.1116955
[40] Ivanova, Marina and Schmied, Josef (2023) From cues to features: Bridging psycho- and sociolinguistics in the development of non-native English stimuli. TESOL Communications 2 (2), 1–17. https://doi.org/10.58304/tc.20230201
[41] Jimenez-Molina, Angel, Retamal, Cristian and Lira, Hernan (2018) Using psychophysiological sensors to assess mental workload during web browsing. Sensors 18 (2). https://doi.org/10.3390/s18020458
[42] Karakaş, Ali (2017) English voices in 'Text-to-speech tools': Representation of English users and their varieties from a World Englishes perspective. Advances in Language and Literary Studies 8 (5), 108. https://doi.org/10.7575/aiac.alls.v.8n.5p.108
[43] Kartal, Günizi (2010) Does language matter in multimedia learning? Personalization principle revisited. Journal of Educational Psychology 102 (3), 615–624. https://doi.org/10.1037/a0019345
[44] Kiczkowiak, Marek (2018) Native Speakerism in English Language Teaching: Voices from Poland. PhD thesis, University of York. https://etheses.whiterose.ac.uk/id/eprint/20985/
[45] Krenn, Brigitte, Schreitter, Stephanie and Neubarth, Friedrich (2017) Speak to me and I tell you who you are! A language-attitude study in a cultural-heritage application. AI & SOCIETY 32 (1), 65–77. https://doi.org/10.1007/s00146-014-0569-0
[46] Labov, William (1994) Principles of linguistic change: Internal factors. Oxford: Blackwell.
[47] Lev-Ari, Shiri and Keysar, Boaz (2010) Why don't we believe non-native speakers? The influence of accent on credibility. Journal of Experimental Social Psychology 46 (6), 1093–1096. https://doi.org/10.1016/j.jesp.2010.05.025
[48] Lin, Lijia, Ginns, Paul, Wang, Tianhui and Zhang, Peilin (2020) Using a pedagogical agent to deliver conversational style instruction: What benefits can you obtain? Computers & Education, 143, Article 103658, 1–11. https://doi.org/10.1016/j.compedu.2019.103658
[49] Louwerse, Max M., Graesser, Arthur C., McNamara, Danielle S. and Lu, Shulan (2009) Embodied conversational agents as conversational partners. Applied Cognitive Psychology 23, 1244–1255.
[50] Maes, Pattie (1994) Agents that reduce work and information overload. Communications of the ACM 37 (7), 31–40. https://doi.org/10.1007/SpringerReference_85143
[51] Major, Roy C., Fitzmaurice, Susan F., Bunta, Ferenc and Balasubramanian, Chandrika (2002) The Effects of nonnative accents on listening comprehension: Implications for ESL assessment. TESOL Quarterly 36 (2), 173–190.
[52] Mayer, Richard E. (2014) Principles based on social cues in multimedia learning: Personalization, voice, image, and embodiment principles. In R. E. Mayer (Ed.), The Cambridge Handbook of Multimedia Learning (pp. 345–368). Cambridge University Press. https://doi.org/10.1017/CBO9781139547369.017
[53] Mayer, Richard E., Dow, Gayle T. and Mayer, Sarah (2003a) Multimedia learning in an interactive self-explaining environment: What Works in the Design of Agent-Based Microworlds? Journal of Educational Psychology, 95 (4), 806–812. https://doi.org/10.1037/0022-0663.95.4.806
[54] Mayer, Richard E., Sobko, Kristina and Mautone, Patricia D. (2003b) Social cues in multimedia learning: Role of speaker's voice. Journal of Educational Psychology 95 (2), 419–425. https://doi.org/10.1037/0022-0663.95.2.419
[55] McAuliffe, Michael, Babel, Molly and Vaughn, Charlotte (2016) Do listeners learn better from natural speech? In Interspeech, San Francisco, USA.
[56] McCroskey, James C. and Teven, Jason J. (1999) Goodwill: A reexamination of the construct and its measurement. Communications Monographs 66 (1), 90–103.
[57] McKenzie, Robert M., Kitikanan, Patchanok and Boriboon, Phaisit (2016) The competence and warmth of Thai students' attitudes towards varieties of English: The effect of gender and perceptions of L1 diversity. Journal of Multilingual and Multicultural Development 37 (6), 536–550. https://doi.org/10.1080/01434632.2015.1083573
[58] Myers, Scott A. and Martin, Matthew M. (2018) Instructor credibility. In Marian L. Houser and Angela Hosek (Eds.), Handbook of instructional communication: Rhetorical and relational perspectives (pp. 38–50). Routledge.
[59] Nass, Clifford and Lee, Kwan M. (2001) Does computer-synthesised speech manifest personality? Experimental tests of recognition, similarity-attraction, and consistency-attraction. Journal of Experimental Psychology: Applied 7 (3), 171–181. https://doi.org/10.1037/1076-898X.7.3.171
[60] Podlipský, Václav J., Šimáčková, Šárka and Petráž, David (2016) Is there an interlanguage speech credibility benefit? Topics in Linguistics 17 (1), 30–44. https://doi.org/10.1515/topling-2016-0003
[61] Prinz, Wolfgang (2013) Self in the mirror. Consciousness and Cognition 22 (3), 1105–1113. https://doi.org/10.1016/j.concog.2013.01.007
[62] Prinz, Wolfgang (2017) Modeling self on others: An import theory of subjectivity and selfhood. Consciousness and Cognition, 49, 347–362. https://doi.org/10.1016/j.concog.2017.01.020
[63] R Core Team. (2022) R: A Language and Environment for Statistical Computing [Computer software]. R Foundation for Statistical Computing. Vienna, Austria. https://www.R-project.org/
[64] Reichelt, Maria, Kämmerer, Frauke, Niegemann, Helmut M. and Zander, Steffi (2014) Talk to me personally: Personalization of language style in computer-based learning. Computers in Human Behavior, 35, 199–210. https://doi.org/10.1016/j.chb.2014.03.005
[65] Rey, Günter D. and Steib, Nadine (2013) The personalization effect in multimedia learning: The influence of dialect. Computers in Human Behavior 29 (5), 2022–2028. https://doi.org/10.1016/j.chb.2013.04.003
[66] RStudio Team. (2021) RStudio: Integrated Development for R (Version 2021.09.0) [Computer software]. http://www.rstudio.com/
[67] Sandygulova, Anara and O'Hare, Gregory (2015) Children's perception of synthesised voice: Robot's gender, age and accent. Social Robotics. 7th International Conference, ICSR 2015. Springer International Publishing. https://doi.org/10.1007/978-3-319-25554-5_59
[68] Scharinger, Mathias, Monahan, Philip J. and Idsardi, William J. (2011) You had me at "Hello": Rapid extraction of dialect information from spoken words. NeuroImage 56 (4), 2329–2338. https://doi.org/10.1016/j.neuroimage.2011.04.007
[69] Schneider, Sascha, Beege, Maik, Nebel, Steve and Rey, Günter Daniel (2022) Psychologische Befunde zum Lernen mit digitalen Medien – ein Überblick [Psychological findings on learning with digital media - an overview]. In Mario A. Pfannstiel and Peter F.-J. Steinhoff (Eds.), E-Learning im digitalen Zeitalter (pp. 581–605). Springer Fachmedien Wiesbaden. https://doi.org/10.1007/978-3-658-36113-6_28
[70] Schroeder, Noah L., Chiou, Erin K. and Craig, Scotty D. (2021) Trust influences perceptions of virtual humans, but not necessarily learning. Computers & Education, 160, 1–15. https://doi.org/10.1016/j.compedu.2020.104039
[71] Schroeder, Noah L. and Gotch, Chad M. (2015) Persisting Issues in Pedagogical Agent Research. Journal of Educational Computing Research 53 (2), 183–204. https://doi.org/10.1177/0735633115597625
[72] Searle, John R. (1980) Minds, brains, and programs. The Behavioral and Brain Sciences, 3, 417–457. https://doi.org/10.7551/mitpress/3080.003.0009
[73] Skarnitzl, Radek and Rumlová, Jana (2019) Phonetic aspects of strongly-accented Czech speakers of English. AUC PHILOLOGICA, 2, 109–128. https://doi.org/10.14712/24646830.2019.21
[74] Sutton, Selina J., Foulkes, Paul, Kirk, David and Lawson, Shaun (2019) Voice as a design material: Sociophonetic inspired design strategies in Human-Computer Interaction. In CHI 2019, Glasgow, Scotland, UK. https://doi.org/10.1145/3290605.3300833
[75] Tamagawa, Rie, Watson, Catherine I., Kuo, I. Han, MacDonald, Bruce A. and Broadbent, Elizabeth (2011) The effects of synthesised voice accents on user perceptions of robots. International Journal of Social Robotics 3 (3), 253–262. https://doi.org/10.1007/s12369-011-0100-4
[76] Taubert, Stefan (2022a) tacotron-cli (Version 0.0.3) [Computer software]. https://github.com/stefantaubert/tacotron
[77] Taubert, Stefan (2022b) waveglow-cli (Version 0.0.1) [Computer software]. https://github.com/stefantaubert/waveglow
[78] Teven, Jason J. (2007) Teacher caring and classroom behavior: Relationships with student affect and perceptions of teacher competence and trustworthiness. Communication Quarterly 55 (4), 433–450. https://doi.org/10.1080/01463370701658077
[79] Wickham, Hadley, Averick, Mara, Bryan, Jennifer, Chang, Winston, McGowan, Lucy D., François, Romain, Grolemund, Garrett, Hayes, Alex, Henry, Lionel, Hester, Jim, Kuhn, Max, Pedersen, Thomas L., Miller, Evan, Bache, Stephan M., Müller, Kirill, Ooms, Jeroen, Robinson, David, Seidel, Dana P., Spinu, Vitalie, . . . Yutani, Hiroaki (2019) Welcome to the tidyverse. Journal of Open Source Software 4 (43), 1686. https://doi.org/10.21105/joss.01686
[80] Ylinen, Sari, Uther, Maria, Latvala, Antti, Vepsäläinen, Sara, Iverson, Paul, Akahane-Yamada, Reiko and Näätänen, Risto (2010) Training the brain to weight speech cues differently: A study of Finnish second-language users of English. Journal of Cognitive Neuroscience 22 (6), 1319–1332. https://doi.org/10.1162/jocn.2009.21272
[2] Ahn, Jeahyeon and Moore, David (2011) The relationship between students' accent perception and accented voice instructions and its effect on students' achievement in an interactive multimedia environment. Journal of Educational Multimedia and Hypermedia 20 (4), 319–335.
[3] Andrist, Sean, Ziadee, Micheline, Boukaram, Halim, Mutlu, Bilge and Sakr, Majd (2015) Effects of culture on the credibility of robot speech: A comparison between English and Arabic. In Proceedings of the tenth annual ACM/IEEE international conference on human-robot interaction. https://doi.org/10.1145/2696454.2696464
[4] Atkinson, Robert K., Mayer, Richard E. and Merrill, Mary M. (2005) Fostering social agency in multimedia learning: Examining the impact of an animated agent's voice. Contemporary Educational Psychology, 30 (1), 117–139. https://doi.org/10.1016/j.cedpsych.2004.07.001
[5] Bansal, Shivam and Aggarwal, Chaitanya (2022) textstat (Version 0.7.3) [Computer software]. https://pypi.org/project/textstat/
[6] Bartneck, Christoph, Kulić, Dana, Croft, Elizabeth and Zoghbi, Susana (2009) Measurement instruments for the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots. International Journal of Social Robotics 1 (1), 71–81. https://doi.org/10.1007/s12369-008-0001-3
[7] Beege, Maik, Schneider, Sascha, Nebel, Steve, Mittangk, Jessica and Rey, Günter Daniel (2017) Ageism – Age coherence within learning material fosters learning. Computers in Human Behavior, 75, 510–519. https://doi.org/10.1016/j.chb.2017.05.042
[8] Beege, Maik, Schneider, Sascha, Nebel, Steve, and Rey, Günter Daniel (2020) Does the effect of enthusiasm in a pedagogical Agent's voice depend on mental load in the Learner's working memory? Computers in Human Behavior, 112, 106483. https://doi.org/10.1016/j.chb.2020.106483
[9] Bent, Tessa and Bradlow, Ann R. (2003) The interlanguage speech intelligibility benefit. Journal of the Acoustic Society of America 114 (3), 1600–1610. https://doi.org/10.1121/1.1603234
[10] Bione, Tiago and Cardoso, Walcir (2020) Synthetic voices in the foreign language context. Language Learning and Technology 24 (1), 169–186. https://doi.org/10125/44715
[11] Boduch-Grabka, Katarzyna and Lev-Ari, Shiri (2021) Exposing individuals to foreign accent increases their trust in what nonnative speakers say. Cognitive Science 45 (11), e13064. https://+ doi.org/10.1111/cogs.13064
[12] Brabcová, Kateřina and Skarnitzl, Radek (2018) Foreign or native-like? The attitudes of Czech EFL learners towards accents of English and their use as pronunciation models. Studie Z Aplikované Lingvistiky 1 (38-50).
[13] Brom, Cyril, Hannemann, Tereza, Stárková, Tereza, Bromová, Edita and Děchtěrenko, Filip (2017) The role of cultural background in the personalization principle: Five experiments with Czech learners. Computers & Education, 112, 37–68. https://doi.org/10.1016/j.compedu.2017.01.001
[14] Brown, Penelope and Levinson, Stephen C. (1987) Politeness: Some universals in language usage. Cambridge: Cambridge University Press.
[15] Buck, Gary (2007) Assessing listening (7. print). The Cambridge language assessment series. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511732959
[16] Carpinella, Colleen M., Wyman, Alisa B., Perez, Michael A. and Stroessner, Steven J. (2017, March). The robotic social attributes scale (RoSAS) development and validation. In Proceedings of the 2017 ACM/IEEE International Conference on human-robot interaction (pp. 254–262).
[17] Castro-Alonso, Juan C., Wong, Rachel M., Adesope, Olusola O. and Paas, Fred (2021) Effectiveness of multimedia pedagogical agents predicted by diverse theories: A meta-analysis. Educational Psychology Review 33 (3), 989–1015. https://doi.org/10.1007/s10648-020-09587-1
[18] Council of Europe (2001). Common European Framework of Reference for Languages: Learning, teaching, assessment. Cambridge University Press.
[19] Council of Europe (2020) Common European Framework of Reference for Languages: Learning, teaching, assessment – Companion volume. Council of Europe Publishing.
[20] Craig, Scotty D. and Schroeder, Noah L. (2017) Reconsidering the voice effect when learning from a virtual human. Computers & Education, 114, 193–205. https://doi.org/10.1016/j.compedu.2017.07.003
[21] Craig, Scotty D. and Schroeder, Noah L. (2019) Text-to-Speech software and learning: Investigating the relevancy of the voice effect. Journal of Educational Computing Research 57 (6), 1534–1548. https://doi.org/10.1177/0735633118802877
[22] Cristia, Alejandrina, Seidl, Amanda, Vaughn, Charlotte, Schmale, Rachel, Bradlow, Ann and Floccia, Caroline (2012) Linguistic processing of accented speech across the lifespan. Frontiers in Psychology, 3, Article 479, 1–15. https://doi.org/10.3389/fpsyg.2012.00479
[23] Dahlbäck, Nils, Swamy, Seema, Nass, Clifford, Arvidsson, Fredrik and Skågeby, Jörgen (2001) Spoken interaction with computers in a native or non-native language - same or different? In Proceedings of INTERACT. https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=58a34aa5fd752e29e434cbf9e9d72891715a2d8e
[24] Dahlbäck, Nils, Wang, QuanYing, Nass, Clifford and Alwin, Jenny (2007) Similarity is more important than expertise: Accent effects in speech interfaces. In CHI 2007 Proceedings, San Jose, CA, USA. https://doi.org/10.1145/1240624.1240859
[25] Dai, Laduona, Jung, Merel M., Postma, Marie and Louwerse, Max M. (2022) A systematic review of pedagogical agent research: Similarities, differences and unexplored aspects. Computers & Education, 190, Article 104607, 1–28. https://doi.org/10.1016/j.compedu.2022.104607
[26] Dalsgaard, Christian (2005) Pedagogical quality in e-learning. Eleed(1). https://www.eleed.de/archive/1/78
[27] Davis, Robert O., Vincent, Joseph and Park, Taejung (2019) Reconsidering the voice principle with non-native language speakers. Computers & Education, 140, Article 103605, 1–12. https://doi.org/10.1016/j.compedu.2019.103605
[28] Défossez, Alexander, Synnaeve, Gabriel and Adi, Yossi (2022) denoiser (Version 0.1.4) [Computer software]. facebookresearch. https://github.com/facebookresearch/denoiser
[29] Delibegović Džanić, Nihada and Berberović, Sanja (2021) Lemons and watermelons: Visual advertising and conceptual blending. In Larissa D'Angelo, Anna Mauranen and Stefania Maci (Eds.), Metadiscourse in digital communication (pp. 115–132) Springer. https://doi.org/10.1007/978-3-030-85814-8_6
[30] Do, Tiffany D., Akter, Mamtaj, Choudhary, Zubin, Azevedo, Roger and McMahan, Ryan P. (2022) The effects of an embodied Pedagogical Agent's synthetic speech accent on learning outcomes. In International Conference on Multimodal Interaction (pp. 198–206). ACM. https://doi.org/10.1145/3536221.3556587
[31] Dokovova, Marie, Scobbie, James M. and Lickley, Robin (2022) Matched-accent processing: Bulgarian-English bilinguals do not have a processing advantage with Bulgarian-accented English over native English speech. Laboratory Phonology 13 (1), Article 12, 1–40. https://doi.org/10.16995/labphon.6423
[32] Drager, Katie (2010) Sociophonetic variation in speech perception. Language and Linguistics Compass 4 (7), 473–480. https://doi.org/10.1111/j.1749-818X.2010.00210.x
[33] Ehret, Jonathan, Bönsch, Andrea, Aspöck, Lukas, Röhr, Christine T., Baumann, Stefan, Grice, Martine, Fels, Janina and Kuhlen, Torsten W. (2021) Do prosody and embodiment influence the perceived naturalness of conversational agents' speech? ACM Transactions on Applied Perception 18 (4), 1–15. https://doi.org/10.1145/3486580
[34] Fishero, Sheyenne, Sereno, Joan A. and Jongman, Allard (2023) Perception and production of Mandarin-Accented English: The effect of degree of Accentedness on the Interlanguage Speech Intelligibility Benefit for Listeners (ISIB-L) and Talkers (ISIB-T). Journal of Phonetics, 99, Article 101255. https://doi.org/10.1016/j.wocn.2023.101255
[35] Gill, Mary M. (1994) Accent and stereotypes: Their effect on perceptions of teachers and lecture comprehension. Journal of Applied Communication Research, 22, 348–361.
[36] Gosselin, Leah, Martin, Clara D., Martín, Ana González and Caffarra, Sendy (2022) When a nonnative accent lets you spot all the errors: Examining the syntactic interlanguage benefit. Journal of Cognitive Neuroscience 34 (9), 1650–1669 https://doi.org/10.1162/jocn_a_01886
[37] Hanzlíková, Dagmar and Skarnitzl, Radek (2017) Credibility of native and non-native speakers of English revisited: Do non-native listeners feel the same? Research in Language 15 (3), 285–298. https://doi.org/10.1515/rela-2017-0016
[38] Hayes-Harb, Rachel, Smith, Bruce L., Bent, Tessa and Bradlow, Ann R. (2008) The interlanguage speech intelligibility benefit for native speakers of Mandarin: Production and perception of English word-final voicing contrasts. Journal of Phonetics 36 (4), 664–679. https://doi.org/10.1016/j.wocn.2008.04.002
[39] Holliday, Nicole (2023) Siri, you've changed! acoustic properties and racialized judgments of voice assistants. Frontiers in Communication, 8. https://doi.org/10.3389/fcomm.2023.1116955
[40] Ivanova, Marina and Schmied, Josef (2023) From cues to features: Bridging psycho- and sociolinguistics in the development of non-native English stimuli. TESOL Communications 2 (2), 1–17. https://doi.org/10.58304/tc.20230201
[41] Jimenez-Molina, Angel, Retamal, Cristian and Lira, Hernan (2018) Using psychophysiological sensors to assess mental workload during web browsing. Sensors 18 (2). https://doi.org/10.3390/s18020458
[42] Karakaş, Ali (2017) English voices in 'Text-to-speech tools': Representation of English users and their varieties from a World Englishes perspective. Advances in Language and Literary Studies 8 (5), 108. https://doi.org/10.7575/aiac.alls.v.8n.5p.108
[43] Kartal, Günizi (2010) Does language matter in multimedia learning? Personalization principle revisited. Journal of Educational Psychology 102 (3), 615–624. https://doi.org/10.1037/a0019345
[44] Kiczkowiak, Marek (2018) Native Speakerism in English Language Teaching: Voices from Poland. PhD thesis, University of York. https://etheses.whiterose.ac.uk/id/eprint/20985/
[45] Krenn, Brigitte, Schreitter, Stephanie and Neubarth, Friedrich (2017) Speak to me and I tell you who you are! A language-attitude study in a cultural-heritage application. AI & SOCIETY 32 (1), 65–77. https://doi.org/10.1007/s00146-014-0569-0
[46] Labov, William (1994) Principles of linguistic change: Internal factors. Oxford: Blackwell.
[47] Lev-Ari, Shiri and Keysar, Boaz (2010) Why don't we believe non-native speakers? The influence of accent on credibility. Journal of Experimental Social Psychology 46 (6), 1093–1096. https://doi.org/10.1016/j.jesp.2010.05.025
[48] Lin, Lijia, Ginns, Paul, Wang, Tianhui and Zhang, Peilin (2020) Using a pedagogical agent to deliver conversational style instruction: What benefits can you obtain? Computers & Education, 143, Article 103658, 1–11. https://doi.org/10.1016/j.compedu.2019.103658
[49] Louwerse, Max M., Graesser, Arthur C., McNamara, Danielle S. and Lu, Shulan (2009) Embodied conversational agents as conversational partners. Applied Cognitive Psychology 23 (1244-1255).
[50] Maes, Pattie (1994) Agents that reduce work and information overload. Communications of the ACM 37 (7), 31–40. https://doi.org/10.1007/SpringerReference_85143
[51] Major, Roy C., Fitzmaurice, Susan F., Bunta, Ferenc and Balasubramanian, Chandrika (2002) The Effects of nonnative accents on listening comprehension: Implications for ESL assessment. TESOL Quarterly 36 (2), 173–190.
[52] Mayer, Richard E. (2014) Principles based on social cues in multimedia learning: Personalization, voice, image, and embodiment principles. In Richard E. Mayer (Ed.), The Cambridge Handbook of Multimedia Learning (pp. 345–368). Cambridge University Press. https://doi.org/10.1017/CBO9781139547369.017
[53] Mayer, Richard E., Dow, Gayle T. and Mayer, Sarah (2003a) Multimedia learning in an interactive self-explaining environment: What works in the design of agent-based microworlds? Journal of Educational Psychology 95 (4), 806–812. https://doi.org/10.1037/0022-0663.95.4.806
[54] Mayer, Richard E., Sobko, Kristina and Mautone, Patricia D. (2003b) Social cues in multimedia learning: Role of speaker's voice. Journal of Educational Psychology 95 (2), 419–425. https://doi.org/10.1037/0022-0663.95.2.419
[55] McAuliffe, Michael, Babel, Molly and Vaughn, Charlotte (2016) Do listeners learn better from natural speech? In Interspeech, San Francisco, USA.
[56] McCroskey, James C. and Teven, Jason J. (1999) Goodwill: A reexamination of the construct and its measurement. Communication Monographs 66 (1), 90–103.
[57] McKenzie, Robert M., Kitikanan, Patchanok and Boriboon, Phaisit (2016) The competence and warmth of Thai students' attitudes towards varieties of English: The effect of gender and perceptions of L1 diversity. Journal of Multilingual and Multicultural Development 37 (6), 536–550. https://doi.org/10.1080/01434632.2015.1083573
[58] Myers, Scott A. and Martin, Matthew M. (2018) Instructor credibility. In Marian L. Houser and Angela Hosek (Eds.), Handbook of instructional communication: Rhetorical and relational perspectives (pp. 38–50). Routledge.
[59] Nass, Clifford and Lee, Kwan M. (2001) Does computer-synthesized speech manifest personality? Experimental tests of recognition, similarity-attraction, and consistency-attraction. Journal of Experimental Psychology: Applied 7 (3), 171–181. https://doi.org/10.1037/1076-898X.7.3.171
[60] Podlipský, Václav J., Šimáčková, Šárka and Petráž, David (2016) Is there an interlanguage speech credibility benefit? Topics in Linguistics 17 (1), 30–44. https://doi.org/10.1515/topling-2016-0003
[61] Prinz, Wolfgang (2013) Self in the mirror. Consciousness and Cognition 22 (3), 1105–1113. https://doi.org/10.1016/j.concog.2013.01.007
[62] Prinz, Wolfgang (2017) Modeling self on others: An import theory of subjectivity and selfhood. Consciousness and Cognition, 49, 347–362. https://doi.org/10.1016/j.concog.2017.01.020
[63] R Core Team (2022) R: A Language and Environment for Statistical Computing [Computer software]. R Foundation for Statistical Computing. Vienna, Austria. https://www.R-project.org/
[64] Reichelt, Maria, Kämmerer, Frauke, Niegemann, Helmut M. and Zander, Steffi (2014) Talk to me personally: Personalization of language style in computer-based learning. Computers in Human Behavior, 35, 199–210. https://doi.org/10.1016/j.chb.2014.03.005
[65] Rey, Günter D. and Steib, Nadine (2013) The personalization effect in multimedia learning: The influence of dialect. Computers in Human Behavior 29 (5), 2022–2028. https://doi.org/10.1016/j.chb.2013.04.003
[66] RStudio Team (2021) RStudio: Integrated Development for R (Version 2021.09.0) [Computer software]. http://www.rstudio.com/
[67] Sandygulova, Anara and O'Hare, Gregory (2015) Children's perception of synthesised voice: Robot's gender, age and accent. Social Robotics. 7th International Conference, ICSR 2015. Springer International Publishing. https://doi.org/10.1007/978-3-319-25554-5_59
[68] Scharinger, Mathias, Monahan, Philip J. and Idsardi, William J. (2011) You had me at "Hello": Rapid extraction of dialect information from spoken words. NeuroImage 56 (4), 2329–2338. https://doi.org/10.1016/j.neuroimage.2011.04.007
[69] Schneider, Sascha, Beege, Maik, Nebel, Steve and Rey, Günter Daniel (2022) Psychologische Befunde zum Lernen mit digitalen Medien – ein Überblick [Psychological findings on learning with digital media – an overview]. In Mario A. Pfannstiel and Peter F.-J. Steinhoff (Eds.), E-Learning im digitalen Zeitalter (pp. 581–605). Springer Fachmedien Wiesbaden. https://doi.org/10.1007/978-3-658-36113-6_28
[70] Schroeder, Noah L., Chiou, Erin K. and Craig, Scotty D. (2021) Trust influences perceptions of virtual humans, but not necessarily learning. Computers & Education, 160, 1–15. https://doi.org/10.1016/j.compedu.2020.104039
[71] Schroeder, Noah L. and Gotch, Chad M. (2015) Persisting issues in pedagogical agent research. Journal of Educational Computing Research 53 (2), 183–204. https://doi.org/10.1177/0735633115597625
[72] Searle, John R. (1980) Minds, brains, and programs. The Behavioral and Brain Sciences, 3, 417–457. https://doi.org/10.7551/mitpress/3080.003.0009
[73] Skarnitzl, Radek and Rumlová, Jana (2019) Phonetic aspects of strongly-accented Czech speakers of English. AUC Philologica, 2, 109–128. https://doi.org/10.14712/24646830.2019.21
[74] Sutton, Selina J., Foulkes, Paul, Kirk, David and Lawson, Shaun (2019) Voice as a design material: Sociophonetic inspired design strategies in Human-Computer Interaction. In CHI 2019, Glasgow, Scotland, UK. https://doi.org/10.1145/3290605.3300833
[75] Tamagawa, Rie, Watson, Catherine I., Kuo, I. Han, MacDonald, Bruce A. and Broadbent, Elizabeth (2011) The effects of synthesised voice accents on user perceptions of robots. International Journal of Social Robotics 3 (3), 253–262. https://doi.org/10.1007/s12369-011-0100-4
[76] Taubert, Stefan (2022a) tacotron-cli (Version 0.0.3) [Computer software]. https://github.com/stefantaubert/tacotron
[77] Taubert, Stefan (2022b) waveglow-cli (Version 0.0.1) [Computer software]. https://github.com/stefantaubert/waveglow
[78] Teven, Jason J. (2007) Teacher caring and classroom behavior: Relationships with student affect and perceptions of teacher competence and trustworthiness. Communication Quarterly 55 (4), 433–450. https://doi.org/10.1080/01463370701658077
[79] Wickham, Hadley, Averick, Mara, Bryan, Jennifer, Chang, Winston, McGowan, Lucy D., François, Romain, Grolemund, Garrett, Hayes, Alex, Henry, Lionel, Hester, Jim, Kuhn, Max, Pedersen, Thomas L., Miller, Evan, Bache, Stephan M., Müller, Kirill, Ooms, Jeroen, Robinson, David, Seidel, Dana P., Spinu, Vitalie, . . . Yutani, Hiroaki (2019) Welcome to the tidyverse. Journal of Open Source Software 4 (43), 1686. https://doi.org/10.21105/joss.01686
[80] Ylinen, Sari, Uther, Maria, Latvala, Antti, Vepsäläinen, Sara, Iverson, Paul, Akahane-Yamada, Reiko and Näätänen, Risto (2010) Training the brain to weight speech cues differently: A study of Finnish second-language users of English. Journal of Cognitive Neuroscience 22 (6), 1319–1332. https://doi.org/10.1162/jocn.2009.21272