The effects of using L1 Chinese or L2 English in planning on speaking performance among high- and low-proficient EFL learners

Speaking constitutes one of the main goals of learning a second language (L2). Despite increasing attention to the role of planning and language transfer in L2 learning, the combined effect of using different languages and pre-task planning on language production remains unclear. This study investigated whether the language used in planning affects speaking performance and whether the effect differs by language proficiency. A total of 84 students in Chinese universities learning English as a foreign language completed several speaking tasks after planning in either their first language (L1), Chinese, or their L2, English. Findings showed that planning in L1 resulted in significantly higher syntactic complexity, accuracy, and fluency in speaking performance than planning in L2, while the differences in lexical diversity were not statistically significant. Further analysis showed that for speech accuracy, the facilitative effect of L1 was stronger among low-proficient than high-proficient learners. Findings from this study support the use of L2 learners’ entire linguistic repertoire in speaking activities and have implications for speech production theories as well as translanguaging pedagogies.


Introduction
Speaking constitutes one of the main goals of learning a second language (L2). Nevertheless, it has been well recognized that speech production involves a series of complex processes, including conceptualisation, formulation and articulation (Levelt, 1989). Studies also show that speaking can be particularly challenging for L2 learners. With limited attentional capacity and inadequate fluency, L2 learners may not attend to every speaking process and may find speaking effortful and anxiety-provoking (Konopka & Meyer, 2014; Skehan & Foster, 2001). Given the significance as well as the difficulties of speaking for L2 learners, finding strategies to facilitate speaking skills has long been a focus for many researchers and educational practitioners (Rabab'ah, 2016).
Pre-task planning, which generally refers to the opportunity to prepare and practice before completing the task, has been widely proposed as a useful speaking strategy (Ellis, 2009). There is ample empirical evidence on the facilitative effects of pre-task planning in mitigating limited processing capacity and improving speaking performance in terms of productivity, complexity, accuracy and fluency (Fukuta, 2016; Li et al., 2015). In addition, it is widely observed that learners tend to deploy their first language (L1) to achieve various goals in L2 learning activities, such as generating ideas for composition writing (van Weijen et al., 2009; Wang & Wen, 2002), making clarifications in teacher-student interactions (Glante, 2020), and reconciling discrepancies in group discussion (Mbirimi-Hungwe, 2021). However, previous studies on the direct effect of language transfer mainly focus on L2 written language production and have yielded mixed results (Stapa & Majid, 2012; Wang, 2003). Given the paucity of research comparing the effect of using L1 and L2 in the planning stage of speaking, it remains unclear whether and how pre-task planning could be a facilitative strategy to support L2 learners' speech production. Therefore, this study aimed to investigate the combined effect of pre-task planning and the use of different languages on L2 speaking performance. Findings from the current study may help illuminate, and enrich our understanding of, the implementation of translanguaging strategies in the L2 classroom and how students' linguistic capital could be better utilized in supporting their learning and oral communication.

Theoretical accounts of speech production in L2
The conceptualization of the speech production process in L2 has developed over the past decades. One of the earliest and most classic models of speaking to date was proposed by Levelt (1989), which divides speaking into three major processes: conceptualisation, formulation, and articulation. According to this model, speaking starts with generating preverbal ideas according to the communication goal, followed by translating ideas into linguistic forms; finally, speakers articulate their linguistic plans as actual speech. While this model was initially developed for L1 speech production, Kormos (2006) extended Levelt's (1989) model to L2 learners by taking into account the inadequate linguistic knowledge of L2 learners, proposing that speaking in L2 requires more attention to pre-articulation stages, such as conceptualisation and formulation. This model provides an updated theoretical account of L2 speech production, which also resonates with Skehan's (1998) trade-off hypothesis in taking into account the limited attention and cognitive resources L2 learners may allocate to various processes of speaking.
Numerous studies have examined the cognitive processes in speech production, in particular how learners prioritise and allocate attentional resources at different stages of speech production. For example, Scott (1996) pointed out that idea generation constitutes the fundamental stage and thus requires more attention than the linguistic formulation stage. Similarly, Bygate (1996) suggested that L2 learners generally focus on preparing the content rather than the language of expression, especially when first carrying out a task. A more recent study by Konopka and Meyer (2014) also found that when planning a long message in L2, speakers tend to process the content first and postpone linguistic encoding until they have generated the ideas. Taken together, these studies indicate the possibility that the processing demands concerning content lead to less attention being paid to linguistic forms, which might in turn lead to lower speech accuracy and fluency (Skehan & Foster, 2001).
Apart from cognitive challenges, speaking has also been identified as an anxiety-provoking activity for L2 learners. Comparisons between the major literacy skills showed that speaking generally triggers higher levels of anxiety among learners than reading, writing, and listening (Ay, 2010). In addition, speaking skills are also found to play an influential role in oral communication both in and outside the L2 classroom. For instance, Liu and Jackson (2008) found that L2 learners in China who lack willingness to communicate in English are reluctant to participate in English classes. It is also suggested that L2 learners' efforts to deal with the stress of speaking situations may further interfere with their successful coordination of various cognitive processes in speaking (Teimouri et al., 2019). Given the influential role of speaking skills in daily communication as well as classroom participation, finding strategies to mitigate speaking anxiety and facilitate L2 speaking has long been a concern for many researchers and educational practitioners (Rabab'ah, 2016).

Pre-task planning as a strategy for L2 production
Pre-task planning generally refers to the opportunity for learners to prepare and practice before completing a task (Ellis, 2009). It has been well documented in previous literature as a useful strategy to ease learners' working memory load during task performance and provide more processing space for improving the overall performance of language production (O'Sullivan, 2012; Skehan, 2016). In terms of Levelt's (1989) theoretical framework of speech production, pre-task planning has the potential to facilitate the pre-articulation stages such as conceptualisation and formulation in speaking. To illustrate, with a pre-task planning strategy and the opportunity to prepare the content of speech in advance, learners have more content readily available in early speaking processes and more processing space for linguistic formulation, which in turn leads to better speaking performance (Fukuta, 2016; Li et al., 2015). In addition to cognitive advantages, pre-task planning has also been found to mitigate the speaking anxiety of L2 learners. For example, Ay (2010) found that speakers' anxiety decreases significantly when they are allowed to plan before speaking, while a survey conducted by Elder and Wigglesworth (2006) reported that L2 learners expressed a strong preference for planning. Overall, the studies reviewed above provide empirical support for the potential role of pre-task planning in facilitating L2 learners' speaking performance, willingness to communicate, and classroom participation.
Given the facilitative role of pre-task planning in L2 learning, a growing number of studies have examined different planning conditions more closely, such as planning time (Li et al., 2015) and planning type (O'Grady, 2019). For example, Stroud (2021) compared different types of planning and found that students who attend to both content and linguistic forms in planning tend to make more elaborate points in their final performance, whereas students focusing on repeating their original plans show improvement in speech productivity and fluency. Another line of research has examined which aspects of speaking are affected by planning. For instance, Robinson (2003) suggested that planning could facilitate both the complexity and accuracy of speech. In contrast, other scholars argue that planning enhances complexity at the price of accuracy (Ellis, 2009; Skehan, 1998). Given the mixed findings in previous studies, it remains unclear which aspects of speaking are better facilitated by planning and how planning can be better implemented to support L2 learners' speech production. It is thus important for further research to examine more closely the effect of planning on different aspects of L2 speaking performance.

The prevalent use of L1 in L2 production
A central issue in the field of second language education is the role that learners' L1 plays and can play in the learning of L2 literacy skills. Specifically, a topic of debate over decades of research has been whether to allow the use of L1 in L2 classrooms. On the one hand, some scholars have argued that L1 use should be restrained due to its minimal or even negative effects on L2 learning (Chamber, 1991; Halliwell & Jones, 1991). Related concerns about the negative effect of L1 include the linguistic distance between L1 and L2 (Odlin, 1989) and the negative transfer of culture-specific knowledge built with L1 (Myles, 2002). Taken together, these studies argue for a potential interference from L1 on both the linguistic forms and the content of L2 speaking. Therefore, researchers have emphasised for decades the importance of providing a single-language immersive environment in L2 learning (Macdonald, 1993; Myles, 2002).
In contrast, proponents of multilingualism and translanguaging practices maintain that learners' L1 can serve as a compensatory strategy to mitigate cognitive overload (Scott & de la Fuente, 2008), lower affective filters (Yüzlü & Atay, 2020), improve the quality of L2 production (Stapa & Abdulmejid, 2009), and resist linguistic imperialism (Konopka et al., 2018). Specifically, allowing learners to use their L1 in planning has been proposed as a scaffolding and motivating strategy to facilitate L2 written production, with empirical evidence of better writing quality (Woodall, 2002) and greater willingness to write (Stapa & Majid, 2012).
It is also noteworthy that, despite the unsettled disputes on the extent to which L1 use should be allowed in L2 learning, it has been widely observed that L2 learners tend to use their L1 in different processes of L2 production. For example, van Weijen et al. (2009) found that students in the Netherlands often drew on their L1 in accomplishing L2 writing tasks, whereas Sasaki (2002) found that novice English learners in Japan tended to produce English texts by translating from Japanese. Moreover, a survey conducted by Cohen and Brooks-Carson (2001) reported that 80 percent of intermediate French learners in the US prefer thinking in their L1 and then writing in L2. Similarly, Wang and Wen (2002) found that L2 learners in China made extensive use of their L1 to generate ideas, formulate texts and monitor their writing process. These studies, while mainly focusing on the role of L1 in L2 written production, collectively suggest that L2 learners across different language backgrounds or target languages tend to use L1 in accomplishing L2 tasks. The prevalence of L1 use among L2 learners also underscores the significance of the current study in comparing the effect of L1 and L2 on spoken production.
Liu and Yeung Asian. J. Second. Foreign. Lang. Educ. (2023) 8:35

The interplay between L2 proficiency and L1 transfer
The extent to which L2 learners use their first language capital depends on various factors, among which language proficiency has been suggested as one of the most important by both theoretical and empirical literature. On the one hand, theoretical frameworks such as models of writing (Hayes & Flower, 1980; Juel et al., 1986) and speaking (Kormos, 2006; Levelt, 1989) similarly emphasize the linguistic transformation or formulation stage in language production. This is the stage where learners search for appropriate linguistic expressions in order to translate their abstract ideas into utterances or texts, and where their language proficiency is supposed to play a role. Given the limited cognitive resources suggested by the trade-off hypothesis discussed above (Skehan, 1998) and the complicated interplay between conceptual and linguistic generation processes in language production models (Kormos, 2006; Levelt, 1989), it is of theoretical significance to investigate whether and to what extent learners' L2 proficiency affects their oral language production. A closer examination of different aspects of speech production is thus expected to illuminate the interplay between the conceptualization stage and the linguistic formulation stage in L2 production models.
In terms of empirical findings, numerous studies have set out to investigate how learners' L2 proficiency affects their code-switching patterns. Some have shown that L2 learners with lower proficiency switch to their L1 more often (Woodall, 2002) and benefit more from strategies such as directly translating their L1 texts into L2 (Cohen & Brooks-Carson, 2001). On the contrary, others found that learners with higher proficiency in L2 tend to use L1 more often (Scott & de la Fuente, 2008) and are more successful in transferring their L1 resources (Wang, 2003). While mixed findings have been reported in terms of the role of L2 proficiency in language transfer in writing tasks (e.g., Stapa & Abdulmejid, 2009; Wang & Wen, 2002), it remains unclear to what extent learners' L2 proficiency affects their use of L1 in oral production and whether there is any interaction effect between language proficiency and the language used in pre-task planning.

Research question
A review of theoretical and empirical literature highlights the significant role of pre-task planning and L1 transfer as two widely recognized strategies in L2 learning. The review also points to the complicated process of spoken language production and an inadequate understanding of the combined effects of pre-task planning and language transfer in L2 learners' speech production. Given the important role that speaking skills play in classroom as well as daily communication, the current study sets out to address the following research questions:
1. How does English speaking performance differ between pre-task planning in Chinese L1 and in English L2 among Chinese EFL learners?
2. To what extent does the effect of using different languages in pre-task planning differ by learners' L2 proficiency?

Method
To examine the combined effects of pre-task planning and language transfer on L2 oral production among learners with different levels of L2 proficiency, this study adopted a 2 × 2 mixed design in an experimental setting with two independent variables. The between-subject variable is English L2 proficiency (high vs. low). The within-subject variable is the language used in pre-task planning (Chinese L1 vs. English L2). The dependent variables are five measures of speaking performance: one for syntactic complexity, two for lexical diversity, one for accuracy, and one for fluency.

Participants
A total of 84 university students learning English as a foreign language (EFL) in China (49 females and 35 males; mean age = 21.34 years, SD = 1.15) participated in this study. Participants were recruited through convenience sampling by distributing posters online, and all participation was voluntary. Participants were selected by their English learning experience and proficiency. All participants had Chinese as L1 and had been learning English as L2 for over 15 years. In addition, all participants had taken the Test of English as a Foreign Language (TOEFL), a standardised English proficiency test, and thus had prior experience of taking the speaking tests used in this study. Following the design of previous studies (Elder & Iwashita, 2005; Kawauchi, 2005; Yuan & Ellis, 2003), participants' L2 proficiency was estimated from their reported TOEFL speaking score (M = 22.19, SD = 4.37, Minimum = 15, Maximum = 30). Based on the criteria provided on the Educational Testing Service (ETS) official website, participants with speaking scores between 25 and 30 were classified as high-proficient speakers (N = 27, M = 27.22, SD = 1.95), participants with speaking scores between 20 and 24 were classified as speakers with average proficiency (N = 28, M = 22.28, SD = 1.47), and participants with speaking scores below 19 were classified as low-proficient speakers (N = 29, M = 17.34, SD = 1.47). Because the aim of this study is to examine whether using different languages in pre-task planning leads to significant differences in speaking performance between high- and low-proficient participants, data analysis was conducted on participants in the high- and low-proficiency groups. To facilitate the interpretation of the TOEFL speaking scores among researchers and educators, a matching scale between TOEFL speaking scores, speaking proficiency level, the Common European Framework of Reference for Languages (CEFR) and American Council on the Teaching of Foreign Languages (ACTFL) assessment is provided in Table 1, based on the official information provided by ETS (https://www.ets.org/toefl/score-users/ibt/interpret-scores.html) and ACTFL (https://www.actfl.org/uploads/files/general/Assigning_CEFR_Ratings_To_ACTFL_Assessments.pdf).

Speaking test
All of the speaking tests used in this study were taken from TOEFL Speaking Test 1, in which participants have to answer a decision-making question in the form of 'Some people prefer [A]; others prefer [B]. Which one do you prefer and why?' The topics are mainly about daily experiences, college life and social phenomena, which were expected to be familiar to university students and have been widely adopted in previous studies to test the effect of planning on speaking (Ellis, 2009; Li et al., 2015). This study adopted a within-subject design to test the effect of using different languages on speaking performance, in which each participant was asked to take two speaking tests, one with Chinese L1 and one with English L2 as the language used in pre-task planning. Both the sequence of pre-task planning conditions (Chinese L1 vs. English L2) and the topics of the speaking test were randomly assigned in order to reduce practice and fatigue effects on participants.

Measures
The assessment of speaking performance followed a well-established measurement framework widely adopted in previous studies (Guará-Tavares, 2008; Li et al., 2015; Yuan & Ellis, 2003), in which participants' speech was scored from three perspectives, namely, complexity, accuracy, and fluency (CAF).

Complexity
Complexity refers to speakers' ability to use more advanced language, which can be further divided into syntactic complexity and lexical diversity (Mochizuki & Ortega, 2008; Yuan & Ellis, 2003). Syntactic complexity is commonly measured by the number of words per AS-unit (Skehan & Foster, 2005; Tavakoli & Skehan, 2005). An AS-unit is defined as a single utterance consisting of an independent clause and the subordinate clause(s) attached to it, and serves as a standard tool for analysing L2 speech (Foster et al., 2000). The number of words per AS-unit was calculated by dividing the total number of words by the total number of AS-units. False starts, self-corrections and repetitions were excluded from the word count.
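As an illustration of the calculation just described, the following minimal Python sketch computes words per AS-unit. It assumes the transcript has already been segmented into AS-units by hand and pruned of false starts, self-corrections and repetitions; the function name and the example utterances are hypothetical and are not taken from the study's data.

```python
def words_per_as_unit(as_units):
    """Mean number of words per AS-unit.

    `as_units` is a list of AS-units, each given as a list of words that
    has already been segmented and pruned manually (an assumption of
    this sketch).
    """
    if not as_units:
        raise ValueError("at least one AS-unit is required")
    total_words = sum(len(unit) for unit in as_units)
    return total_words / len(as_units)


# Invented example: two AS-units of 8 and 5 words.
example_units = [
    "i prefer studying alone because it is quiet".split(),
    "group work takes more time".split(),
]
mean_length = words_per_as_unit(example_units)
```

In this invented example the two units contain 13 words in total, so the index is 13 / 2 = 6.5.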
In terms of lexical diversity, different approaches have been proposed to capture the complexity of vocabulary use in written and spoken discourse. One of the most commonly used measures is the type-token ratio (TTR), which is calculated by dividing the number of word types by the number of word tokens (Li et al., 2015; Rostamian et al., 2018). This calculation results in a TTR between 0 and 1, with a higher figure indicating greater lexical diversity. [...] (..., 2010). In this study, both TTR and vocd-D were used to capture the lexical diversity of spoken production.
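The type-token ratio can be sketched in a few lines of Python. This is an illustration only: the study counted types and tokens with dedicated software, and the tokenisation rule below (lower-casing and keeping runs of letters and apostrophes) is an assumption of this sketch, as are the function name and the example sentence.

```python
import re


def type_token_ratio(transcript):
    """Number of distinct word types divided by the number of word tokens."""
    # Assumed tokenisation: lower-case, keep letters and apostrophes only.
    tokens = re.findall(r"[a-z']+", transcript.lower())
    if not tokens:
        raise ValueError("transcript contains no word tokens")
    return len(set(tokens)) / len(tokens)


# Invented example: 7 tokens, 5 types (i, like, tea, and, coffee).
ttr = type_token_ratio("I like tea and I like coffee")
```

Here the ratio is 5 / 7, or about 0.71. Because every repeated word lowers the ratio, TTR tends to fall as a text gets longer.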

Accuracy
Accuracy of speech pertains to speakers' ability to avoid errors and handle different aspects of speaking. It is commonly measured by the percentage of errors in the speech, including self-corrections, false starts, repetitions, pauses and mistakes at the syntactic, morphological, and lexical levels (Ellis, 2005). It is thus noteworthy that a higher score indicates more errors made in speech and lower accuracy. This measurement of accuracy has proven to be a good indicator of speaking skills and has been widely used in previous studies (Guará-Tavares, 2008; Li et al., 2015; Yuan & Ellis, 2003).
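The arithmetic behind this index is simple and can be sketched as follows. The function name and the example counts are hypothetical; identifying and counting the errors themselves is a manual rating step that the sketch does not attempt.

```python
def error_percentage(error_count, total_words):
    """Errors as a percentage of total words; higher means less accurate."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    if error_count < 0:
        raise ValueError("error_count cannot be negative")
    return 100 * error_count / total_words


# Invented example: 6 errors in a 120-word response.
score = error_percentage(6, 120)
```

With these invented counts the score is 5.0, meaning five errors per hundred words.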

Fluency
Fluency of speech refers to speakers' ability to produce meaningful utterances in real time, which is generally measured by pruned words per minute (Tavakoli & Skehan, 2005). Pruned words are the words that remain after excluding false starts, reformulations, repetitions and fillers such as um and uh (Guará-Tavares, 2008).
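A rough Python sketch of the pruned words per minute index is given below. It is a simplification under stated assumptions: only the fillers "um" and "uh" and immediate word repetitions are removed automatically, whereas false starts and reformulations would have to be marked by a human rater. The function names, the filler list and the example response are all hypothetical.

```python
FILLERS = {"um", "uh"}  # assumed filler list for this sketch


def prune(tokens):
    """Drop fillers and immediate repetitions of the previous kept word."""
    pruned = []
    for token in tokens:
        word = token.lower()
        if word in FILLERS:
            continue
        if pruned and pruned[-1] == word:
            continue  # immediate repetition, e.g. "is is"
        pruned.append(word)
    return pruned


def pruned_words_per_minute(tokens, duration_seconds):
    """Pruned word count scaled to a one-minute rate."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(prune(tokens)) * 60 / duration_seconds


# Invented example: 10 raw tokens, 6 after pruning, spoken in 30 seconds.
raw = "um I I think uh city life is is better".split()
rate = pruned_words_per_minute(raw, 30)
```

In this example six words survive pruning, giving 12 pruned words per minute. Note that the repetition rule would also remove a legitimate doubled word, which is one reason manual checking remains necessary.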

Procedure
Participants in this study were first recruited by distributing posters through online platforms. After signing the consent form, participants were asked to fill out a questionnaire on Qualtrics to collect demographic information and details of their English learning experience. Then, participants were invited to take a series of computer-based speaking tests, in which they participated individually to ensure that their speaking performances would not be influenced by each other. Before starting the speaking test, the researcher first explained and demonstrated the general routine for the speaking task. Then, participants were asked to take some warm-up tests to familiarise themselves with the process. After indicating their readiness, participants were instructed to complete two formal speaking tests varying by the language used in planning (Chinese L1 vs. English L2) and the sequence of speaking topics. Each test consisted of two stages. The first stage was the pre-task planning stage, in which participants were given 15 s to read the question on the screen and then articulated their answers in the given language (Chinese L1 vs. English L2) within 45 s. The second stage was the formal speaking stage, in which participants answered the same speaking question in English L2 only. The within-subject design on the type of language used for planning was thus achieved by asking participants to use their L1 Chinese or L2 English in the first stage.
Participants' answers in the second stage were audio-recorded and scored as their L2 speaking performance. The consent forms, questionnaires and speaking tests were all provided in English. The entire experiment for each participant lasted around 60 min.

Data analysis
The analysis of speaking performance was guided by the complexity, accuracy, and fluency (CAF) framework mentioned above. To calculate the five indexes (two for lexical diversity and one each for syntactic complexity, accuracy and fluency), recordings of spoken answers were first transcribed by the author and double-checked by a second researcher to ensure consistency. The transcriptions were then loaded into AntConc 3.2.1 to count word types and tokens, which were used to calculate fluency (number of words per minute) and TTR. To calculate syntactic complexity (number of words per AS-unit), the transcriptions were also manually segmented into AS-units.
To calculate accuracy (number of errors/total words), the researchers carefully listened to the recordings and counted the number of errors made in each speaking test.
A randomly selected sample of 15% of the total data was examined by another trained researcher, and the inter-rater reliability was above 90% on all measures.

Results
This research aims to investigate whether using different languages in pre-task planning affects L2 speaking performance and whether the effects differ by learners' L2 proficiency. To address these questions, SPSS version 29.0 was used to conduct statistical analyses. Table 2 shows the means and standard deviations of the five measures in the two planning conditions (Chinese L1 vs. English L2) among the high- and low-proficient L2 learners. To test the potential differences made by L2 proficiency and the type of language used for planning, a repeated measures multivariate analysis of variance (MANOVA) was first conducted with planning language (L1 vs. L2) as the within-subject factor, English proficiency (high-proficient vs. low-proficient) as the between-subject factor, and the five measures of speaking performance as the dependent variables. Results from the MANOVA showed a significant main effect of planning language [Wilks' λ = 0.74, F(1, 54) = 19.17, p < 0.001] and a significant interaction effect of planning language and measures [Wilks' λ = 0.45, F(3, 52) = 21.66, p < 0.001]. The interaction effect of planning language, measures and proficiency was not significant [Wilks' λ = 0.94, F(3, 52) = 1.213, p = 0.31]. To further locate where the statistical significance was obtained, a series of two-way repeated measures analyses of variance (ANOVAs) were conducted on each measure of speaking performance. Below, we present the respective ANOVA results for each dependent variable.
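The analyses reported here were run in SPSS. Purely as an illustration of what the interaction term in a 2 × 2 mixed design tests, the sketch below relies on a standard identity: the F statistic for the interaction between the within-subject factor and the between-subject factor equals the squared pooled-variance independent-samples t statistic computed on each participant's difference score (score after L1 planning minus score after L2 planning). The function name and the data are hypothetical, and the sketch does not reproduce the MANOVA or the main effects.

```python
from statistics import mean, variance


def interaction_f(diff_group_a, diff_group_b):
    """F and degrees of freedom for the interaction in a 2 x 2 mixed design.

    Each argument holds one group's per-participant difference scores
    (L1-planning score minus L2-planning score).
    """
    n_a, n_b = len(diff_group_a), len(diff_group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two participants")
    df_error = n_a + n_b - 2
    # Pooled variance of the difference scores across the two groups.
    pooled = ((n_a - 1) * variance(diff_group_a)
              + (n_b - 1) * variance(diff_group_b)) / df_error
    standard_error = (pooled * (1 / n_a + 1 / n_b)) ** 0.5
    t = (mean(diff_group_a) - mean(diff_group_b)) / standard_error
    return t * t, (1, df_error)


# Invented difference scores for two small groups.
f_value, df = interaction_f([1, 2, 3], [4, 5, 6, 7])
```

With groups of 27 and 29 participants, the error degrees of freedom from this formula are 27 + 29 - 2 = 54, which matches the F(1, 54) form of the tests reported above.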

Accuracy
A two-way ANOVA with speech accuracy scores as the dependent variable revealed a significant main effect of planning language with a large effect size, F(1, 54) = 29.61, [...] facilitative effect was stronger on low-proficient learners than on high-proficient learners (see Fig. 3).

Discussion
This study aimed to investigate whether using different languages in pre-task planning affects speaking performance and whether the effect differs by language proficiency. Results from Chinese EFL university students showed that using L1 to plan leads to better speaking performance in terms of syntactic complexity, accuracy, and fluency, but not lexical diversity. In addition, a significant interaction effect between learners' L2 proficiency and the language used in planning was found in terms of speech accuracy, with low-proficient learners benefiting more from using L1 to plan. Our first research question concerned the main effect of using different types of language (L1 vs. L2) on four aspects of L2 speaking performance: syntactic complexity, lexical diversity, accuracy, and fluency. Based on the theoretical conceptualization of three stages in speech production (Kormos, 2006; Levelt, 1989), it is hypothesized that using L1 to plan is more relevant to the conceptualization stage in L2 speech production, while planning in L2 will benefit both the conceptualization and linguistic formulation stages. Thus, it is intuitive to expect that planning in L2 might lead to better speaking performance. However, findings from this study stand in contrast to this assumption by showing that using L1 to plan has more facilitative effects on the syntactic complexity, accuracy and fluency of speech regardless of learners' L2 proficiency. Below, these findings are discussed in light of related theoretical and empirical literature.

The facilitative role of L1 planning in the conceptualization stage of speaking
One explanation for the facilitative effect of L1 is that involving L1 in the conceptualization stage of speaking allows L2 learners to generate speech content in their most familiar language and retrieve more relevant information. This is because memory-based knowledge is highly associated with the source language and information coded in L1 is generally more accessible (Schmiedtova, 2011). In addition, given the previous literature on limited cognitive resources and the trade-off hypothesis (Skehan, 1998), using L1 to plan may assist the retrieval of background information by freeing speakers from the linguistic constraints of L2, allowing them to generate more ideas about the speaking topic, and thus result in better speaking performance (Friedlander, 1990; Lally, 2000). Findings from this study not only lend empirical support to the facilitative role of language transfer in the conceptualisation stage in speech production theories (Kormos, 2006; Levelt, 1989), but also echo previous research on writing, which found that learners using L1 to plan produced more detail in L2 composition (Stapa & Abdulmejid, 2009). Another possible explanation is that using L1 to plan helps mitigate learners' speaking anxiety and improves their willingness to produce speech with more details, given that speaking anxiety is found to be associated with learners' willingness to communicate (Tavakoli & Davoudi, 2017). In addition, studies have also shown that L2 learners are more willing to express themselves and communicate in L2 classrooms in which the use of L1 is allowed (Stapa & Majid, 2012; Yüzlü & Atay, 2020). Taken together, with increasing willingness to communicate as well as more ideas generated at the conceptualization stage, learners using L1 to plan are likely to produce longer clauses and more subordinate clauses to elaborate on their points, which in turn leads to higher syntactic complexity scores for their L2 speech (Skehan & Foster, 2005).

The facilitative role of L1 planning in the linguistic transformation stage of speaking
While the facilitative effects of using L1 to plan on syntactic complexity might be attributed to more ideas generated at the conceptualization stage, results from this study also showed that planning in L1 leads to more accurate and fluent speech. These findings are unexpected given that the accuracy and fluency of speech are supposed to be key to the linguistic transformation or formulation stage, at which using L2 to plan is more likely to be advantageous (Levelt, 1989). Given the consistency between the language used in planning and speaking, both the content and the linguistic forms of speaking prepared by using L2 in planning are readily available during speaking. Findings from this study, however, stand in contrast to this hypothesis. One possible explanation is that the potential linguistic advantages of L2 planning may not be sufficient to surpass the benefits of using L1 to plan. According to the trade-off hypothesis (Skehan, 1998), speakers have limited attentional and cognitive resources and thus cannot fully attend to every stage of spoken language production. Given that idea generation is always the first and fundamental stage of speaking, speakers are more likely to prioritize the conceptualization stage rather than the linguistic formulation stage (Scott, 1996; Stapa & Majid, 2012). In other words, even though using L2 allows one to prepare both the content and the linguistic forms of speech, speakers are more likely to allocate more attention to preparing the content of speech during planning (Ellis, 2009; Foster & Skehan, 2013). Therefore, planning in L2 may not have significantly more facilitative effects on speech accuracy or fluency compared to planning in L1.
On the contrary, speakers using L1 to plan are subjected to fewer linguistic constraints (Friedlander, 1990) and do not have to attend to formulation issues, such as finding the proper words and following morphological and syntactic rules in L2.In other words, using L1 in planning helps reduce learners' cognitive load on recalling previous linguistic plans and free up attention resources.The lighter burden during planning not only leads to better idea generation but also allows more controlled attention to linguistic formulation during speaking (Foster & Skehan, 2013;Skehan, 1998;Wendel, 1997).In sum, using L1 to plan, L2 learners are likely to better allocate their attention between conceptualization and formulation in speech production.The use of L1 in planning economizes on processing resources and creates more mental space for the consequent linguistic formulation, leading to more accurate and fluent speech.
It is also noteworthy that, while the second research question of this study concerns whether the effect of planning language would differ by learners' L2 proficiency, a significant interaction effect between language use and L2 proficiency was found only for speech accuracy, but not for syntactic complexity, fluency, or lexical diversity. This finding seems to indicate that using L1 to plan largely benefits both high- and low-proficient learners to a similar extent. Although low-proficient learners benefit significantly more from planning in L1 than high-proficient learners in terms of speech accuracy, the difference was not substantial. Our findings on the facilitative effects of planning in L1 regardless of speakers' proficiency also resonate with previous studies, which found that L1 use can serve as a facilitative strategy for both high- and low-proficient L2 writers (Stapa & Majid, 2012; Wang, 2003).

Lack of significant effects of planning language on lexical diversity of speech
Although using L1 to plan has beneficial effects on the syntactic complexity, accuracy, and fluency of speech, no significant difference in lexical diversity was observed either between the two planning languages or between learners with high and low proficiency. This finding challenges the simple view that language transfer will have either facilitative or negative effects in L2 production. In addition, previous studies have shown mixed results in terms of the effect of planning on lexical diversity. While some studies have reported significant differences in lexical diversity under different planning conditions (Bygate, 1996; Tajima, 2003), others found that the lexical richness of L2 writing remains relatively stable under different planning conditions (Lally, 2000; Li et al., 2015). In the present study, using different languages to plan did not lead to any significant difference in the lexical diversity of speech. One explanation for this result may, again, be related to the trade-off hypothesis (Skehan, 1998). As discussed above, speakers with limited cognitive resources tend to channel their attention to preparing the content of speech rather than searching for more complex words during planning. It is also possible that speakers, particularly L2 speakers, may choose to maintain the accuracy of their speech at the price of lexical complexity by using simple words in spontaneous oral production (Skehan, 1998), particularly in a test setting. This is also supported by empirical research showing that test-takers place more value on the fluency and accuracy of their speech and tend to use simple words to avoid errors rather than taking risks with more advanced vocabulary (Li et al., 2015).
Another explanation for our results may be related to the design of this study. On the one hand, although this study adopted the most commonly used design in previous studies by setting a time limit on each planning and speaking task, there is a concern that time constraints on tasks may not lead to any observable change in the lexical diversity of speech (Li et al., 2015). For example, in Mehnert's (1998) study, only planning time above 10 min resulted in greater lexical density. Therefore, one possible account for our findings on lexical diversity is that, under the time constraint, the effect of planning in different languages may not be sufficient to make any stable or significant difference in lexical diversity. On the other hand, as noted previously in the method section, the measurement of lexical diversity adopted in this study, particularly the type-token ratio, may not be a reliable indicator and may not fully capture the advanced vocabulary used by speakers. Additional support for this account can be drawn from a large group of empirical studies that used the type-token ratio and failed to find any significant differences in lexical diversity across different conditions (Kawauchi, 2005; Ortega, 1999; Yuan & Ellis, 2003).
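The concern about the type-token ratio can be illustrated with a minimal sketch. The tokenizer below is a naive assumption (lowercasing and whitespace splitting) and the sample utterance is hypothetical; neither reflects the transcription or scoring procedure actually used in this study.

```python
import string


def tokenize(text):
    """Naive tokenizer (an assumption for illustration only):
    lowercase, split on whitespace, strip surrounding punctuation."""
    words = (w.strip(string.punctuation) for w in text.lower().split())
    return [w for w in words if w]


def type_token_ratio(tokens):
    """Type-token ratio: number of distinct word types / number of tokens."""
    if not tokens:
        raise ValueError("TTR is undefined for an empty token list")
    return len(set(tokens)) / len(tokens)


# Hypothetical utterance: 7 tokens, 7 distinct types.
short_sample = tokenize("I think the first plan is better.")

# The same vocabulary repeated three times: 21 tokens, still 7 types.
long_sample = short_sample * 3

ttr_short = type_token_ratio(short_sample)
ttr_long = type_token_ratio(long_sample)

print(f"short: {ttr_short:.2f}, long: {ttr_long:.2f}")
```

In this toy case the speaker's vocabulary is identical in both samples, yet the ratio falls as the sample grows, which is one reason a fixed-length or length-corrected measure may be preferable when speech samples differ in length.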

Conclusion
To conclude, this study investigated whether the use of different languages in planning affects speaking performance and whether the effect differs by language proficiency. With a total of 84 EFL learners in Chinese universities as participants, this study found that planning in L1 has a significant and substantial facilitative effect on the syntactic complexity, accuracy, and fluency of L2 speech, regardless of learners' L2 proficiency. Further analysis shows that for speech accuracy, the facilitative effect of L1 was stronger among low-proficient than high-proficient learners. There are both theoretical and pedagogical implications to be drawn from these findings. In terms of theoretical contribution, this study builds on classic models of speaking and delves into the conceptualization and linguistic formulation stages of speech production by separating pre-task planning and formal speaking into two activities. Our findings on the facilitative role of using L1 to plan in L2 speaking performance provide empirical evidence for the trade-off hypothesis (Skehan, 1998) and illuminate the complicated interplay between processes in speech production, such as the prioritization of conceptualization over linguistic formulation and of speech accuracy over lexical diversity. Moreover, findings on the facilitative effect of L1 lend empirical support to the claim that information stored in and retrieved from memory is often language-encoded, and that using a more familiar language may help L2 learners brainstorm more ideas. This, in turn, suggests that content and linguistic form are not completely distinct but rather influence each other.
This study also extends previous literature on L2 writing by showing a similarly significant role of L1 in facilitating L2 speaking performance, suggesting some shared cognitive processing patterns between written and spoken production. Furthermore, findings from the present study indicate a potential use of L1 in assisting L2 learners to generate more ideas when given a speaking topic, and thus provide pedagogical implications for educational practitioners on making strategic use of L1 to scaffold idea generation as well as to mitigate learners' speaking anxiety during the preparation stage of speaking. However, it is particularly important to note that the extent to which L1 should be used in learning L2 is by no means conclusive based on the findings from the current study. It is thus suggested that language instructors, rather than blindly recommending the use of L1, should assist learners in developing their own speaking strategies with different degrees of L1 use, and stay open to learners making diverse uses of planning strategies.
Despite the aforementioned theoretical and pedagogical implications, this study has several limitations. First, as discussed earlier, participants' speaking performances in this study were measured by automatic calculations of linguistic features, such as the number of word types and tokens. These measures provided limited insights into the quality of speech content as well as the extent to which advanced vocabulary was used in speaking. The measurement framework of complexity, accuracy, and fluency (CAF), though widely adopted by previous studies, may not fully capture the nuanced differences in learners' speaking performances between different experimental conditions. Additionally, with limited types of planning conditions and test formats, participants' speaking skills were examined under a testing situation and probed by decision-making questions. As noted by Foster (1996), planning is found to benefit accuracy only in decision-making tasks but not in narrative tasks. Thus, findings from the current study do not provide a comprehensive picture of the effect of planning languages and may be applicable only to a small range of educational settings. It is also noteworthy that, by using TOEFL speaking scores as the criterion for classifying participants into different proficiency groups, this study did not take a wide range of factors into account, such as L2 learners' mastery of other literacy skills and vocabulary knowledge. Socio-emotional factors such as speaking anxiety and willingness to communicate were also inadequately investigated in this study. Given the limitations discussed above, future studies should consider whether factors related to and beyond L2 proficiency play a role in speech production, and to what extent planning in L1 might facilitate L2 learners in the long run. Follow-up research with other measures of speaking performance, such as rating scales or subjective judgements made by experienced teachers, may also provide more insights into the nuanced differences between speaking performances under different planning conditions. Potential findings from these studies are likely to enrich the extant understanding of the role of language transfer and planning strategies in L2 speaking and inform better educational practices in L2 classrooms.

Fig. 1 Fig. 2 Fig. 3
Fig. 1 Syntactic complexity scores split by English proficiency (High vs. Low) and planning language (Chinese L1 vs. English L2). Error bars represent standard errors for each dimension

Fig. 4 Fluency scores split by English proficiency (High vs. Low) and planning language (Chinese L1 vs. English L2). Error bars represent standard errors for each dimension

Table 1
Corresponding scores between TOEFL iBT speaking test, CEFR level, and ACTFL assessment

CEFR level ACTFL (LPT, RPT or L&Rcat) ACTFL (OPI, OPIc or WPT)
... previous studies, however, there is a growing concern that TTR might be subject to text length and thus may not be a reliable indicator of lexical diversity. Recent research has proposed vocd-D (Malvern & Richards, 1997) as a more reliable measure of lexical diversity (for reviews on lexical diversity measures, see DeBoer, 2014; McCarthy & Jarvis

Table 2
Descriptive statistics for speaking performances measured by complexity, accuracy, and fluency (CAF) split by English proficiency and language used for planning