
Table 1 Variables used in this study

From: Predicting the difficulty of EFL reading comprehension tests based on linguistic indices

Indices/variables

Definition

1. Narrativity

(Genre and rhetorical structure level) How much a text tells a story or presents characters, actions and procedures

2. Syntactic simplicity

(Syntax level) The degree to which the sentences in a text contain fewer words and use simpler syntactic structures

3. Word concreteness

(Word level) The degree to which the content words in a text are meaningful and evoke mental images

4. Referential cohesion

(Textbase level) The degree of the connectedness of content words and ideas as the text unfolds

5. Deep cohesion

(Situation model) The extent to which clauses and sentences in a text are connected by causal, intentional, or goal-oriented connectives

6. Content word overlap

(Word level) The measure of the content word overlap between two adjacent sentences

7. Syntactic similarity

(Textbase level) The uniformity of parallel syntactic constructions at the phrase level and also parts of speech

8. CELEX frequency scores

(Word level) The frequency of the words in the CELEX database (Baayen et al., 1996), which is drawn from the early 1991 version of the COBUILD corpus

9. Volume

(Word level) The number of words in a text

10. Abundance

(Word level) The number of distinct word types (lemmas) in a text

11. HD-D

(Word level) The “probability that a word in a text would be included in a random sample from that text” (Kyle et al., 2021, p.7).

12. MATTR

(Word level) MATTR (Moving average type-token ratio) is the average of “type-token ratios across multiple, overlapping, equal sections in the text” (Kyle et al., 2021, p.8).

13. MTLD-original

(Word level) MTLD-original (measure of textual lexical diversity) is the mean number of words needed for the type-token ratio to reach a point of stabilization

14. MTLD-w

(Word level) MTLD-w “moving average is a variant of MTLD that uses a moving average approach” (Kyle et al., 2021, p.8)

15. Flesch-Kincaid Reading Ease

A traditional readability formula that measures text readability according to sentence and word length

16. Coh-Metrix L2 Reading Index

A readability formula that combines three variables: a word overlap index, a word frequency index, and an index of syntactic similarity
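
To make the word-level counts in the table concrete, the following is a minimal Python sketch of Volume (variable 9), Abundance (variable 10), and MATTR (variable 12). It is an illustration, not the tool used in the study. The function names, the regular-expression tokenizer, and the window size of 50 are all assumptions, and lowercased word forms stand in for lemmas because no lemmatizer is used.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (assumed tokenizer, no lemmatization)."""
    return re.findall(r"[a-z']+", text.lower())


def volume(tokens):
    """Volume: the number of word tokens in the text."""
    return len(tokens)


def abundance(tokens):
    """Abundance: the number of distinct types (word forms approximate lemmas here)."""
    return len(set(tokens))


def mattr(tokens, window=50):
    """Moving average type-token ratio over overlapping, equal-sized windows.

    The window size of 50 is an assumed default. Texts no longer than the
    window fall back to the plain type-token ratio.
    """
    if not tokens:
        return 0.0
    if len(tokens) <= window:
        return len(set(tokens)) / len(tokens)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    sample = tokenize("The cat sat on the mat. The dog sat on the cat.")
    print(volume(sample), abundance(sample), round(mattr(sample, window=5), 3))
```

Because each window has the same length, MATTR does not fall automatically as a text gets longer, which is why it is often preferred over the raw type-token ratio when comparing texts of different lengths.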