Table 1 The lexical profile of chemistry RAs based on BNC/COCA lists

From: Developing and validating a mid-frequency word list for chemistry: a corpus-based approach using big data

| BNC-COCA lists | Token | Token% | CumToken% | Type | Group |
|---|---|---|---|---|---|
| 1 | 144158658 | 51.86 | 51.86 | 5639 | 999 |
| 2 | 31538369 | 11.34 | 63.2 | 5273 | 1000 |
| 3 | 30100486 | 10.83 | 74.03 | 5010 | 1000 |
| 4 | 8785085 | 3.16 | 77.19 | 3839 | 996 |
| 5 | 5529036 | 1.99 | 79.18 | 3220 | 985 |
| 6 | 3410422 | 1.23 | 80.41 | 2971 | 976 |
| 7 | 3087538 | 1.11 | 81.52 | 2493 | 955 |
| 8 | 1873841 | 0.67 | 82.19 | 2236 | 923 |
| 9 | 1252030 | 0.45 | 82.64 | 1949 | 907 |
| 10–30 | 9895518 | 3.58 | 86.22 | 16562 | 10800 |
| 31–34 | 20637310 | 7.42 | 93.64 | 16409 | 15002 |
| 0 | 17731707 | 6.38 | 100.02 | 616952 | 616952 |
| Total | 278000000 | | | 682553 | 651495 |
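
The Token% and CumToken% columns follow from the Token column by simple arithmetic. As a minimal sketch (the function name `coverage` and the dictionary layout are illustrative and not taken from the article), the share of each band and the running coverage can be recomputed from the token counts. Recomputed this way, the 10–30 band comes out at 3.56% rather than the printed 3.58%, and the cumulative figure ends at exactly 100. The published 100.02 may reflect summing percentages that had already been rounded, though that is an assumption.

```python
# Token counts per BNC/COCA band, copied from the Token column of Table 1.
# Band "0" is the row labelled 0 in the table.
TOKENS = {
    "1": 144158658,
    "2": 31538369,
    "3": 30100486,
    "4": 8785085,
    "5": 5529036,
    "6": 3410422,
    "7": 3087538,
    "8": 1873841,
    "9": 1252030,
    "10-30": 9895518,
    "31-34": 20637310,
    "0": 17731707,
}


def coverage(counts):
    """Return (band, token %, cumulative token %) rows, rounded to 2 places.

    The cumulative value is computed from the running token total, not by
    adding the rounded percentages, so rounding error does not accumulate.
    """
    total = sum(counts.values())
    running = 0
    rows = []
    for band, n in counts.items():
        running += n
        pct = round(100 * n / total, 2)
        cum_pct = round(100 * running / total, 2)
        rows.append((band, pct, cum_pct))
    return rows


if __name__ == "__main__":
    for band, pct, cum_pct in coverage(TOKENS):
        print(f"{band:>6}  {pct:6.2f}  {cum_pct:6.2f}")
```

Computing the cumulative column from the running token total, rather than from the rounded percentages, is a design choice of this sketch; it keeps the last row at 100.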