Comments and Questions

Use this page to ask a question or tell us something about Text Inspector. Or you can email us at


  1. Dear Text Inspector team,

    I was wondering how to interpret the frequency metrics in your tool (Lexis: BNC or COCA). In my report, I have 1K -2K as 11.26%, for example. What does that mean? Does it mean that I have 11% of words with 1000-2000 word frequency (which is not very frequent)? The higher the K is, the higher the frequency?


    1. Text Inspector Help Team February 13, 2018 at 11:54 pm

      Dear Irina

      Thank you for your query.

      So, if the metric says 1K-2K is 11.26%, it means that 11.26% of the words used in the text belong to the 1K-2K lists in BNC or COCA.

      1K-2K lists mean that these words are very frequently found in the respective corpus (BNC, COCA). When you see a term like ‘1K list’, that means it is a vocabulary list of the 1000 words found to be used most frequently in that corpus. So 1K means the 1000 most frequently used words, and 2K means the next 1000.

      The higher the percentage of K1-K2 (or whatever early numbers come with the ‘K’), the more frequently used vocabulary the text includes. If there are some words used from the K6, 7 or higher, then they are regarded as less frequently used words and therefore tend to be more sophisticated or difficult words. (There’s also discussion around the issue that the less frequently used doesn’t necessarily mean they are sophisticated or more difficult – but as a general trend, that’s the way it’s interpreted.)
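To make the arithmetic behind a figure like "1K-2K: 11.26%" concrete, here is a minimal Python sketch. It assumes a hypothetical `ranked_words` list (most frequent word first) standing in for a BNC or COCA list. The function name `band_coverage` and the toy data are illustrative only, and this is not Text Inspector's actual implementation.

```python
from collections import Counter

def band_coverage(tokens, ranked_words, band_size=1000):
    """Return the percentage of tokens that fall in each frequency band.

    ranked_words is a hypothetical list sorted from most to least frequent;
    band 1 holds ranks 1..band_size, band 2 the next band_size words, etc.
    Tokens not found in the list are counted as "off-list".
    """
    rank = {word: i for i, word in enumerate(ranked_words)}
    counts = Counter()
    for token in tokens:
        position = rank.get(token.lower())
        band = "off-list" if position is None else position // band_size + 1
        counts[band] += 1
    total = len(tokens)
    return {band: 100 * n / total for band, n in counts.items()}

# Toy data: a 4-word "corpus list" with band_size=2,
# so band 1 = {the, of} and band 2 = {cat, sat}.
ranked = ["the", "of", "cat", "sat"]
text = "The cat sat of the mat".split()
coverage = band_coverage(text, ranked, band_size=2)
print(coverage)
```

With a real list and `band_size=1000`, the value stored under band 2 would correspond to the share of tokens drawn from the second thousand most frequent words.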

      Thomas Cobb at the University of Quebec, and Norbert Schmitt at the University of Nottingham are academics who have published a lot in the area of frequency metrics if you would like to read further.

      If you have any other questions please do not hesitate to ask.

      Text Inspector Help Team.


      1. Thank you so much for your reply! Textinspector is a great tool, I am using it in my PhD readability research and I am so happy it shows a variety of readability metrics for a given text.


  2. Sonthaya Rattanasak July 24, 2017 at 7:35 pm


    I intend to subscribe for a one-month period. Do I need to turn off any auto renewal?
    Please advise, as I can’t seem to find the setting for this.



    1. Text Inspector Help Team August 14, 2017 at 4:35 pm

      Hello – If you subscribe and then turn off auto renewal, you will still get your full month’s use as expected.

      If you ever have any problems please simply email us on and we can give you a refund for any payment you did not mean to send!

      Best wishes,

      Julie Harris
      Text Inspector Help Team


  3. Hi, can I purchase the individual copy and then later upgrade to the organization one? What are the charges for the upgrade?

    Kind Regards,

    Nathan M. Kimaku
    Information Technology (IT) Assistant
    East African Educational Publishers Limited


    1. Hello, there is no charge for an upgrade. If you switch from one package to another and there is a price difference we will refund you.


  4. Philip McCarthy July 1, 2017 at 12:54 pm

    Hi … I’m Phil McCarthy … I’m the one who made MTLD.

    First, this is a GREAT TOOL … awesome that you’ve put it out there so that people can use lexical diversity in assessments.

    Second, I read here that some people wonder why we recommended that researchers use MTLD, voc-D (HD-D) and Maas given that only MTLD seems to be completely independent of text length. I think there were at least three reasons for that. (Note, the below are my thoughts, and Scott Jarvis may think otherwise … indeed, see his book on the matter).

    1/ We didn’t make MTLD just to have a competition with other researchers (something like “Hey, we’re the best!”). We did a lot of work to validate MTLD, but it would be arrogant of us to assume we had covered every angle. The other measures certainly seem to do a very good job, and the other measures only become “problematic” when text length is quite varied … leading to … if the text lengths are THAT varied then is there something else of importance that the researcher(s) should be considering? (Scott Jarvis has written extensively on this.)

    2/ The three measures we recommend work in very different ways; most notably, MTLD assesses text sequentially whereas the other measures swallow the text whole. This means that if you randomize the text, MTLD would give you a completely different value whereas the others would give you the same value each time. So, the question becomes, “Is the sequence of the wording of the text a/the quintessential characteristic of the text?” Well, if for you it is, then MTLD is the only way to go because only MTLD assesses text in that way. So, bottom line, the researcher(s) have to make that call.
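The point about sequence can be illustrated with a small Python sketch. It uses a simplified MTLD (forward and backward passes, factor threshold 0.72) and plain type/token ratio as the "whole text" measure. The function names and the two toy token lists are assumptions made for illustration, the handling of the final partial factor is a simplification, and texts this short are far too small for a meaningful MTLD in practice.

```python
def ttr(tokens):
    """Type/token ratio: ignores word order entirely."""
    return len(set(tokens)) / len(tokens)

def mtld_one_pass(tokens, threshold=0.72):
    """Count how many times the running TTR drops to the threshold."""
    factors = 0.0
    types, count = set(), 0
    for token in tokens:
        count += 1
        types.add(token)
        if len(types) / count <= threshold:
            factors += 1          # a full factor is complete; start again
            types, count = set(), 0
    if count > 0:
        # leftover segment contributes a partial factor
        factors += (1 - len(types) / count) / (1 - threshold)
    return len(tokens) / factors if factors > 0 else float("inf")

def mtld(tokens, threshold=0.72):
    """Average of a forward and a backward pass over the text."""
    forward = mtld_one_pass(tokens, threshold)
    backward = mtld_one_pass(tokens[::-1], threshold)
    return (forward + backward) / 2

# Same 12 tokens, two different orders (a fixed stand-in for shuffling).
clustered = ["a"] * 6 + ["b", "c", "d", "e", "f", "g"]
mixed = ["a", "b", "a", "c", "a", "d", "a", "e", "a", "f", "a", "g"]

print(ttr(clustered), ttr(mixed))    # identical: order does not matter
print(mtld(clustered), mtld(mixed))  # different: order matters
```

Reordering the same words leaves the type/token ratio untouched but changes the MTLD value, which is the sense in which MTLD reads the text sequentially.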

    3/ I’d like to see a study where certain texts were identified as having marked variation depending on the LD measure used. For example, imagine a corpus of 1000 texts where Text X was 12th most diverse by Measure A, 136th most diverse by Measure B, and 389th most diverse by Measure C. I think it would be valuable to identify such texts and work out “what’s going on!” Such analysis would undoubtedly be revealing of textual characteristics and many other things. If we only go in armed with one measure then we miss the potential to see such outcomes, which may be critical to the understanding of the research.
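A study of that kind could start from something like the following Python sketch, which ranks each text under each measure and reports how far the ranks disagree. The measure names, text ids and scores are invented for illustration, and the sketch assumes that a higher score means more diverse (for an index where lower values mean more diversity, negate the scores first).

```python
def ranks(scores):
    """Map each text id to its rank under one measure (1 = most diverse)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {text_id: pos for pos, text_id in enumerate(ordered, start=1)}

def rank_spread(scores_by_measure):
    """For each text, return the gap between its best and worst rank."""
    per_measure = {m: ranks(s) for m, s in scores_by_measure.items()}
    text_ids = next(iter(scores_by_measure.values())).keys()
    spread = {}
    for text_id in text_ids:
        text_ranks = [per_measure[m][text_id] for m in per_measure]
        spread[text_id] = max(text_ranks) - min(text_ranks)
    return spread

# Invented scores for four texts under three hypothetical measures.
scores_by_measure = {
    "A": {"t1": 0.9, "t2": 0.8, "t3": 0.7, "t4": 0.6},
    "B": {"t1": 0.5, "t2": 0.9, "t3": 0.7, "t4": 0.6},
    "C": {"t1": 0.6, "t2": 0.8, "t3": 0.7, "t4": 0.5},
}
spread = rank_spread(scores_by_measure)
most_divergent = max(spread, key=spread.get)
print(spread, most_divergent)
```

Texts with the largest spread are the ones worth reading closely to work out what the measures are responding to.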

    Finally, when we say “voc-D OR HD-D”, we have to remember that voc-D is … really speaking … just an estimation (albeit a great one) of HD-D (see McCarthy and Jarvis 2007). voc-D has been so widely used that HD-D is unlikely to “replace” it, but … again … really speaking … if your research question/text types make you want to use voc-D then … hmmm … you probably really should be using HD-D. That said, again, because voc-D has so much history now, you’d have fewer problems getting published with voc-D than HD-D. So, on that score, if you’re looking for some research to do, then writing a paper called “Use HD-D, not voc-D” would possibly be useful.
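For readers who want to see why HD-D needs no random sampling, here is a hedged Python sketch. It assumes the usual description of HD-D: for each word type, take the hypergeometric probability that the type appears at least once in a random sample of 42 tokens, and sum each type's contribution scaled by 1/42. The function name `hdd` and the scaling convention are assumptions, so check them against your reference before relying on the numbers.

```python
from collections import Counter
from math import comb

def hdd(tokens, sample_size=42):
    """Sketch of HD-D under the assumptions stated above."""
    n_tokens = len(tokens)
    if n_tokens < sample_size:
        raise ValueError("text must contain at least sample_size tokens")
    all_samples = comb(n_tokens, sample_size)
    total = 0.0
    for freq in Counter(tokens).values():
        # Probability that none of this type's tokens land in the sample.
        p_absent = comb(n_tokens - freq, sample_size) / all_samples
        total += (1 - p_absent) / sample_size
    return total

all_different = [f"w{i}" for i in range(42)]
all_same = ["a"] * 42
print(hdd(all_different), hdd(all_same))
```

Because the probabilities are computed exactly rather than estimated from repeated random samples, the same text gives the same value every time, and the value does not depend on word order.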


    1. Text Inspector Help Team July 2, 2017 at 12:02 pm

      Many thanks – it is great to get an expert view!
      I hope that you don’t mind if we put this onto our Lexical Diversity HELP pages too so that it will be in a more accessible place for people to see?

      Thanks again

      Stephen Bax


  5. I would like to ask a question. I plan to use this tool for analyzing texts, and I was wondering if there are any validation research studies on this tool.

    Thank you so much,

    I look forward to hearing from you.

    Thank you,


    1. Text Inspector Help Team June 10, 2017 at 2:17 pm

      Hello and thank you for your query. The main research study, which lies behind the statistics in the Scorecard, was carried out by Professor Stephen Bax on data including thousands of writing, reading and listening texts. This gave the metrics which we use to calculate the statistics. At the moment it is in preparation for publication, but we aim to release some pre-publication details on our Help pages very soon. Thanks.


  6. Hi,
    I just learned about your web page from the ELTons awards, and I want to be clear about how to use this measuring tool for my classes at school.
    I wonder in what ways I can use this tool for primary and secondary classes.


    1. Text Inspector Help Team May 12, 2017 at 5:31 pm

      Hello and thank you for your question. You can use Text Inspector by checking any reading or listening text you want to use in your teaching to see if the vocabulary level is correct for the class you are teaching.

      You can also use it to check the written work of your students to see the level of vocabulary they are using, and help them to use more advanced vocabulary.

      We will soon put some videos on our Help pages to give teachers more ideas.



  7. Thomas Zapounidis December 30, 2016 at 11:32 am

    Dear Textinspector,
    Congratulations once more on your software. It is a handy tool in the hands of many researchers.
    I am currently analysing some learner words from different language skills as to their EVP and BNC coverage. I have two questions:
    1) How is the token defined by your software? I’m asking because running the same amount of text with other software (e.g. AntConc) I get different token counts.
    2) I have run some of my words through Text Inspector and found that, with reference to months, “April” is given as not listed in your BNC list (while in the written BNC list that I found on Laurence Anthony’s page it is ranked 660).
    Thank you for your time,
    Kind Regards,
    Thomas Zapounidis


    1. Text Inspector Help Team January 14, 2017 at 4:51 pm

      Hello Thomas and thank you for your question and kind words!

      Tokens and Types:
      It is true that different software tools will analyse tokens and types slightly differently. It is difficult for us to say how we compare with others, but the point is that we show you the analysis in full detail and allow you to change the number of tokens and types yourself if you want to count them differently. This is unique to Text Inspector.
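As an illustration of why counts differ between tools, the Python sketch below applies three plausible tokenisation rules to the same sentence. None of these rules is claimed to be the one used by Text Inspector or AntConc; they are assumptions chosen only to show how contractions, hyphens and numbers change the total.

```python
import re

text = "Don't over-think it: it's 3.5 km."

# Rule 1: split on whitespace only (punctuation stays attached).
by_whitespace = text.split()

# Rule 2: runs of letters only (contractions and hyphenated words split,
# numbers dropped).
letters_only = re.findall(r"[A-Za-z]+", text)

# Rule 3: word characters, keeping internal apostrophes and hyphens
# (the decimal number still splits at the full stop).
keep_internal = re.findall(r"\w+(?:['-]\w+)*", text)

print(len(by_whitespace), by_whitespace)
print(len(letters_only), letters_only)
print(len(keep_internal), keep_internal)
```

The same six-word sentence yields three different token counts, so a small difference between two tools on a long text is to be expected.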

      We use the BNC list from and you are right – it seems to be missing April for some strange reason (though the other months are there). We will add it in the next upgrade. Thanks for noticing it.

      To include it in your analysis, you could add it to the count manually as a Known Word by using the tool on the main page:

      a) click on Use custom known words list below the Text entry box
      b) add ‘April’
      c) run the analysis

      Your word will appear in the count as a Known Word so you can calculate its frequency.



  8. Hello,
    Thanks for making such an amazing tool.
    Is it alright if I use this tool in my research project?
    And if so, how would you prefer to be cited?

    Thank you very much.


    1. Text Inspector Help Team December 14, 2016 at 5:54 pm

      Hello, thank you for your comments, and yes please do use it in your research!

      You could cite it as
      Text Inspector (2016) Online lexis analysis tool at [Accessed 14/12/2016]

      Best wishes,
      Text Inspector Help Team


  9. Hello there. I’ve just scanned a reading text here and then I saw its lexical profile was displayed as D ..
    My question:
    1. Does this D mean “Distinguished (Above C2)”?
    2. or Does this D mean “Level 4 D = B2”?

    I am confused because I don’t know what this D means ..
    When I check this against the ACTFL Proficiency Guidelines, D means above C2.
    When I check this against UCL CLIE, D means B2.

    Need your explanation. Thank you.


    1. Text Inspector Help Team May 20, 2016 at 6:33 pm

      Hello – on this website we use D to mean “a level above C2”.


  10. First of all, I would like to thank you for the incredible tool the simple versions of which you provide for free.
    I intend to use textinspector in my post-graduate research to investigate lexical diversity. After having browsed your site and tried the tool a couple of times I’m not sure which score is the one I should be paying attention to. I mean, I’ve seen the number of types, tokens, type/token ratio, syllables and so on (all very interesting), but I expected to see one numerical result that expressed lexical diversity. Am I missing something?


    1. Text Inspector Help Team April 28, 2016 at 2:05 pm

      Thank you Gina. We also think that Text Inspector is incredible!

      The score you should look at is the top line on the Lexical Diversity page. This gives you two different measures of Lexical Diversity, called VocD and MTLD.

      However, both of them are approximate, and involve SAMPLING parts of your text, so if you put a text in twice you will get a slightly different result! You will also sometimes get a very different result from MTLD and VocD, owing to the different ways they sample the text. If you are doing research you should compare different measures and then evaluate the results, also looking closely at other statistics about the vocabulary in your text (e.g. in the BNC and COCA areas).

      Another useful online tool you could look at is Paul Meara’s tool for measuring D at If you put the same text into that tool and Text Inspector you will get approximately the same measure, but again with slight differences owing to differences in calculation and sampling.

      Best wishes, Tom

      Text Inspector Help Team


      1. Thank you for your reply, your guidance and your suggestions. I found it and I also subscribed for the standard plan, so I have more info at my disposal.
        I know about the two different measurements of Voc-D and MTLD that attempt to measure lexical diversity better than type/token, but could you explain why sometimes the two scores vary a lot and sometimes very little? And am I supposed to find the mean score of these two? I probably need to study it, but your explanation would be a good start.
        Thank you.


        1. Text Inspector Help Team April 28, 2016 at 9:41 pm

          I agree that it is not clear or easy. If you look at the article here (which is McCarthy, P.M., & Jarvis, S. (2010). MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment, Behavior Research Methods, 42(2): 381-392) you will see that they say this (in the abstract):

          “MTLD performs well with respect to all four types of validity and is, in fact, the only index not found to vary as a function of text length.”

          So if there is a difference when you use voc-D and MTLD, it might be because of text length, and it seems that MTLD might be the more reliable. They also say:

          “three of the indices—MTLD, vocd-D (or HD-D), and Maas — appear to capture unique lexical information”. (My emphasis) And also:

          “We conclude by advising researchers to consider using MTLD, vocd-D (or HD-D), and Maas in their studies, rather than any single index, noting that lexical diversity can be assessed in many ways and each approach may be informative as to the construct under investigation.”

          I agree that this is not too helpful – they are basically saying that these indices give different information about your text but they don’t say what! They then say to try several of them, not only one.

          I suggest that to start with, if you are comparing texts, you choose ONE of them (we prefer MTLD) and also research what that index actually means by looking in depth at the actual words in the text.


  11. Mary Grace Portelli August 21, 2015 at 2:16 pm

    I like to write my own tests for my students, but I never quite know what is a safe vocabulary profile percentage to aim for with regard to the targeted level. For example, if I’m planning a level B2 test, what, in your opinion, would be a safe percentage of level B2 vocabulary to aim for in the input text? Would you say that around 12% level B2 vocabulary in a text (+ around 1% at C1) is safe enough to ensure that the level of the text is B2?


    1. Text Inspector Help Team September 7, 2015 at 10:00 am

      Mary, your question is exactly what we are working on now. We are researching large samples of reading texts at different CEFR levels, so that in future you can put a text into TI and then it will give you an approximate CEFR level based on a number of key indicators. (At the moment you get a Lexical Profile but it is based on student writing, so it is not fully applicable to reading texts.)

      An example of our work on reading texts: we have found that the percentage of B2 vocabulary in a reading text, as measured using the EVP tool, seems to be a statistically significant indicator of your text’s overall level.

      So in the very near future TI will offer a more advanced measure of a range of indicators so you, as a teacher, can get a quick idea of whether your chosen reading text is in fact right for your students’ level.


  12. I’m currently conducting post-graduate research in corpus linguistics using statistical analysis and text mining techniques and would like to expand on this in future research. Have this corpus and the corresponding tools been used in academic research in fields like applied linguistics? Are resources available for someone who might wish to use these tools in a research project?


    1. Text Inspector Help Team September 7, 2015 at 9:53 am

      David, thanks for your question. As Text Inspector is quite new it has not yet been used in many research projects. Also, since some of the measures in TI are unique (such as the EVP tool), they are only now available to researchers.

      However, research is now being conducted using TI with large banks of texts to see which measures seem to correlate with which levels of difficulty. At the moment, when you input a text of more than 100 words, you get a Text Inspector Lexical Profile, and this is based on extensive research using written texts. Similar work is under way using reading and listening texts.

      To answer your other question – we feel that TI is very well suited to large scale research projects….


  13. I find Text Inspector very useful, but it does not clearly state the level of the text. So I would like to know which indicator tells me the exact overall level of a text, like B1.


    1. Text Inspector help team December 27, 2014 at 11:40 pm

      Thanks abdikarim

      You are right. We are developing a measure so that it will tell you more clearly what the statistics mean for each text, so it will give you a good guide about the level of your text – A1, B2 and so on.

      This needs some careful research on a large body of texts, so it will take some time, but when it is ready we will publish it on the site and also here.

      Thanks for your interest.

