The Parts of Speech Tagger tool

The Parts of Speech Tagger tool analyses your text and labels each word according to the role it plays in a sentence (its part of speech), such as noun, verb or adjective.

The tool is based on a modified version of TreeTagger, which was developed by Helmut Schmid in the TC project at the Institute for Computational Linguistics of the University of Stuttgart.

By using it, you can better understand how a language works and therefore find that language easier to teach and learn. If you’re a student of linguistics or involved in linguistics research, the tool can also help in the development and use of corpora.


Why use a language tagger tool?

Tagger tools are extremely useful when it comes to both studying and using language.

They analyse the basic grammar of a text and ‘label’ it with the appropriate parts of speech.

By doing this, we can better understand how a particular language works (in our case, English) and therefore improve our learning, teaching and linguistic study of that language.

Importantly, taggers also help distinguish homographs (words that are spelled the same but differ in meaning), which can often pose problems for ESL students and linguists alike.

For example, ‘leaves’ (verb) and ‘leaves’ (noun) are written exactly the same but they are indeed different words with very different meanings.

The POS tagger tool helps solve these problems with just a click of a button. 


How to use the Tagger tool

When you analyse your text you’ll be taken first to a full summary of the analysis.

Look to the menu on the left side of the page and then click the menu item ‘Tagger’.

You will then see a full summary of the POS tagger analysis including a breakdown into tokens, types, elements and token/type ratio. 
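As a rough illustration of what the token and type counts mean, here is a minimal sketch in Python. It assumes a naive tokeniser (lowercased runs of letters, digits and apostrophes) and assumes the ratio is types divided by tokens; the function name `token_type_summary` is hypothetical, and the tool's own tokenisation and calculation may differ.

```python
import re

def token_type_summary(text):
    """Count tokens (running words) and types (distinct words) in a text.

    Assumption: words are lowercased runs of letters, digits or apostrophes.
    The real tool may tokenise differently.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    types = set(tokens)
    # Assumption: ratio = types / tokens; guard against empty input.
    ratio = len(types) / len(tokens) if tokens else 0.0
    return {
        "tokens": len(tokens),
        "types": len(types),
        "type_token_ratio": ratio,
    }

summary = token_type_summary("The cat sat on the mat because the mat was warm.")
print(summary["tokens"], summary["types"])
```

In this sentence ‘the’ and ‘mat’ are repeated, so there are fewer types than tokens.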

You will also find information here on each of the parts of speech identified, as well as a detailed analysis which can be exported if you’re a subscriber.

If you believe that the tagger has not analysed a word correctly, you can go down to the section at the bottom of the page called ‘Amend Text’.

Then click on the tag, select the most accurate option and then click ‘Update’.

This will change the totals in the Summary.

We highly encourage you to correct any inaccuracies you find as this will influence the rest of your analysis.
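To see why amending a single tag changes the totals, here is a small sketch in Python. The tagged sentence, the deliberately wrong initial tag and the helper name `tag_totals` are all illustrative assumptions; this is not the tool's internal code.

```python
from collections import Counter

def tag_totals(tagged_tokens):
    """Return how many times each tag occurs in a list of (word, tag) pairs."""
    return Counter(tag for _, tag in tagged_tokens)

# Hypothetical tagger output for "The leaves fall", where 'leaves'
# has been wrongly tagged as a verb (VVZ) instead of a plural noun (NNS).
tagged = [("The", "DAT"), ("leaves", "VVZ"), ("fall", "VVP")]
before = tag_totals(tagged)

# Amend the tag of the second token, as you would under 'Amend Text'.
tagged[1] = ("leaves", "NNS")
after = tag_totals(tagged)

print(before)
print(after)
```

After the correction, the verb count drops by one and the plural noun count rises by one, which is the kind of change you would expect to see reflected in the Summary.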


The Text Inspector POS tool tags


In addition to the 58 tags used by the original ‘TreeTagger’ software, we’ve added a further five to help improve the accuracy of the Text Inspector POS tagger tool.


This is the list of the 63 tags used in the Text Inspector Tagger tool:

CO = coordinating conjunction
CD = cardinal number
DT = determiner
DAT = determiner, article
DTW = wh-determiner
EX = existential there
FW = foreign word
IN = preposition/subordinating conjunction
THAT = complementizer
JJ = adjective
JJR = adjective, comparative
JJS = adjective, superlative
LS = list marker
MD = modal
NN = noun, singular or mass
NNS = noun, plural
NP = proper noun, singular
NPS = proper noun, plural
PDT = predeterminer
GE = possessive ending
PN = pronoun neutral
PP = personal pronoun
PP$ = possessive pronoun
PWH = wh-pronoun
PWS = possessive wh-pronoun
RB = adverb
RBR = adverb, comparative
RBS = adverb, superlative
QQ = particle
STOP = end punctuation
SYM = symbol
TO = to
UH = interjection
VB = verb be, base form
VBD = verb be, past
VBG = verb be, gerund/participle
VBN = verb be, past participle
VBZ = verb be, present, 3rd person singular
VBP = verb be, present, non-3rd person
VD = verb do, base form
VDD = verb do, past
VDG = verb do, gerund/participle
VDN = verb do, past participle
VDZ = verb do, present, 3rd person singular
VDP = verb do, present, non-3rd person
VH = verb have, base form
VHD = verb have, past
VHG = verb have, gerund/participle
VHN = verb have, past participle
VHZ = verb have, present, 3rd person singular
VHP = verb have, present, non-3rd person
VV = verb, base form
VVD = verb, past tense
VVG = verb, gerund/participle
VVN = verb, past participle
VVP = verb, present, non-3rd person
VVZ = verb, present, 3rd person singular
WRB = wh-adverb
XX = negative particle
: = general joiner
$ = currency symbol
, = comma
??? = unknown token type
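The verb tags in the list follow a regular pattern: the first two letters identify the verb (VB, VD, VH or VV) and any third letter identifies the form. The following Python sketch decodes a verb tag on that basis. The function name `decode_verb_tag` and the wording of the descriptions are illustrative assumptions, not part of the tool.

```python
# First two letters: which verb the tag refers to.
VERB_FAMILIES = {"VB": "be", "VD": "do", "VH": "have", "VV": "other verb"}

# Remaining letter (if any): the form of the verb.
VERB_FORMS = {
    "": "base form",
    "D": "past",
    "G": "gerund/participle",
    "N": "past participle",
    "Z": "present, 3rd person singular",
    "P": "present, non-3rd person",
}

def decode_verb_tag(tag):
    """Split a verb tag into (verb, form); return None for non-verb tags."""
    family = VERB_FAMILIES.get(tag[:2])
    form = VERB_FORMS.get(tag[2:])
    if family is None or form is None:
        return None
    return family, form

print(decode_verb_tag("VHZ"))
print(decode_verb_tag("VV"))
print(decode_verb_tag("NN"))
```

A helper like this can be handy if you export the detailed analysis and want to group verb forms in your own scripts.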