F-LOB and Frown


Part-of-speech tagging for the Freiburg updates of the LOB and Brown corpora


The Freiburg updates of the well-known LOB and Brown reference corpora of British and American English (F-LOB and "Frown" for short) have been used with profit in a large number of studies on ongoing linguistic change in present-day English, both by our Freiburg-based research team and others.

"Tagged" versions of the corpora, that is, versions in which each word is followed by an automatically assigned and manually post-edited part-of-speech indicator (a "tag", cf. example text), have greatly expanded the range of investigations which can be carried out on the basis of this valuable material. For example, rather than looking for individual instances of phrasal verbs, it is possible to ask whether verb+particle combinations as a group have become more frequent, or whether there have been shifts in the relative frequency of nouns or verbs in individual genres or in the language as a whole. This project was undertaken jointly with Prof. Geoffrey Leech's team at Lancaster.
The F-LOB and Frown corpora consist of the same text categories as LOB and Brown (cf. table). Each corpus comprises 500 texts of about 2,000 words each, all of them written, edited, and published, which amounts to a total of about one million words per corpus.
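
The kind of query a tagged corpus makes possible, such as counting verb+particle combinations as a group, can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration rather than a description of the distributed files: the "word_TAG" token format, the underscore separator, the convention that verb tags begin with "V", the particle tag "RP", and the invented sample sentence. The sketch only counts a verb immediately followed by a particle, so split combinations (verb, object, particle) are missed.

```python
from collections import Counter


def parse_tagged(text, sep="_"):
    """Split whitespace-separated 'word_TAG' tokens into (word, tag) pairs.

    The 'word_TAG' layout and the separator are assumptions for this sketch.
    Tokens without a separator are skipped.
    """
    pairs = []
    for token in text.split():
        word, found, tag = token.rpartition(sep)
        if found:
            pairs.append((word, tag))
    return pairs


def count_verb_particle(pairs, verb_prefix="V", particle_tag="RP"):
    """Count a verb tag immediately followed by a particle tag.

    The tag names are hypothetical defaults; adjust them to the tagset used.
    """
    return sum(
        1
        for (_, first), (_, second) in zip(pairs, pairs[1:])
        if first.startswith(verb_prefix) and second == particle_tag
    )


def per_million(count, total_tokens):
    """Normalise a raw count to a frequency per million tokens."""
    return count * 1_000_000 / total_tokens if total_tokens else 0.0


# Invented sample sentence, not taken from either corpus.
sample = "She_PPHS1 gave_VVD up_RP and_CC sat_VVD down_RP ._."
pairs = parse_tagged(sample)
tag_counts = Counter(tag for _, tag in pairs)
verb_particle_hits = count_verb_particle(pairs)
```

Because each corpus contains about one million words, raw counts and per-million frequencies are close to each other at the level of the whole corpus; normalisation matters mainly when comparing individual genres of different sizes.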

