Parsed Corpus of Early English Correspondence (PCEEC)




Nevalainen, Terttu; Raumolin-Brunberg, Helena; Keränen, Jukka; Nevala, Minna; Nurmi, Arja; Palander-Collin, Minna; Taylor, Ann; Pintzuk, Susan; Warner, Anthony


Use of this resource is restricted in some manner. Usually this means that it is available for non-commercial use only with prior permission of the depositor and on condition that this header is included in its entirety with any copy distributed.



English; English, Middle (1100-1500)

Editorial Practice

Plain text

OTA keywords

Linguistic corpora

LC keywords

Letter writing
English language--Middle English, 1100-1500
English language--Early modern, 1500-1700

  • designation: CollectionText
  • size: 425 files : ca. 214 MB
Creation Date



The Parsed Corpus of Early English Correspondence contains 4970 personal letters by 666 writers, altogether 2.2 million words of running text from the years 1410?-1681. The letters have been selected to represent the literate social ranks of the time as fully as possible.

In addition to the flat text version, the corpus is provided in a part-of-speech tagged version and a parsed version. These two versions contain the same texts as the flat text version, together with the additional linguistic coding.

The corpus is also provided with two manuals, one outlining the corpus, the other explaining the annotation.

Nevalainen, Terttu and Helena Raumolin-Brunberg. 2003. Historical sociolinguistics. London: Longman.

Nevalainen, Terttu and Helena Raumolin-Brunberg (eds). 1996. Sociolinguistics and Language History. Studies Based on The Corpus of Early English Correspondence. (Language and Computers 15). Amsterdam and Atlanta: Rodopi.

The Corpus of Early English Correspondence Sampler (CEECS, identification number 2461), published in 1998 and deposited in the University of Oxford Text Archive in 2003, is a flat text version of some of the texts included in PCEEC. The full Corpus of Early English Correspondence (CEEC) was completed in 1998. It contains texts which, for copyright reasons, are not included in either CEECS or PCEEC, but which are available in digitised form for in-house use by the CEEC project team. The CEEC is being supplemented by an extension (CEECE, 1682-1800) and a supplement (1403-1681); these two corpora are still being compiled and are in in-house use in Helsinki.

Permanent URL