Newcastle Electronic Corpus of Tyneside English

  • A Linguistic ‘Time-Capsule’: The Newcastle Electronic Corpus of Tyneside English

Corrigan, Karen; Moisl, Hermann; Beal, Joan


Distributed by the University of Oxford under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Download: zip



Editorial Practice

Encoding format: TEI XML

OTA keywords

Linguistic corpora

LC keywords

Language surveys
Linguistic geography

  • designation: CollectionText
  • size: 67 files : ca. 24 MB
  • designation: CollectionSound
  • size: 55 files : ca. 3.79 GB (offline)
Creation Date


Source Description

Catherine Cookson Archive of Northumbrian Dialect

A thoroughly enlarged, improved and revised version of


The 1960s recordings were gathered by Vince McNeaney during the SSRC-funded “Tyneside Linguistic Survey” (TLS) undertaken by Barbara Strang (Principal Investigator), John Pellowe and associates of Newcastle University. The corpus originally consisted of 86 loosely structured 30-minute interviews recorded onto analog reel-to-reel tapes. The informants were drawn from a stratified random sample of Gateshead in North-East England and were divided among various social class groupings. The TLS collected detailed social data from its interviewees, including lifestyle factors such as leisure activities, voting preferences, and attitudes to education and parental discipline. The interviews covered a range of topics, and speakers were encouraged to talk about their life histories and their attitudes to the local dialect. At the end of each interview, informants were asked whether they knew or used traditional dialect words and to give native-speaker judgements of constructions containing vernacular morphosyntax.

The more recent of the two corpora was collected in the Tyneside area between 1991 and 1994 for the ESRC-funded “Phonological Variation and Change in Contemporary Spoken English” (PVC) project (R000234892) undertaken by Gerard Docherty, James Milroy, Lesley Milroy (Principal Investigator) and associates of the University of Newcastle. The materials comprise 18 digital audio tapes, each of roughly 60 minutes' duration. Dyads of friends or relatives were encouraged to converse freely about a wide range of topics with minimal interference from the fieldworker (Penny Oxley), and informants were again equally divided among various social class groupings. The PVC project recorded far less social data than the TLS.

The Newcastle Electronic Corpus of Tyneside English (NECTE) amalgamates these materials into a single XML-encoded corpus and makes them available in a variety of formats: digitized audio, standard orthographic transcription, phonetic transcription, and part-of-speech-tagged text.

The text files may be downloaded here, but the audio files are available by request only.

Also available at:

Beal, J.C., K.P. Corrigan and H. Moisl (eds.) (to appear). Using Unconventional Digital Language Corpora, Vol. 1: Synchronic Corpora. Houndmills: Palgrave Macmillan.

Beal, J.C., K.P. Corrigan and H. Moisl (eds.) (to appear). Using Unconventional Digital Language Corpora, Vol. 2: Diachronic Corpora. Houndmills: Palgrave Macmillan.

Permanent URL