Newcastle Electronic Corpus of Tyneside English
Encoding format: TEI XML
The compilation and computerisation of the Newcastle Electronic Corpus of Tyneside English began in 1994 and continued until 2005.
Catherine Cookson Archive of Northumbrian Dialect
Mode of access: Online. OTA website.
Title proper taken from OTA Catalogue Form
The 1960s recordings were gathered by Vince McNeaney during the SSRC-funded “Tyneside Linguistic Survey” (TLS), undertaken by Barbara Strang (Principal Investigator), John Pellowe and associates of Newcastle University. The corpus originally consisted of 86 loosely structured 30-minute interviews recorded onto analog reel-to-reel tapes. The informants were drawn from a stratified random sample of Gateshead in North-East England and were divided among various social class groupings. The TLS collected very detailed social data from its interviewees, including lifestyle factors such as leisure activities, voting preferences, and attitudes to education and parental discipline. The interviews covered a range of topics, and speakers were encouraged to talk about their life histories and their attitudes to the local dialect. At the end of the interview, they were asked whether they knew or used traditional dialect words and were also asked for native-speaker judgements of constructions containing vernacular morphosyntax.
The more recent of the two corpora was collected in the Tyneside area between 1991 and 1994 for the ESRC-funded “Phonological Variation and Change in Contemporary Spoken English” (PVC) project (R000234892), undertaken by Gerard Docherty, James Milroy, Lesley Milroy (Principal Investigator) and associates of the University of Newcastle. The materials comprise 18 digital audio tapes, each of roughly 60 minutes' duration. Dyads of friends or relatives were encouraged to converse freely about a wide range of topics with minimal interference from the fieldworker (Penny Oxley), and informants were again equally divided between various social class groupings. The PVC project recorded far less detailed social data than the TLS.
The Newcastle Electronic Corpus of Tyneside English (NECTE) amalgamates these materials into a single XML-encoded corpus and makes them available in a variety of formats: digitized audio, standard orthographic transcription, phonetic transcription, and part-of-speech-tagged text.
The text files may be downloaded here, but the audio files are available by request only.
Also available at: http://www.ncl.ac.uk/necte/.
Beal, J.C., K.P. Corrigan and H. Moisl (eds.) (to appear). Using Unconventional Digital Language Corpora, Vol. 1: Synchronic Corpora. Houndmills: Palgrave Macmillan.
Beal, J.C., K.P. Corrigan and H. Moisl (eds.) (to appear). Using Unconventional Digital Language Corpora, Vol. 2: Diachronic Corpora. Houndmills: Palgrave Macmillan.