Speech, Thought and Writing Presentation Corpus (STWP)


Culpeper, Jonathan; Semino, Elena; Short, Mick; Wynne, Martin


Distributed by the University of Oxford under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.




Editorial Practice

Encoding format: plain text files with a tagging scheme devised by the project team. Tags are between angle brackets.
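As an illustration only, the following minimal sketch shows one way to separate angle-bracket tags from the running text in a file encoded this way. It assumes that a tag is any run of characters between "<" and ">" and that tags do not nest; the function name and the tag names in the sample line are invented placeholders, not part of the project's actual tagging scheme.

```python
import re

# Assumption: a tag is any run of characters enclosed in angle brackets,
# with no nested brackets. The real scheme may be richer than this.
TAG_PATTERN = re.compile(r"<[^<>]*>")


def split_tags_and_text(line):
    """Return (tags, text) for one line of a tagged plain-text file."""
    tags = TAG_PATTERN.findall(line)
    # Replace each tag with a space, then normalise whitespace.
    text = " ".join(TAG_PATTERN.sub(" ", line).split())
    return tags, text


# Hypothetical sample line; "tagA" and "tagB" are made-up tag names.
sample = "<tagA> She said <tagB> that she would come."
tags, text = split_tags_and_text(sample)
print(tags)  # the tags found, in order of appearance
print(text)  # the same line with tags removed
```

Stripping tags in this way would make it possible, for example, to count words in the untagged text; anything beyond that depends on the details of the project's own scheme.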

OTA keywords

Linguistic corpora

LC keywords

Linguistic analysis (Linguistics)

  • designation: CollectionText
  • size: 1 file, ca. 2.3 MB

Creation Date

The resource was created between 1994 and 1997.

Source Description

The texts in the corpus are samples of 2,000 words or fewer from various sources: literary texts taken from the University of Oxford Text Archive or scanned from books; newspaper texts scanned from printed newspapers or taken from online sources; and biography texts scanned from printed books. The original images of the scanned data are no longer available.


A corpus of approximately 260,000 words of modern British narrative texts representing three text types (fiction, newspapers, biography), with detailed annotation of all forms of speech, thought and writing presentation that occur in the corpus.