The Parsed Corpus of Middle English Poetry (PCMEP)


Welcome to the Homepage of the Parsed Corpus of Middle English Poetry.

Overview. The PCMEP is a fully parsed and annotated corpus of Middle English verse texts. It currently includes 52 Middle English poems with a total of 223,602 words. Click on 'Text Information' for details on every text.

Annotation. The PCMEP is parsed according to the same guidelines as its sister corpus, the Penn-Helsinki Parsed Corpus of Middle English, second edition (PPCME2) (Kroch & Taylor 2000). Researchers familiar with the PPCME2 therefore do not have to learn a new annotation scheme and can run their PPCME2 search queries directly on the PCMEP text files. For the PPCME2 annotation manual, click here. For an overview of the minor annotation differences between the PCMEP and the PPCME2, click here.

Periodization. The main goal of the corpus is to help close the substantial gap in English prose texts between c. 1250 and 1350 with available poetic records from the same period. To make it possible to assess the genre difference between prose and poetry, the corpus covers a slightly greater time span, namely c. 1150 to 1420. This interval corresponds to the periods M1, M2 and M3 established in the Helsinki Corpus. The PPCME2 uses the same periodization system. The PCMEP splits M1 and M2 into sub-periods. The correspondence between the different periodization systems is shown in the table below:

PCMEP	Helsinki/PPCME2
M1a (1150-1200)	M1 (1150-1250)
M1b (1200-1250)	M1 (1150-1250)
M2a (1250-1300)	M2 (1250-1350)
M2b (1300-1350)	M2 (1250-1350)
M3 (1350-1420)	M3 (1350-1420)
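
As a minimal sketch of the correspondence in the table, the following Python snippet maps a composition year to both period labels. The names PERIODS and classify_year are hypothetical and not part of the PCMEP distribution. The treatment of boundary years (each interval taken as half-open, with 1420 included in M3) is an assumption, since the table does not say which period a year such as 1250 belongs to.

```python
# Hypothetical lookup table built from the periodization table above.
# Assumption: intervals are half-open [start, end); the last one is
# extended to 1421 so that the year 1420 itself falls into M3.
PERIODS = [
    # (start, end, PCMEP label, Helsinki/PPCME2 label)
    (1150, 1200, "M1a", "M1"),
    (1200, 1250, "M1b", "M1"),
    (1250, 1300, "M2a", "M2"),
    (1300, 1350, "M2b", "M2"),
    (1350, 1421, "M3", "M3"),
]


def classify_year(year):
    """Return (PCMEP period, Helsinki/PPCME2 period) for a year, or None
    if the year lies outside the span covered by the corpus."""
    for start, end, pcmep, helsinki in PERIODS:
        if start <= year < end:
            return pcmep, helsinki
    return None


if __name__ == "__main__":
    # Example: a text dated c. 1330 belongs to M2b in the PCMEP and M2 in the PPCME2.
    print(classify_year(1330))
```

Because every PCMEP sub-period is contained in exactly one Helsinki/PPCME2 period, results grouped by PCMEP labels can always be merged back into the coarser system, but not the other way round.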

Contact. If you have any questions about the PCMEP, please feel free to send me a message. My e-mail address is Richard.Zimmermann@manchester.ac.uk.

Acknowledgments. The PCMEP was funded by a Doc.Mobility grant from the Swiss National Science Foundation (P1GEP1_148611).

The header image shows three manuscripts containing Middle English poetry. From left to right,

the Vernon Manuscript (c. 1390), f. 105v,
Cambridge, Trinity College Manuscript R.3.2. (c. 1420), f. 1v,
and the Auchinleck Manuscript (c. 1330), f. 167r.

(1)	Forms of weor+tan 'be, become' have the tag BE in the PCMEP, but VB in the PPCME2. This includes the past participle, e.g. was iwur+ten, which is BED ... BEN in the PCMEP, but BED ... VAN in the PPCME2.
(2)	Emendations are indicated as comments in the PCMEP, but by the addition of $ to word forms in the PPCME2. For example, the PCMEP separates nustest (Owl and Nightingale 119) into n and ustest and adds a CODE comment noting the change, whereas the PPCME2 splits a fused form such as Noff (Ormulum 124) into $ne and $off.
(3)	The PCMEP always has -PRN (parenthetical) before -SPE (direct speech), e.g. IP-MAT-PRN-SPE. The PPCME2 varies the order of -PRN and -SPE more freely, perhaps according to scope, e.g. both IP-MAT-PRN-SPE and IP-MAT-SPE-PRN exist.
(4)	Missing material in a lower clause that is reconstructed from a higher clause is indicated in the PCMEP through the same co-indexing mechanism used for other cases of reconstruction, for example, [clause-1 +De fur flei of is mou+te. so [clause=1 leie of brenston]] (Maregrete 494). In contrast, the PPCME2 handles such cases with CODE comments that list rough paraphrases of the missing material, without any indication of gapping in the syntactic labels, for example +tei ben tyede, as a bole (CODE {is_tied}) by a stake (Wycliffe Sermon 588).
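
Difference (3) matters when results from both corpora are compared, because the same construction can carry two different label orders in the PPCME2. The following is a minimal sketch, in Python, of how such labels could be brought into the fixed PCMEP order before counting. The function name normalize_prn_spe is hypothetical and is not part of either corpus or of any search tool. The sketch assumes plain dash-separated labels, so a label whose -PRN or -SPE part carries an = index is left unchanged.

```python
def normalize_prn_spe(label):
    """Return the label with -PRN placed before -SPE when both occur.

    Assumption: labels are plain dash-separated strings such as
    'IP-MAT-SPE-PRN'. Labels containing only one of the two tags,
    or neither, are returned unchanged.
    """
    parts = label.split("-")
    if "PRN" in parts and "SPE" in parts:
        i_prn = parts.index("PRN")
        i_spe = parts.index("SPE")
        if i_spe < i_prn:
            # Swap the two tags so that PRN comes first, as in the PCMEP.
            parts[i_spe], parts[i_prn] = parts[i_prn], parts[i_spe]
    return "-".join(parts)


if __name__ == "__main__":
    for label in ["IP-MAT-SPE-PRN", "IP-MAT-PRN-SPE", "IP-MAT-SPE"]:
        print(label, "->", normalize_prn_spe(label))
```

Note that this discards any scope distinction the PPCME2 may encode through the order of the two tags, so it is only suitable for counts in which that distinction does not matter.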
