With the exception of two texts, oldenburg and soest, which currently lack lemmas, the Middle Low German component of the CHLG is lemmatised.
The annotation makes a distinction between the label ORTHO, which dominates the orthographic form of the word as it appears in the text, and the label LEMMA, which sits within the META information and dominates the lemma corresponding to the ORTHO form:
( (IP-MAT
    (NP-SBJ
      (PPER
        (META (CASE nom)
              (GENDER masc)
              (LEMMA hē)  ← lemma
              (NUMBER sg)
              (PERSON 3))
        (ORTHO He)))  ← orthographic form
    (VVFIN
      (META (LEMMA hebben)  ← lemma
            (MOOD ind)
            (MORPHO-CLASS weak)
            (NUMBER sg)
            (PERSON 3)
            (TENSE past))
      (ORTHO hadde))  ← orthographic form
    (NP-OB1
      (DIARTA
        (META (CASE akk)
              (GENDER masc-neut)
              (LEMMA ēn)  ← lemma
              (NUMBER sg))
        (ORTHO eyn))  ← orthographic form
      (NA
        (META (CASE akk)
              (GENDER masc-neut)
              (LEMMA dēl)  ← lemma
              (NUMBER sg))
        (ORTHO deyl))  ← orthographic form
      (NP-COM
        (NP-POS
          (DPOSA
            (META (CASE gen)
                  (GENDER neut)
                  (LEMMA sīn)  ← lemma
                  (NUMBER sg))
            (ORTHO sines))  ← orthographic form
          (NA
            (META (CASE gen)
                  (GENDER neut)
                  (LEMMA hēre)  ← lemma
                  (NUMBER sg))
            (ORTHO heren)))  ← orthographic form
        (NA
          (META (CASE akk)
                (GENDER fem)
                (LEMMA Kraft)  ← lemma
                (NUMBER sg))
          (ORTHO kraf)))))  ← orthographic form
)
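The ORTHO/LEMMA pairing in the example above can be read off mechanically. The following is a minimal sketch in Python, not a tool distributed with the corpus: the function names `parse` and `pairs` are hypothetical, and the input is assumed to be the bracketed annotation with the explanatory "← lemma" arrows removed, since they appear to be commentary rather than part of the annotation itself.

```python
import re

# Tokens are opening brackets, closing brackets, or runs of other characters.
TOKEN = re.compile(r"\(|\)|[^\s()]+")

def parse(text):
    """Parse a bracketed string into nested lists of strings (hypothetical helper)."""
    stack = [[]]
    for tok in TOKEN.findall(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            node = stack.pop()
            stack[-1].append(node)
        else:
            stack[-1].append(tok)
    return stack[0]

def pairs(node):
    """Yield (orthographic form, lemma) for every node that has META and ORTHO children."""
    if not isinstance(node, list):
        return
    children = [c for c in node if isinstance(c, list)]
    meta = next((c for c in children if c and c[0] == "META"), None)
    ortho = next((c for c in children if c and c[0] == "ORTHO"), None)
    if meta is not None and ortho is not None and len(ortho) > 1:
        lemma = next(
            (f[1] for f in meta[1:]
             if isinstance(f, list) and len(f) > 1 and f[0] == "LEMMA"),
            None,  # no LEMMA node inside META
        )
        yield ortho[1], lemma
    for child in children:
        yield from pairs(child)

# First two words of the example sentence, arrows removed.
SAMPLE = (
    "( (IP-MAT (NP-SBJ (PPER (META (CASE nom) (GENDER masc) (LEMMA hē) "
    "(NUMBER sg) (PERSON 3)) (ORTHO He))) "
    "(VVFIN (META (LEMMA hebben) (MOOD ind) (MORPHO-CLASS weak) (NUMBER sg) "
    "(PERSON 3) (TENSE past)) (ORTHO hadde))) )"
)

for ortho, lemma in pairs(parse(SAMPLE)):
    print(ortho, lemma)
```

On this fragment the sketch pairs He with hē and hadde with hebben. Walking the parsed tree, rather than matching on the flat string, keeps the pairing tied to the node that actually contains both META and ORTHO.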
Where lemmas are not present (e.g. for proper names), the symbol # is used as a placeholder:
( (FRAG
    (PP
      (APPR
        (META (CASE dat)
              (LEMMA van))
        (ORTHO Van))
      (NP
        (NE
          (NE
            (META (CASE dat)
                  (GENDER masc)
                  (LEMMA #)  ← absent lemma
                  (NUMBER sg))
            (ORTHO flosse))
          (KON
            (META (LEMMA unde))
            (ORTHO vnde))
          (NE
            (META (CASE dat)
                  (GENDER fem)
                  (LEMMA #)  ← absent lemma
                  (NUMBER sg))
            (ORTHO blankflosse))))))
)
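When working with the lemmas, it is probably safer to treat the placeholder as a missing value than as a lemma string. The sketch below lists the orthographic forms whose lemma is # or absent altogether. It is a rough illustration only: the name `missing_lemmas` is hypothetical, the explanatory arrows are assumed to have been removed, and the regular expression assumes that META contains only flat (KEY value) pairs, as in the examples in this section.

```python
import re

# A terminal's META block followed directly by its ORTHO form.
# Assumption: META holds only flat (KEY value) pairs with no further nesting.
TERMINAL = re.compile(
    r"\(META((?:\s*\([^()]*\))*)\s*\)\s*\(ORTHO\s+([^\s()]+)\)"
)
LEMMA = re.compile(r"\(LEMMA\s+([^\s()]+)\)")

PLACEHOLDER = "#"  # symbol used in the annotation where no lemma is present

def missing_lemmas(text):
    """Return the orthographic forms whose LEMMA is the placeholder or absent."""
    missing = []
    for meta, ortho in TERMINAL.findall(text):
        match = LEMMA.search(meta)
        if match is None or match.group(1) == PLACEHOLDER:
            missing.append(ortho)
    return missing

# The PP from the example above, arrows removed.
SAMPLE = (
    "(PP (APPR (META (CASE dat) (LEMMA van)) (ORTHO Van)) "
    "(NP (NE (NE (META (CASE dat) (GENDER masc) (LEMMA #) (NUMBER sg)) "
    "(ORTHO flosse)) (KON (META (LEMMA unde)) (ORTHO vnde)) "
    "(NE (META (CASE dat) (GENDER fem) (LEMMA #) (NUMBER sg)) "
    "(ORTHO blankflosse)))))"
)

print(missing_lemmas(SAMPLE))
```

For this fragment the two proper names, flosse and blankflosse, are reported, while Van and vnde are not. The same check would flag every token in a text that has no lemmas at all, so it could also serve as a quick test of whether a given file has been lemmatised.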
The lemmas for oldenburg and soest will be added in a later version.