Aranea

A Family of Comparable Gigaword Web Corpora

spidernet

Undocumented version of the HNC Tagset mapped to the Araneum Universal Tagset (AUT 1.0)


HNC Tag (MSD) AUT Tag (aTag) PoS
DET.* Dt determiner/article
FN.* Nn noun
MN.* Aj adjective
(NM.*)|([_A-Z]+"NM".*) Pn pronoun
SZN.* Nm numeral
(IGE.*)|([.A-Z]+IGE.*) Vb verb
HA.* Av adverb
(PREP.*)|(NU.*) Pp preposition/postposition
(IK.*)|(KOT.*) Cj conjunction
[IM]SZ.* Ij interjection
Pt particle
ABBR.* Ab abbreviation/acronym
Sy symbol
ROMAN.* Nb number
(ELO.*)|(EKSZ.*)|(FF.*) Xx other (content word)
AUX.* Xy other (function word)
(X.*)|(UNKNOWN.*) Yy unknown/alien/foreign
(SPUNC.*)|(WPUNCT.*) Zz punctuation
Er mapping error
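
The table above can be read as an ordered list of pattern rules. The following is a minimal Python sketch of how such a mapping might be applied; it is not the Aranea project's own tooling. It rests on several assumptions not stated in the source: rules are tried in table order and the first match wins, each pattern must match the whole HNC tag, the double quotes in the pronoun pattern are lex-style literal delimiters and are therefore dropped, and a tag that matches no rule receives Er. The names RULES and map_tag are hypothetical. The aTags Pt and Sy have no HNC pattern in the table, so this sketch never produces them.

```python
import re

# (HNC pattern, aTag) pairs transcribed from the table, in table order.
# Assumption: the quotes in the pronoun pattern are literal delimiters
# (lex style), so "NM" is written here as plain NM.
RULES = [
    (r"DET.*", "Dt"),
    (r"FN.*", "Nn"),
    (r"MN.*", "Aj"),
    (r"(NM.*)|([_A-Z]+NM.*)", "Pn"),
    (r"SZN.*", "Nm"),
    (r"(IGE.*)|([.A-Z]+IGE.*)", "Vb"),
    (r"HA.*", "Av"),
    (r"(PREP.*)|(NU.*)", "Pp"),
    (r"(IK.*)|(KOT.*)", "Cj"),
    (r"[IM]SZ.*", "Ij"),
    (r"ABBR.*", "Ab"),
    (r"ROMAN.*", "Nb"),
    (r"(ELO.*)|(EKSZ.*)|(FF.*)", "Xx"),
    (r"AUX.*", "Xy"),
    (r"(X.*)|(UNKNOWN.*)", "Yy"),
    (r"(SPUNC.*)|(WPUNCT.*)", "Zz"),
]

# Compile once so repeated lookups do not re-parse the patterns.
COMPILED = [(re.compile(pattern), atag) for pattern, atag in RULES]


def map_tag(hnc_tag: str) -> str:
    """Return the aTag of the first rule that matches the whole HNC tag.

    Assumption: tags matching no rule are reported as Er (mapping error).
    """
    for pattern, atag in COMPILED:
        if pattern.fullmatch(hnc_tag):
            return atag
    return "Er"


if __name__ == "__main__":
    # Illustrative inputs only; the tag strings are made up for the demo.
    for tag in ["FN.NOM", "IGE.Me3", "WPUNCT", "???"]:
        print(tag, map_tag(tag))
```

Because every pattern ends in `.*`, whole-tag matching effectively tests a prefix, except for the second alternatives of the pronoun and verb rules, which allow leading material before NM or IGE. If the original tooling resolves overlapping rules differently, the rule order here would need adjusting.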